Implementing State Machines in PostgreSQL (2017)

michelpp 2 months ago

While this is a good approach to implementing state machines, the transition function is hardwired to one transition graph. Another approach is to generalize the states and the transition graph into a table, and group those transitions by a machine id, as shown here:

https://github.com/michelp/pgfsm

Now many machines (sub-graphs of state transitions) can be defined in general, and the transition checking function checks the validity of the next state based on the table, instead of static rules in a function.
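For illustration, here is a minimal sketch of the table-driven idea, using Python's built-in `sqlite3` module rather than PostgreSQL. The `transitions` schema, the machine names and the `next_state` helper are all hypothetical and are not taken from pgfsm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per allowed edge, grouped by machine.
    CREATE TABLE transitions (
        machine    TEXT NOT NULL,
        state      TEXT NOT NULL,
        event      TEXT NOT NULL,
        next_state TEXT NOT NULL,
        PRIMARY KEY (machine, state, event)
    );
    INSERT INTO transitions VALUES
        ('order',  'start',             'create', 'awaiting_payment'),
        ('order',  'awaiting_payment',  'pay',    'awaiting_shipment'),
        ('order',  'awaiting_shipment', 'ship',   'shipped'),
        ('ticket', 'start',             'open',   'open'),
        ('ticket', 'open',              'close',  'closed');
""")


def next_state(machine, state, event):
    """Look up the edge in the table; return None if it is not allowed."""
    row = conn.execute(
        "SELECT next_state FROM transitions"
        " WHERE machine = ? AND state = ? AND event = ?",
        (machine, state, event),
    ).fetchone()
    return row[0] if row else None
```

Adding a new machine or a new edge is then an INSERT into the table rather than a change to the checking function.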

cryptonector 2 months ago

I don't like TFA's TRIGGER using their aggregate function for the FSM, for two reasons. First, they are not ordering the events by time. Second, you just don't need the complexity of reviewing every event for an order each time a new event arrives for that order: if you can assume that all events arrive in chronological order, then the FOR EACH ROW trigger also fires in chronological order. And if the events don't arrive (or aren't inserted) in chronological order, then you need a FOR EACH STATEMENT trigger instead of a FOR EACH ROW trigger. Besides, even as coded, TFA very much has a built-in assumption that order_events inserts happen in chronological order anyway, so you just don't need the aggregate function in that trigger. The aggregate function is pretty neat and useful, though. But every time you use it you have to order the input events by time, so TFA's query of how many orders are in each possible state on any given day is incorrect.
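A small Python sketch of why the ordering matters: folding the same events through a transition function gives a different final state depending on the order they are fed in. The transition graph and the `(time, event)` tuples below are made up for illustration and are not TFA's actual rules.

```python
from functools import reduce

# Hypothetical transition graph; anything not listed leads to 'error'.
TRANSITIONS = {
    ("start", "create"): "awaiting_payment",
    ("awaiting_payment", "pay"): "awaiting_shipment",
    ("awaiting_shipment", "ship"): "shipped",
}


def transition(state, event):
    return TRANSITIONS.get((state, event), "error")


def final_state(events):
    """Fold (time, event) pairs through the FSM in chronological order."""
    ordered = sorted(events, key=lambda pair: pair[0])
    return reduce(transition, (name for _, name in ordered), "start")


def final_state_unordered(events):
    """Same fold, but in whatever order the rows happen to arrive."""
    return reduce(transition, (name for _, name in events), "start")


# Rows as they might come back from a query without an ORDER BY.
events = [(3, "ship"), (1, "create"), (2, "pay")]

print(final_state(events))            # shipped
print(final_state_unordered(events))  # error
```

In PostgreSQL the equivalent of the `sorted()` call is an ORDER BY inside the aggregate call itself, which is what the comment above says is missing.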

Also, if you have a column to store the current/newest state of each order, then I believe you can use plain SQL for the trigger function instead of PL/pgSQL, since you can then rely on a CHECK() to raise an exception when the new state would be 'error'.
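A sketch of the current-state-column idea, written against SQLite through Python's `sqlite3` module so that it runs as-is. The schema, the `transitions` lookup table and the trigger name are assumptions made for the example. One caveat for PostgreSQL: its documentation says trigger functions cannot be written in the plain SQL function language, so there the single UPDATE would still sit in a thin PL/pgSQL wrapper. SQLite trigger bodies, by contrast, are plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical transition graph stored as data.
    CREATE TABLE transitions (
        state TEXT NOT NULL, event TEXT NOT NULL, next_state TEXT NOT NULL,
        PRIMARY KEY (state, event)
    );
    INSERT INTO transitions VALUES
        ('start',             'create', 'awaiting_payment'),
        ('awaiting_payment',  'pay',    'awaiting_shipment'),
        ('awaiting_shipment', 'ship',   'shipped');

    -- The CHECK is what rejects an invalid transition.
    CREATE TABLE orders (
        id    INTEGER PRIMARY KEY,
        state TEXT NOT NULL DEFAULT 'start' CHECK (state <> 'error')
    );
    CREATE TABLE order_events (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        event    TEXT NOT NULL
    );

    -- Plain SQL trigger: advance the stored state, or try to store 'error'.
    CREATE TRIGGER order_events_advance AFTER INSERT ON order_events
    BEGIN
        UPDATE orders
           SET state = COALESCE(
                   (SELECT next_state FROM transitions t
                     WHERE t.state = orders.state AND t.event = NEW.event),
                   'error')
         WHERE id = NEW.order_id;
    END;
""")


def record(order_id, event):
    conn.execute(
        "INSERT INTO order_events (order_id, event) VALUES (?, ?)",
        (order_id, event),
    )


conn.execute("INSERT INTO orders (id) VALUES (1)")
record(1, "create")
record(1, "pay")
try:
    record(1, "create")  # not allowed from 'awaiting_shipment'
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK failed, so the INSERT was undone too

state = conn.execute("SELECT state FROM orders WHERE id = 1").fetchone()[0]
```

Each insert only looks at the stored state and the new event, so no earlier events are re-read. This relies on events being inserted in chronological order.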

(I bet TFA is looking for comments like these since as you can see at the bottom of TFA they are hiring.)

smitty1e 2 months ago

TFA says he's using this to

> implement a realtime analytics dashboard for an application with over a billion event rows.

I'm interested in seeing how well this could be attained in SQLite with window functions[1] since SQLite doesn't seem to support a CREATE AGGREGATE.

There would also need to be some python[2] to provide the PL/SQL logic shown.

[1] https://sqlite.org/windowfunctions.html

[2] https://docs.python.org/3/library/sqlite3.html#sqlite3.Conne...

cryptonector 2 months ago

> I'm interested in seeing how well this could be attained in SQLite with window functions[1] since SQLite doesn't seem to support a CREATE AGGREGATE.

SQLite3 very much does support user-defined aggregate functions, just not with a CREATE AGGREGATE statement. You have to use one of the `sqlite3_create_function*()` C functions[0][1][2] to create a user-defined aggregate function:

  The sixth, seventh and eighth parameters
  passed to the three "sqlite3_create_function*"
  functions, xFunc, xStep and xFinal, are pointers
  to C-language functions that implement the SQL
  function or aggregate. A scalar SQL function
  requires an implementation of the xFunc callback
  only; NULL pointers must be passed as the xStep
  and xFinal parameters. An aggregate SQL function
  requires an implementation of xStep and xFinal
  and NULL pointer must be passed for xFunc. To
  delete an existing SQL function or aggregate,
  pass NULL pointers for all three function
  callbacks.
  
  The sixth, seventh, eighth and ninth parameters
  (xStep, xFinal, xValue and xInverse) passed to
  sqlite3_create_window_function are pointers
  to C-language callbacks that implement the
  new function. xStep and xFinal must both be
  non-NULL. xValue and xInverse may either both be
  NULL, in which case a regular aggregate function
  is created, or must both be non-NULL, in which
  case the new function may be used as either an
  aggregate or aggregate window function. More
  details regarding the implementation of aggregate
  window functions are available here.

> There would also need to be some python[2] to provide the PL/SQL logic shown.

Ah, so you do know about the `sqlite3_create_function*()` functions. You don't have to use Python, though you can of course.

  [0] https://sqlite.org/c3ref/create_function.html
  [1] https://sqlite.org/appfunc.html#the_aggregate_function_callbacks
  [2] https://sqlite.org/windowfunctions.html#udfwinfunc*
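For the Python route, `sqlite3.Connection.create_aggregate` registers a class with `step` and `finalize` methods as an aggregate. A minimal sketch follows; the `order_fsm` name, the table layout and the transition graph are assumptions for the example. To avoid depending on the order in which SQLite feeds rows to the aggregate, this version takes the event time as an argument and sorts in `finalize`.

```python
import sqlite3

# Hypothetical transition graph; anything not listed leads to 'error'.
TRANSITIONS = {
    ("start", "create"): "awaiting_payment",
    ("awaiting_payment", "pay"): "awaiting_shipment",
    ("awaiting_shipment", "ship"): "shipped",
}


class OrderFsm:
    """Aggregate that folds an order's events through the graph."""

    def __init__(self):
        self.events = []

    def step(self, time, event):
        # Only collect here; rows may arrive in any order.
        self.events.append((time, event))

    def finalize(self):
        state = "start"
        for _, event in sorted(self.events):
            state = TRANSITIONS.get((state, event), "error")
        return state


conn = sqlite3.connect(":memory:")
conn.create_aggregate("order_fsm", 2, OrderFsm)
conn.executescript("""
    CREATE TABLE order_events (order_id INTEGER, time INTEGER, event TEXT);
    INSERT INTO order_events VALUES
        (1, 3, 'ship'), (1, 1, 'create'), (1, 2, 'pay'),
        (2, 1, 'create'), (2, 2, 'ship');
""")

rows = conn.execute("""
    SELECT order_id, order_fsm(time, event)
      FROM order_events
     GROUP BY order_id
     ORDER BY order_id
""").fetchall()
print(rows)  # [(1, 'shipped'), (2, 'error')]
```

Buffering every event per group costs memory; a streaming `step` that updates the state directly would be cheaper but is only correct if the rows are known to arrive in time order.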

smitty1e 2 months ago

Hmmm...so a Python generator might be just the thing for that aggregate...
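A rough sketch of what such a generator could look like. Note that `sqlite3.Connection.create_aggregate` expects a class with `step` and `finalize` methods, not a generator, so this would run on the client side or be wrapped in such a class. The function name and the transition graph are hypothetical.

```python
# Hypothetical transition graph; anything not listed leads to 'error'.
TRANSITIONS = {
    ("start", "create"): "awaiting_payment",
    ("awaiting_payment", "pay"): "awaiting_shipment",
    ("awaiting_shipment", "ship"): "shipped",
}


def running_states(events, transitions=TRANSITIONS, start="start"):
    """Yield the FSM state after each event, in the order given."""
    state = start
    for event in events:
        state = transitions.get((state, event), "error")
        yield state


print(list(running_states(["create", "pay", "ship"])))
# ['awaiting_payment', 'awaiting_shipment', 'shipped']
```

Because it yields the state after every event, this behaves more like a running (window-style) aggregate than a plain one; the last value yielded is the final state.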

p0w3n3d 2 months ago

Can anyone give an example where a state machine isn't an anti-pattern?

I mean, I have never met a system with a graph of states that didn't require working across this graph in certain situations. Hardwiring it into the database would make handling such edge cases even worse.