Event-driven system and subscribers missing events

Let's say I have a service that publishes events, like

e0 ("Bought 100 shares of AAPL")
e1 ("Bought 100 shares of T")
e2 ("Sold 500 shares of TSLA")

and there are stateful services that subscribe to these events and whose state depends on the events being processed successfully and in the correct order.

There are many things that can go wrong on the subscription side:
  • A subscribing service fails to process an event and cannot retry it, leaving its state contaminated.
  • A service reports that it processed an event successfully, but a bug in the processing means the event was not actually applied correctly. This is effectively equivalent to the first bullet point.
Should the subscribing services have a way of "resetting" their state once such a problem occurs?

For example, let's say a service processed e0 and e2 but not e1 because e1 somehow got lost. Maybe the subscribing service keeps a record of the events it has processed, so that once it sees e2 it knows it must first process e1, which it can get from some service that stores all the events.
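A minimal sketch of that idea, assuming every event carries a consecutive sequence number and that some event store can return an event by its number. The class name, `fetch_event`, and the in-memory `store` are hypothetical stand-ins, not part of any particular library.

```python
from typing import Callable, Dict, List


class GapFillingSubscriber:
    """Applies events strictly in sequence order, fetching any it missed."""

    def __init__(self, fetch_event: Callable[[int], str]) -> None:
        self.fetch_event = fetch_event   # hypothetical lookup into an event store
        self.next_seq = 0                # sequence number expected next
        self.applied: List[str] = []     # stand-in for the service's real state

    def on_event(self, seq: int, payload: str) -> None:
        if seq < self.next_seq:
            return                       # duplicate or already applied: ignore
        # Fill the gap first so the state is always built in order.
        for missing in range(self.next_seq, seq):
            self.applied.append(self.fetch_event(missing))
        self.applied.append(payload)
        self.next_seq = seq + 1


# Hypothetical event store holding every published event.
store: Dict[int, str] = {
    0: "Bought 100 shares of AAPL",
    1: "Bought 100 shares of T",
    2: "Sold 500 shares of TSLA",
}

sub = GapFillingSubscriber(fetch_event=store.__getitem__)
sub.on_event(0, store[0])
sub.on_event(2, store[2])   # e1 was lost in transit; it is fetched before e2 is applied
```

This only covers lost events. It does not help when an event was applied incorrectly because of a bug, since the subscriber has no way to notice that from sequence numbers alone.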
 
It might be better for the subscribers to pull the requests from a broker in this case so that events are processed in the correct order.

ZeroMQ was made for these kinds of systems:

http://zguide.zeromq.org/page:all

As you read down, the guide presents various messaging architectures for microservices, with their pros and cons.

http://zeromq.org/intro:read-the-manual
 

Svein

"It might be better for the subscribers to pull the requests from a broker in this case so that events are processed in the correct order."
And there you are in the middle of what I was doing the last ten years of my professional life - the problem of "time stamping" an event with the correct universal time. After all, there might be several brokers - how do you ensure that the time stamp of an event is correct?

Some years ago, I published an insight here (https://www.physicsforums.com/insights/time-synchronization-across-switched-ethernet/) which discussed the clock synchronization problem for various accuracy requirements. For human systems (like the broker problem), the NTP protocol (with an estimated synchronization accuracy of about 2 ms) is more than precise enough. The only problem is that the system clock will drift between synchronizations, and thus a timestamped event must somehow report the time of the last synchronization and the measured clock drift between the last two synchronizations.
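As a rough illustration, a timestamped event could carry those two extra fields along with the raw clock reading. The field names below are hypothetical, and the correction assumes the drift is linear at the measured rate since the last synchronization, which is a simplification rather than anything NTP itself prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StampedEvent:
    payload: str
    local_time: float       # local clock reading at the event, in seconds
    last_sync_time: float   # local clock reading at the most recent synchronization
    drift_rate: float       # measured drift: seconds gained per second of local time

    def corrected_time(self) -> float:
        """Estimate the true time, assuming linear drift since the last sync."""
        elapsed = self.local_time - self.last_sync_time
        return self.local_time - self.drift_rate * elapsed


# Example: 60 s since the last sync, clock running fast by 50 parts per million.
event = StampedEvent(
    payload="Bought 100 shares of T",
    local_time=1000.0,
    last_sync_time=940.0,
    drift_rate=50e-6,
)
estimate = event.corrected_time()   # about 3 ms earlier than the raw reading
```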

For a more thorough discussion of time synchronization, read the insight.
 
I was referring to an MQ broker, where producer programs write messages to a queue and consumer programs read messages from the queue under a transactional scheme. That way, if the consumer fails, it can restart without missing a message and still process the messages in the correct order. The transactional feature is important because a message is not dropped from the queue until the transaction completes. However, it may slow the system down if the message load is very heavy, as in stock ticker systems.
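To make the transactional behaviour concrete, here is a toy in-memory sketch. It is not the API of any real MQ product: `TransactionalQueue`, `consume`, and the simulated crash are all made up for illustration. The point is only that the head message is removed after the handler succeeds, never before.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class TransactionalQueue:
    """Toy queue: a message leaves only after the consumer finishes with it."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def put(self, message: str) -> None:
        self._messages.append(message)

    def consume(self, handler: Callable[[str], None]) -> bool:
        """Process the head message; return False if the queue is empty."""
        if not self._messages:
            return False
        message = self._messages[0]   # peek, do not remove yet
        handler(message)              # an exception here leaves the message queued
        self._messages.popleft()      # "commit": remove only after success
        return True


queue = TransactionalQueue()
for name in ("e0", "e1", "e2"):
    queue.put(name)

processed: List[str] = []
pending_failures: Dict[str, int] = {"e1": 1}   # make the consumer crash once on e1


def handler(message: str) -> None:
    if pending_failures.get(message, 0) > 0:
        pending_failures[message] -= 1
        raise RuntimeError("simulated consumer crash")
    processed.append(message)


while True:
    try:
        if not queue.consume(handler):
            break
    except RuntimeError:
        pass   # the consumer "restarts" and sees the same message again
```

After the loop, `processed` holds e0, e1, e2 in order even though the consumer failed once on e1. The cost is that nothing behind a failing message can be processed until that message succeeds.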

Nice insight, by the way. I think MQ systems and database systems have these notions built into them; at least, I'm fairly sure distributed partitioned database schemes need this to work correctly.
 

.Scott

"For example, let's say a service processed e0 and e2 but not e1 because e1 somehow got lost. Maybe the subscribing service keeps a record of the events it has processed, so that once it sees e2 it knows it must first process e1, which it can get from some service that stores all the events."
This is more an issue of "resyncing" than "resetting". In general, it will not be possible to "unservice" e2 - but if that is possible, then you could unwind all transactions since the mis-step. A more likely solution would be to periodically checkpoint your servicer's state. So if I checkpoint at e100, e200, and e300 then discover at e377 that I missed e267, I can go back to e200 and process forward from that point.
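A sketch of the checkpoint-and-replay approach, using the same numbers as the example above. It assumes the complete event log is available from somewhere for the replay and that events are simple (symbol, share change) pairs; `CheckpointedService` and its methods are hypothetical names.

```python
import copy
from typing import Dict, List, Tuple

Event = Tuple[str, int]   # (symbol, signed share change): a hypothetical event shape


class CheckpointedService:
    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.state: Dict[str, int] = {}
        # Key n holds a snapshot of the state just before event number n.
        self.checkpoints: Dict[int, Dict[str, int]] = {0: {}}

    def process(self, seq: int, event: Event) -> None:
        symbol, delta = event
        self.state[symbol] = self.state.get(symbol, 0) + delta
        if (seq + 1) % self.interval == 0:
            self.checkpoints[seq + 1] = copy.deepcopy(self.state)

    def resync(self, missed_seq: int, log: List[Event]) -> int:
        """Restore the last checkpoint at or before missed_seq and replay the log."""
        start = max(n for n in self.checkpoints if n <= missed_seq)
        self.state = copy.deepcopy(self.checkpoints[start])
        # Later snapshots were taken from contaminated state, so drop them.
        self.checkpoints = {n: s for n, s in self.checkpoints.items() if n <= start}
        for seq in range(start, len(log)):
            self.process(seq, log[seq])
        return start


log: List[Event] = [("AAPL", 1)] * 378          # e0 .. e377
service = CheckpointedService(interval=100)
for seq, event in enumerate(log):
    if seq != 267:                              # e267 is lost
        service.process(seq, event)

state_before = dict(service.state)              # one share short
rolled_back_to = service.resync(267, log)       # restores the checkpoint at e200
```

The trade-off is between checkpoint frequency and replay cost: a shorter interval means more snapshots to store but fewer events to reprocess after a miss.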
It is also possible that you can determine whether the missing event matters anymore. If you are keeping a list of the most recent 20 events, losing an event before that will not matter.
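For the "most recent 20 events" case, the check can be as simple as comparing sequence numbers. This assumes consecutive sequence numbers, and the function name is made up for illustration.

```python
def missing_event_matters(missing_seq: int, latest_seq: int, window: int = 20) -> bool:
    """True if the missing event would still fall within the last `window` events."""
    return missing_seq > latest_seq - window


# With e377 as the latest event, the window covers e358 .. e377.
old_gap = missing_event_matters(267, 377)      # outside the window
recent_gap = missing_event_matters(360, 377)   # inside the window
```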
 
