FIFO, Exactly-Once, and Other Costs
After reading an article entitled FIFO, Exactly-Once, and Other Costs, published on bravenewgeek.com, I was left wondering if there is indeed a way of achieving a compromise on the topic.
The problem domain is delivering one message, reliably, exactly-once, from one machine’s process to another machine’s process (or between processes on the same machine.)
As discussed on the above blog, this domain is mostly implemented as serial publish and subscribe between the two processes, with no concurrency. While this method of implementation is an obviously desired messaging pattern; it does not scale, in the concurrent/parallel sense.
I’d like to walk you, the reader, through an implementation which fosters the concurrently handling of messages in a FIFO, exactly-once, messaging queue.
A method to achieve concurrent, FIFO, exactly-once
The basis of handling all messages by means of serialized delivery is to keep messages from mutating the same item, simultaneously. At least this is the understanding I have on the topic, which is based on previously accomplished work which used them.
However, since the above statement described the “heart” of the domain; it could be stated that concurrently processing identical items is the reason for which the current messaging queues do not in-fact support concurrency (w.r.t. FIFO, exactly-once queues.)
This leaves the possibility of concurrently processing differing items, but serially processing identical ones.
To achieve this task, I propose that the message queueing system implement a primary key on all messages placed in to a single FIFO queue.
In turn, this defined primary key (and/or a composite key) is used to keep other concurrent queue consumers from processing messages that mutate the exact same message, as an already in-flight one being consumed. It is important that when defining a composite key, that the developer may specify that the composite key may be unordered. This permits later described matching on in-flight messages matching each of the composite key values in either position. This is desirable in financial systems where two keys are an entity, which may appear in either a “from account” or “to account” composite key.
When a publisher first creates a queue, they specify not only the queue semantics; but, also the message structure and the fields which compose a primary or composite key. This can be achieved a number of ways, and supports a number of programming language agnostic message structures. As an example, if JSON were the message structure, the publisher states this fact and specifies which of the JSON fields comprise the key used by the MQ for concurrent processing.
Example
In short, the process works as follows:
- A queue is created to consume messages of the form
{src_id, dst_id, type, amount}
(Erlang-style tuple). In this structure, the following describes each field:src_id
is an integer representing a source or the from account in a financial database.dst_id
is an integer representing the destination account.type
is an enumeration-type field whereby the values are DEBIT or CREDIT.amount
represents a real representing an amount of money that is either debited or credited from each of the accounts.
- All existing FIFO, Exactly-Once semantics apply; as is always done currently. In brief, this means that the publisher receives an ACK message; and the consumer must ACK the message after processing.
- The diverging point from existing MQ systems is within the the new MQ system placing all outgoing messages to consumers. As messages are about to be dispatched, the MQ performs the following:
- Maintain a table of in-flight
src_id
plusdst_id
fields, unordered. In short it means the MQ keeps serial semantics across all messages with the same two accounts in order, or swapped. - If a composite
id
is already in-flight, skip to the next message for delivery (so in short, selective message dispatch to consumers.) - Once a consumer ACK’s a message, mark message complete, remove in-flight status, and discard. At this point, send any message remaining in the MQ for which is possible; which may be an older, skipped message which is now deliverable; as it is not an in-flight composite
id
.
- Maintain a table of in-flight
- As tons of producers send many credits and debits to occur, this methodology keeps concurrency across all own accounts the financial institution has - with the MQ only serializing credits and debits which modify the same two accounts, in either direction.
Final Thoughts
With the described financial system, is the problem domain finally resolved for MQ FIFO, exactly-once, concurrent message queues?
I’d like to hear your feedback in the comments. Please only comment with constructive criticism and be kind to others.
Thank You,
-Joseph Benden