I would guess every programmer knows what queueing is. For those who don’t, well, it’s some sort of waiting line for messages: they get in, they wait, and they get dispatched for processing.
I used queues many times in the past: first in, first out (FIFO) queues for mass email programs, last in, first out (LIFO) queues for MicroBlogBuzz, and prioritized queues for a video compression platform, among other things.
After a lot of trial and error, storage engine changes, code optimization, benchmarking and swearing, I came up with this idea of cascaded queueing. In a nutshell, cascaded queueing is putting two queues back to back: a master queue that will act as a warehouse, and one or many slave queues that will act as distributors or brokers.
Available Queueing Methods
Queue messages can be stored in different ways, and each has its pros and cons. Since the goal of this post is to describe cascaded queueing rather than queueing in general, here's a quick analysis of the most common storage methods.
- File system: Quick, persistent and flexible, but it's locally hosted, and it requires a file system that supports file locking.
- Database (e.g. MySQL): Persistent, networked and possibly distributed, but slow under high load due to locking.
- Memory (e.g. Memcached): Super fast for simple uses, can be distributed, but non-persistent, and its simple nature makes prioritized queueing a bit complex.
There are also full-blown queueing solutions providing an API to enqueue and dequeue messages:
- Message Queueing Platforms (e.g. Apache ActiveMQ): Full featured, but too heavy and/or complex for most simple uses.
- External Message Queueing (e.g. Amazon Simple Queue Service, aka SQS): Robust, simple and distributed, but on a pay-per-message and pay-per-bandwidth model, and there's REST payload overhead on each request.
The main idea behind cascaded queueing is to mitigate the drawbacks of one queueing storage (poor performance or increased payload) by using two queueing methods simultaneously. With cascaded queueing, you would choose the master queue method for robustness, and the slave queue method for performance.
Theory Of Operation
As with a single queue, all incoming messages get queued in the master queue.
However, the slave queue(s) communicate with the master queue to make sure they keep enough messages locally to feed the processing nodes in a timely fashion. They lock messages in the master queue and host them locally. There can be a great performance gain here if the master queue can lock and send messages in batches efficiently.
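To make the warehouse/broker split concrete, here is a minimal in-memory sketch of the refill step: the slave queue checks its local stock and, when it runs low, locks a whole batch in the master queue under a single mutex acquisition. The class names, thresholds and lease length are my own illustration, not part of the original design; dicts and lists stand in for the real backends.

```python
import threading
import time

class MasterQueue:
    """In-memory stand-in for the master (warehouse) queue."""
    def __init__(self):
        self.messages = {}   # id -> payload
        self.locks = {}      # id -> lock expiry timestamp
        self._mutex = threading.Lock()
        self._next_id = 0

    def enqueue(self, payload):
        with self._mutex:
            self._next_id += 1
            self.messages[self._next_id] = payload
            return self._next_id

    def lock_batch(self, count, lease_seconds):
        """Lock up to `count` unlocked messages in one pass and return them.

        A single mutex acquisition covers the whole batch, which is where
        the gain over per-message locking comes from.
        """
        now = time.time()
        batch = []
        with self._mutex:
            for msg_id, payload in self.messages.items():
                if self.locks.get(msg_id, 0) <= now:
                    self.locks[msg_id] = now + lease_seconds
                    batch.append((msg_id, payload))
                    if len(batch) >= count:
                        break
        return batch

class SlaveQueue:
    """In-memory stand-in for a slave (broker) queue fed from the master."""
    def __init__(self, master, low_water=100, batch_size=200):
        self.master = master
        self.low_water = low_water
        self.batch_size = batch_size
        self.local = []  # locally hosted, master-locked messages

    def refill_if_needed(self, lease_seconds=120):
        """Pull a batch from the master once local stock drops too low."""
        if len(self.local) <= self.low_water:
            self.local.extend(
                self.master.lock_batch(self.batch_size, lease_seconds))
```

A refill loop would call `refill_if_needed()` periodically (or after each dequeue), so processing nodes only ever touch the fast local store.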
The processing nodes then lock and request messages from the slave queue. If locks are only valid for a given amount of time, the slave queue must ensure that a processing node's lock on a slave queue message never outlives the slave queue's own lock on the corresponding master queue message. The slave queue may extend the lock on the master queue message if possible, or simply not return that message.
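The lease arithmetic above can be sketched in a few lines. This assumes each message record carries a `master_lock_expiry` timestamp (my own field name); capping the node's lease at the master lock is one possible strategy, alongside the extend-or-withhold options just mentioned.

```python
import time

def dispense(message, node_lease_seconds, now=None):
    """Hand a slave-queue message to a processing node only if the node's
    lease cannot outlive the slave queue's lock on the master copy.

    Returns (message, granted_lease_seconds), or (None, 0) when the
    master lock has already expired and the message must not go out.
    """
    now = time.time() if now is None else now
    requested_expiry = now + node_lease_seconds
    if requested_expiry <= message["master_lock_expiry"]:
        return message, node_lease_seconds
    # Requested lease would outlive the master lock: cap it to whatever
    # time remains (a real implementation might instead try to extend
    # the master lock first).
    remaining = message["master_lock_expiry"] - now
    if remaining > 0:
        return message, remaining
    return None, 0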
Finally, when a processing node deletes a message from the slave queue, the deletion is cascaded to the master queue so the message is removed for good.
A slave queue data loss will not cause any message to be lost, as the messages still exist in the master queue. The slave queue can therefore be shut down and rebuilt at any time, which makes the use of Memcached as the slave queue backend storage possible.
Test Driving It
I first used this technique with MySQL as storage engine for the master queue, and Memcached as storage for the slave queue.
The big advantage of MySQL is that getting 200 messages isn't much slower than getting 1: the locking takes pretty much the same time. However, the average processing time of each of my messages was quite high, and so was the standard deviation: processing a message took 15 seconds on average, but could take anywhere between 1 second and 60 seconds. Thus, processing nodes locking 200 messages at once often resulted in inefficient message distribution among them.
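The "200 is barely slower than 1" property comes from claiming the whole batch with one statement pair: a single `UPDATE` stamps a lock token on up to N rows, then one `SELECT` reads them back. The `queue` schema and column names below are hypothetical, and I use SQLite so the sketch is self-contained; MySQL supports the same pattern (there, `ORDER BY id LIMIT n` can go directly on the `UPDATE`).

```python
import sqlite3
import time
import uuid

def lock_batch(conn, batch_size=200, lease_seconds=120):
    """Claim up to `batch_size` unlocked rows with one UPDATE, then
    read them back by lock token. The statement count is constant
    regardless of batch size."""
    token = uuid.uuid4().hex
    now = time.time()
    conn.execute(
        "UPDATE queue SET lock_token = ?, lock_expiry = ? "
        "WHERE id IN (SELECT id FROM queue WHERE lock_expiry < ? "
        "ORDER BY id LIMIT ?)",
        (token, now + lease_seconds, now, batch_size),
    )
    conn.commit()
    return conn.execute(
        "SELECT id, payload FROM queue WHERE lock_token = ?", (token,)
    ).fetchall()
```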
On the other hand, even if I find that Memcached is extremely reliable, it's non-persistent by definition. If you lose one of four nodes, you lose a quarter of your messages. And it's even worse if your memory gets full: the oldest messages are evicted. But it's extremely fast, as there's no disk IO.
Cascading MySQL to Memcached improved queue performance a lot by removing almost all MySQL inter-locking dead time. Messages were transferred in batches of 200 to the slave queue whenever it held 100 messages or fewer, and they were dequeued one by one by the processing nodes.
As I only had one processing environment, I hosted my slave queue and my master queue at the same place. But one could imagine a slave queue per processing environment as well.
Obviously, requeueing everything a second time, extending locks and cascading delete requests means extra payload and processing time. Moreover, if not implemented properly, you run the risk of seeing your slave queue repeatedly go empty, leaving your processing nodes idle while it refills.
Therefore, cascaded queueing is not the answer to every queueing problem, but rather a way to work around some of the problems encountered with certain queueing methods.
If you like this article, leaving a comment, tweeting or liking it is always appreciated.