Postfix Queue Management

Table of Contents | Back: Postfix Performance Results | Next: Dealing with Huge Backlogs

Postfix queue organization

Postfix has four different queues: maildrop, incoming, active and deferred. Locally-posted mail is deposited into the maildrop, and is copied to the incoming queue after some cleaning up. Incoming is for stuff that is still arriving or that the queue manager hasn't looked at yet. Active is a limited-size queue for mail that is being delivered right now. Mail that can't be delivered goes to the deferred queue, and does not get in the way of other deliveries.

Information about the active queue is kept in memory. The active queue size is limited to a multiple of the number of simultaneous delivery processes. This queue size limitation is deliberate: the queue manager should never run out of working memory because of a peak message workload. One of the worst things that can happen is that the mail system gets wedged because the queue no longer fits in memory, and stays wedged until someone throws away enough messages that the queue fits in memory again (or increases the UNIX memory resource limit by a sufficient amount). To avoid such disasters, the Postfix queue manager enforces an upper bound on the amount of memory that it needs, so that it stays in control.

A similar memory-saving trick has been in wide use for decades. Instead of reading whole files into memory, programs use fixed-size buffers that act as windows onto those files. The size of a file has no effect on the amount of memory needed to read its contents. The Postfix active queue is just a window onto a larger queue.
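
The window idea can be illustrated with a minimal Python sketch. This is not Postfix code: the class name ActiveWindow, the limit of 10 and the backlog of 1000 messages are all assumptions made up for the illustration. The point is only that the number of messages held in memory stays at or below the limit no matter how large the backlog is.

```python
from collections import deque


class ActiveWindow:
    """Bounded in-memory window onto a larger queue (illustrative only)."""

    def __init__(self, backlog, limit):
        self.backlog = deque(backlog)  # stands in for queue files on disk
        self.limit = limit             # hypothetical cap on in-memory entries
        self.active = []               # messages currently held in memory
        self.refill()

    def refill(self):
        # Pull messages in only while there is room in the window.
        while len(self.active) < self.limit and self.backlog:
            self.active.append(self.backlog.popleft())

    def complete(self, message):
        # A finished delivery frees a slot, which the backlog then fills.
        self.active.remove(message)
        self.refill()


# Drain a backlog of 1000 messages through a window of 10 and record
# the largest number of messages that was ever in memory at once.
window = ActiveWindow(backlog=range(1000), limit=10)
peak = len(window.active)
while window.active:
    window.complete(window.active[0])
    peak = max(peak, len(window.active))
print(peak)  # prints 10
```

With a backlog one hundred times larger than the window, the peak in-memory count is still the window size.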

Postfix delivery strategy: fairness and no thundering herd

Implementing a high-performance mail system is one thing. However, no one would be pleased if Postfix connected to their site and overwhelmed it with lots and lots of simultaneous deliveries. This is especially important when a site has been down and mail is backed up elsewhere in the network.

Postfix tries to be a good network neighbor. When delivering mail to a site, Postfix will initially make no more than two simultaneous connections. As long as deliveries succeed, the concurrency slowly increases up to some configurable limit (or until the host or network is unable to handle the load); concurrency is decreased in case of trouble. For those familiar with TCP/IP implementation details, Postfix implements its own analog of the TCP slow start algorithm.
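
A rough Python sketch of such a per-destination concurrency window follows. Only the initial value of two and the existence of a configurable upper limit come from the description above; the class name, the step of one per successful delivery, the halving on failure and the limit of 10 are assumptions chosen for illustration, not the actual Postfix feedback rules.

```python
class DestinationConcurrency:
    """Per-destination concurrency window (illustrative sketch only)."""

    def __init__(self, initial=2, limit=10):
        self.window = initial  # start with at most two connections
        self.limit = limit     # hypothetical configurable upper bound

    def on_success(self):
        # Assumed policy: open the window by one per successful delivery.
        self.window = min(self.limit, self.window + 1)

    def on_failure(self):
        # Assumed policy: halve the window on trouble, never below one.
        self.window = max(1, self.window // 2)


site = DestinationConcurrency()
for _ in range(20):
    site.on_success()
print(site.window)  # prints 10: growth stops at the limit
site.on_failure()
print(site.window)  # prints 5: trouble shrinks the window
```

The queue manager would then keep no more than site.window simultaneous deliveries open to that destination.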

Apart from the thundering herd controls, the Postfix delivery strategy is based on round-robin selection and random walks. The queue manager sorts message recipients in the active queue by destination domain, makes round-robin walks along all domain queues, and makes random walks within each domain queue.
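
The selection order can be sketched in Python as follows. The function name delivery_order and the sample addresses are hypothetical, and the sketch ignores concurrency limits and everything else the real queue manager tracks; it shows only the round-robin walk across domain queues combined with a random pick inside each one.

```python
import random
from collections import defaultdict


def delivery_order(recipients, rng=None):
    """Round-robin across domains, random choice within each domain."""
    rng = rng or random.Random()
    # Sort recipients into one queue per destination domain.
    queues = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[1].lower()
        queues[domain].append(address)
    order = []
    while queues:
        # One round-robin pass: take a single recipient from every domain.
        for domain in list(queues):
            pending = queues[domain]
            order.append(pending.pop(rng.randrange(len(pending))))
            if not pending:
                del queues[domain]
    return order


recipients = [
    "a@aol.com", "b@aol.com", "c@aol.com",
    "d@example.org", "e@example.net",
]
for address in delivery_order(recipients, random.Random(0)):
    print(address)
```

However the random picks fall, the first three recipients selected go to three different domains, so the large aol.com queue cannot crowd out the two smaller ones.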

On average, Postfix will do simultaneous deliveries to the same domain only when there is not enough work to keep all outbound SMTP channels busy. So, when AOL goes offline and comes back, it will not stop the system from delivering to other sites.

I think that many mailer benchmarks are misleading because they emphasize the mail exploder function. As I found, delivering messages with only one recipient is disgustingly slow. Even with my own mailer it takes a high-end machine to deliver 1,000,000 different messages in a day. To me, these results felt like "the emperor is wearing no clothes".
