ZMailer Delivers Slowly?

This is an email reply to a question on the ZMailer list:

>  Hi, I'm using Zmailer 2.99.50-s19, and everything seems to work, but I've
> noticed that it's a bit slow:
>
> Jun 29 09:36:00 calix router[2257]: S.rS5RM104048:
>       from=, rrelay=dani@localhost size=327, nrcpts=1,
>       msgid=<19990629093552.A2361@calix.enpl.es>
> Jun 29 09:36:10 calix mailbox[2367]: S.rS5RM104048: to=,
>       delay=00:00:18, xdelay=00:00:00, mailer=local, stat=ok3 Ok
>
>  I don't understand those 18 seconds of delay (sendmail usually gives 0-1
> seconds..). This delay goes from 7 to 30 seconds.. I'm I doing something
> wrong? I'm using bind-8.1.1 as the nameserver. And I'm using zmailer in a
> Sparc Ultra 1 and a PPro 200 (same results).

Yes and no. Sendmail is a monolithic program that does everything itself, from message submission to driving the final delivery. That makes sendmail fast at its job, but when there are *lots of* separate emails to deliver, you can all of a sudden have lots of running 'sendmail' processes, and your system load skyrockets without limit...

ZMailer is designed around separate servers which each do their part of the job and pass the task along to the next one. This comes in part from a security concern: a user-started program should not do routing or (worst of all) final delivery.

ZMailer's components check for new jobs only about every 10 to 20 seconds. With multiple routers running, that first-step delay often pretty much disappears, but the scheduler is a single-threaded program, and so the scan interval of its "new-job" directory checks does show up. On each scan the scheduler gobbles up new jobs up to some limit, starts them, and lets the rest wait for the next interval. That limit is high enough that it matters only under most unusual (high-load) circumstances.
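The "take up to a limit, leave the rest for the next scan" behaviour can be pictured with a minimal Python sketch. This is not ZMailer code: the function name scan_once, the batch_limit parameter and the job names are all hypothetical, and the real limit is much higher than the 5 used here.

```python
from collections import deque

def scan_once(new_jobs, batch_limit):
    """Take at most batch_limit jobs from the queue (hypothetical helper).

    Whatever is left in new_jobs simply waits for the next scan interval.
    """
    taken = []
    while new_jobs and len(taken) < batch_limit:
        taken.append(new_jobs.popleft())
    return taken

# Seven jobs are waiting and the (made-up) limit is five per scan.
queue = deque(f"job-{i}" for i in range(7))
first = scan_once(queue, batch_limit=5)    # job-0 .. job-4
second = scan_once(queue, batch_limit=5)   # the remaining two, one interval later
```

With a limit well above the normal arrival rate, the second scan is rarely needed, which is why the limit only shows up under heavy load.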

Once the scheduler has picked the job up, some time passes before the freshly started transport agent (TA) is up and running and announces itself as "#hungry", ready to accept a new job: your delivery. (This depends on system load. Suppose a message arrives which starts 50 parallel transport agents in the worker farm; that produces quite a load spike, but the TA-scheduler interaction interface does smooth it out.)

At times I have been biffed with new email so fast that I had hardly lifted my finger from the enter key with which I sent it. But these delays being statistical animals, they should be pretty much evenly distributed between zero and router-interval plus scheduler-interval, that is, up to about 30 seconds.
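To get a feel for that range, here is a rough Python model of an idle system. It is only a sketch under stated assumptions: both scan intervals are taken as 15 seconds, the scans are strictly periodic with random phases, and routing and transport-agent startup take no time at all. The names next_scan, idle_delay and simulate are made up for this illustration.

```python
import math
import random

def next_scan(t, interval, phase=0.0):
    """Time of the first scan at or after t, for scans at phase + k*interval."""
    return phase + math.ceil((t - phase) / interval) * interval

def idle_delay(arrival, router_interval, scheduler_interval,
               router_phase=0.0, scheduler_phase=0.0):
    """Wait until the router scan, then until the following scheduler scan."""
    routed = next_scan(arrival, router_interval, router_phase)
    scheduled = next_scan(routed, scheduler_interval, scheduler_phase)
    return scheduled - arrival

def simulate(n, router_interval=15.0, scheduler_interval=15.0, seed=1):
    """Return (min, mean, max) delay over n random arrivals (assumed intervals)."""
    rng = random.Random(seed)
    delays = []
    for _ in range(n):
        delays.append(idle_delay(
            rng.uniform(0.0, 3600.0),
            router_interval, scheduler_interval,
            router_phase=rng.uniform(0.0, router_interval),
            scheduler_phase=rng.uniform(0.0, scheduler_interval)))
    return min(delays), sum(delays) / n, max(delays)
```

In this model no delay exceeds router-interval plus scheduler-interval, and the average comes out near half of that sum, so a reported delay of 18 seconds is nothing unusual.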

Also, in the routers these delays are in play only when the system is idle, so that ZMailer isn't churning your disks unnecessarily. When you have lots of email, the routers are in constant activity, and the scheduler takes lots of stuff into its workset at every scan interval. Unless the routers have a massive backlog, the throughput delay goes down to half the scheduler interval on average... (Delivery delays are another story entirely, of course.)

Even with these rescan delays, and the various other things ZMailer components do to avoid overloading your machines, it is easy to exceed a million email recipients per day on a 20-30 MIPS machine.

2003-Mar-14: Note about NOTIFY mechanism

Since February 2002 there have been experiments with a datagram-socket mechanism that sends the router a note about new jobs, and which the router in turn can use to notify the scheduler.

That mechanism has sped up low-activity delivery considerably, while high-activity operation has not been hampered in any obvious way.
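The general idea of such a wake-up can be sketched in Python as follows. This is not the actual ZMailer protocol: the socket type (a local AF_UNIX datagram socket pair), the b"NEW" message and the function names are assumptions made purely for illustration. The waiter sleeps for at most one scan interval but wakes at once when a datagram arrives.

```python
import select
import socket

def wait_for_work(sock, scan_interval):
    """Sleep until a notification arrives or the scan interval runs out."""
    readable, _, _ = select.select([sock], [], [], scan_interval)
    if readable:
        sock.recv(512)      # drain the datagram; its content is only a hint
        return "notified"
    return "timeout"        # nothing arrived, fall back to the periodic scan

def notify(sock, text=b"NEW"):
    """Send a small 'new job' note (the message text is an assumption)."""
    sock.send(text)

# A connected pair of local datagram sockets stands in for the two processes.
submitter_end, router_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
```

Because the timeout path is still there, a lost or missing notification costs no more than the old scan interval, which fits the observation that high-activity behaviour is unchanged.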


Matti Aarnio <mea@nic.funet.fi>