
Re: ZMailer scheduler stuffing up severely...



On Fri, Oct 12, 2001 at 03:34:27PM -0700, Michael Loftis wrote:
> Matti Aarnio wrote:
> > On Mon, Oct 08, 2001 at 02:06:15PM -0700, Michael Loftis wrote:
> > >
> > > OK my system is doing deliveries but not at what it is capable of.
> > > The routers seem to be picking up from the router dir plenty fast
> > > enough, but scheduler falls *massively* behind in processing queue.
> > > If I stop zmailer, then restart it (with the synchronous start option)
> > > it reads in the queue and *FLIES* just fine.
> > > But during normal operations the damned thing just stuffs up.
> > > What's the deal?
> >
> >    Which version of ZMailer ?  At what kind of system ?
> >    ( UNIX version, and hardware, amount of RAM. )
> 
> Zmailer 2.99.55
> FreeBSD 4.3-p20
> 2xPii450 512MB RAM

  A similar-sized machine is used by my employer as the primary customer
  email hub relay.  Its 24h average is about 8-10 mails per second.
  (It is a 2xUltraSPARC running Solaris, but otherwise of similar size.)

> >    I think I have seen this jamming phenomenon once or twice a while ago.
> >
> >    Ah! Recollection hits!  I guess the bug is due to crash of "timeserver".
> >    But why it crashes/stops ?  And why won't it recover ?  Can you tell
> >    me more about your system ? (Hardware, amount of memory, UNIX version ...)
> 
> timeserver?  Well the scheduler keeps going, but it doesn't seem to be
> picking up new messages that are dropped off by router, or if it is it's
> going about it majorly slowly.  It's also very slow at getting them out.
> We've routinely seen the thing break 1Mbit a second at a run for 10-15
> minutes or more, but seems that things have stuffed up at times though.

   The input queue scanning is timer-driven, and when that clock stops
   ticking, the main control loop never reaches the next scheduled time
   to do the scan...

   Also the whole scheduling of pushing new things (or NOT forking new
   processes) is controlled by the advancement of that clock.
   When it stops, all time-scheduled events freeze, and timed quotas
   saturate because the timer does not advance, and ...

   A friend of mine said that he was impressed when one system kept
   pushing queued email to him over a trans-atlantic connection at
   full wire speed (including the round-trips!).  Whether he has a
   0.5 or 2.0 meg link, I don't remember, but Z was pushing, and
   Exim was receiving.

> >    ZMailer scheduler contains a shared-segment subprocess which ticks
> >    2-3 times a second, and updates gettimeofday() data in that segment.
.... 
> Ahhhh!!!!  OK!  I've got it then!!  We're running a fairly high process
> server and probably you're eating us out of shared memory segments!

   The actual implementation is trivialish:
		mmap(NULL, mapize, PROT_READ|PROT_WRITE,
		     MAP_ANONYMOUS|MAP_VARIABLE|MAP_SHARED, -1, 0)
   and then
		fork()
   where the child is the ``timer'' process.
   This usually works.
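
   For illustration, a minimal sketch of that pattern, not the actual
   scheduler code.  The names shared_clock, start_ticker(), shared_now()
   and stop_ticker() are made up for this example, MAP_VARIABLE is left
   out because many systems do not define it, and only the seconds field
   of gettimeofday() is kept in the segment:

```c
#define _DEFAULT_SOURCE   /* glibc: expose MAP_ANONYMOUS under strict -std modes */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON   /* BSD spelling of the same flag */
#endif

/* Hypothetical layout of the shared segment; not the real ZMailer struct. */
struct shared_clock {
    volatile time_t sec;
};

static struct shared_clock *clk = NULL;
static pid_t ticker_pid = -1;

/* Map the segment first, then fork the child that keeps it up to date. */
int start_ticker(void)
{
    pid_t parent = getpid();
    struct timeval tv;
    void *p = mmap(NULL, sizeof(struct shared_clock), PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    clk = p;
    gettimeofday(&tv, NULL);
    clk->sec = tv.tv_sec;             /* valid even before the first tick */

    ticker_pid = fork();
    if (ticker_pid < 0) {
        munmap(p, sizeof(struct shared_clock));
        clk = NULL;
        return -1;
    }
    if (ticker_pid == 0) {
        struct timespec nap = { 0, 400 * 1000 * 1000 };  /* about 2.5 ticks/s */
        while (getppid() == parent) { /* stop once the parent is gone */
            gettimeofday(&tv, NULL);
            clk->sec = tv.tv_sec;
            nanosleep(&nap, NULL);
        }
        _exit(0);
    }
    return 0;
}

/* Cheap clock read in the parent: a memory load, no syscall. */
time_t shared_now(void)
{
    return clk ? clk->sec : (time_t)-1;
}

/* Terminate and reap the ticker child. */
void stop_ticker(void)
{
    if (ticker_pid > 0) {
        kill(ticker_pid, SIGTERM);
        waitpid(ticker_pid, NULL, 0);
        ticker_pid = -1;
    }
}
```

   The point to notice is that once the child dies, shared_now() keeps
   returning the last value written, and nothing in the parent notices.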

   On systems without  MAP_ANONYMOUS, and always on Linux, the code
   opens a temporary file, fills it with zeros, and then mmap()s
   that file into memory before forking.

   FreeBSD definitely isn't Linux in this case.
   Does FreeBSD have  MAP_ANONYMOUS ?


   There are alternate ways to implement this timer thing without
   loading the system too much with frequent periodic work.
   The chief contender for a usable replacement is the ``setitimer(2)''
   syscall.
   (But not tonight, too much wine at the dinner for coding..)

   If you want to try making that alternate time ticker, look for the
   init_timeserver()  and  mytime()  functions in  scheduler/scheduler.c.

   It should be an essentially simple task: initialize SIGALRM signal
   handling, prime the itimer with  setitimer(  ITIMER_REAL  ) and
   rearm it at every trigger.
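
   A rough sketch of how that could look, assuming a cached clock that
   is refreshed at most once per tick.  init_itimer_clock() and
   cached_time() are hypothetical names, not the existing
   init_timeserver() / mytime().  Instead of rearming by hand, it sets
   it_interval so the kernel rearms the timer itself, and the handler
   only raises a flag, since calling much else from a signal handler is
   not safe:

```c
#define _DEFAULT_SOURCE   /* glibc: expose sigaction/setitimer under strict -std modes */
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t clock_stale = 1;  /* 1 = cached value needs a refresh */
static struct timeval cached_now;

/* SIGALRM handler: only marks the cached time as out of date. */
static void on_alarm(int signo)
{
    (void)signo;
    clock_stale = 1;
}

/* Install the handler and start a periodic ITIMER_REAL.
 * interval_usec must be > 0; a zero it_value would disarm the timer. */
int init_itimer_clock(long interval_usec)
{
    struct sigaction sa;
    struct itimerval it;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;          /* restart interrupted syscalls where possible */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    it.it_interval.tv_sec  = interval_usec / 1000000;
    it.it_interval.tv_usec = interval_usec % 1000000;
    it.it_value = it.it_interval;      /* first expiry after one interval */
    return setitimer(ITIMER_REAL, &it, NULL);
}

/* Cheap clock: at most one gettimeofday() call per timer tick. */
time_t cached_time(void)
{
    if (clock_stale) {
        clock_stale = 0;               /* clear first, so a tick arriving now is not lost */
        gettimeofday(&cached_now, NULL);
    }
    return cached_now.tv_sec;
}
```

   Calling init_itimer_clock(400000) once at startup would give about
   the same 2-3 ticks per second as now, without the extra process and
   without the shared segment.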


   You could also instrument the  mq2  to detect non-advancement of
   the  mytime() (its result being more than 3-5 seconds in the
   past ?)  and trigger some recovery method to restart it.
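
   The check itself could be as small as this sketch.
   clock_is_stalled() is a hypothetical helper; the caller would pass
   the mytime() result as cheap_now and a fresh time(NULL) as real_now:

```c
#include <time.h>

/* Returns 1 when the cheap clock lags the real clock by more than
 * max_lag_sec seconds, 0 otherwise (including when it is ahead). */
int clock_is_stalled(time_t cheap_now, time_t real_now, long max_lag_sec)
{
    return difftime(real_now, cheap_now) > (double)max_lag_sec;
}
```

   With a limit of 5 seconds, a clock reading 100 against a real time of
   110 counts as stalled, while 100 against 103 does not.  What the
   recovery should then do is a separate question.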

> > <THWACK!>
> >    Death of timeserver ?
> 
> Sounds like it, I'll make some adjustments to shared memory and PMAPs and see
> what we can get.  Probably need to beef up NMBCLUSTERs as well.
> 
> *todders off to do kernel tuning*  If anyone has any recommendations
> I'd love to hear them.

-- 
/Matti Aarnio	<mea@nic.funet.fi>