Re: Scheduler is losing track of successful deliveries (solution)



I've solved my problem, and I figured I'd share it with the rest of the
list since I know several other sites are running zmailer on SGI boxes. 
This problem may not even be specific to IRIX 4.* ...

On Fri, 5 Feb 1993, Andy Poling wrote:
> I'm seeing this problem consistently.  After a transport agent has
> successfully delivered a message (shown by a "r+" in the ctl file) scheduler
> sometimes seems to miss it and keeps re-trying it.  This is a pain because
> all the transport agent does is ignore it, saying nothing to scheduler which
> promptly tries it again.  I'm losing a significant number of cycles to this
> looping mis-behavior.  

IRIX 4.* uses FNONBLK as part of its POSIX compliance instead of the
traditional BSD FNONBLOCK, so transport.c, which checks only for
FNONBLOCK, was being compiled to use FNDELAY instead.  With FNDELAY, a
read() on an empty pipe returns 0, which the code in transport.c takes
to mean EOF.  It never went back to read anything more from the pipe;
it went straight to wait() on the child.

This also explains the problem I was seeing with scheduler waiting on a
child which was blocked writing into the pipe (which was full). 
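
To illustrate the difference, here is a minimal sketch (not the actual
zmailer code) of reading a pipe in POSIX non-blocking mode.  It assumes
a system that provides O_NONBLOCK, where a read() on an empty pipe whose
writer is still open fails with EAGAIN, and a return of 0 really does
mean that every writer has closed.  The names set_nonblocking(),
poll_pipe() and enum pipe_state are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for one non-blocking read attempt. */
enum pipe_state { PIPE_DATA, PIPE_EMPTY, PIPE_EOF, PIPE_ERROR };

/* Put fd into POSIX non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try one read() and classify the outcome.  With O_NONBLOCK set,
 * "nothing to read yet" shows up as -1 with errno EAGAIN, so it can
 * no longer be confused with a true end-of-file (return value 0).
 */
enum pipe_state poll_pipe(int fd, char *buf, size_t len, ssize_t *nread)
{
    ssize_t n;

    assert(buf != NULL && nread != NULL);
    n = read(fd, buf, len);
    *nread = n;

    if (n > 0)
        return PIPE_DATA;       /* got some bytes */
    if (n == 0)
        return PIPE_EOF;        /* all writers have closed the pipe */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return PIPE_EMPTY;      /* child still running, try again later */
    return PIPE_ERROR;          /* genuine read error */
}
```

A caller would only go on to wait() for the child after seeing PIPE_EOF,
and would keep polling (or select()ing) on PIPE_EMPTY.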

Adding the following code near the top of transport.c (anywhere after
fcntl.h is included) solves the problem:

#ifdef FNONBLK
#define FNONBLOCK FNONBLK
#endif

> -Andy

I still think it would be a good idea for scheduler to do some kind of
sanity check to see if it is repeatedly retrying the same address with no
diagnostics in return - it can burn a helluvalot of cycles constantly
forking and execing transport agents... believe me.  This little episode
put a serious strain on my process accounting system by generating over
five times the usual number of process accounting records in the course of
a day.  :-)
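
For what it's worth, the kind of sanity check I have in mind could be as
simple as the sketch below.  Nothing here exists in scheduler today:
struct retry_state, note_attempt() and the threshold of 5 are all
assumptions made up for illustration, and a real version would have to
hook into however scheduler tracks each address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold; the right value would need tuning. */
#define MAX_SILENT_RETRIES 5

/* Hypothetical per-address bookkeeping. */
struct retry_state {
    unsigned silent_attempts;   /* consecutive tries with no diagnostics */
};

/*
 * Record the outcome of one delivery attempt.  Returns true once the
 * address has been retried MAX_SILENT_RETRIES times in a row without
 * the transport agent reporting anything back, meaning it should be
 * held (or at least logged) rather than retried yet again.
 */
bool note_attempt(struct retry_state *st, bool got_diagnostic)
{
    assert(st != NULL);

    if (got_diagnostic) {
        st->silent_attempts = 0;    /* agent said something: reset */
        return false;
    }
    st->silent_attempts++;
    return st->silent_attempts >= MAX_SILENT_RETRIES;
}
```

Any diagnostic from the agent resets the count, so only a run of
completely silent attempts trips the check.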

-Andy

Andy Poling                              Internet: andy@jhunix.hcf.jhu.edu
UNIX Systems Programmer                  Bitnet: ANDY@JHUNIX
Homewood Academic Computing              Voice: (410)516-8096    
Johns Hopkins University                 UUCP: uunet!mimsy!aplcen!jhunix!andy