Re: weekend datadump at mea.tmt.tele.fi
[talking of smtp becoming stuck, SIGALRM being ineffective]
> Interesting ... I never upgraded to 2.99.48 on my Solaris systems because
> I never saw the stuck 'smtp' processes as on Linux. I am still happily
> running 2.99.47 here. Also, since running 2.99.48p3 on Linux, I have not
> seen the stuck 'smtp' processes. Just a datapoint ... but it would seem
> to be saying that something that's recently changed is causing the
> problem.
Solaris is weird. The same code works fine on Linux,
SGI, and FreeBSD, but gets stuck on Solaris...
Recently our main mail relay's smtp-server triggered an alert
on the periodic service quality monitor by writing the following
to the log (and then doing an exit()):
00000# started server pid 460 at Thu, 3 Apr 1997 17:51:59 +0300
000000# accept(): No child processes
00000# started server pid 27408 at Sat, 5 Apr 1997 18:57:16 +0300
Until then the smtpserver treated such an error as a major
catastrophe, but I have now decided that since accept()
on Solaris can return weird errors not listed in its
documentation, who am I to argue? Sigh...
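One way to express that kind of tolerance is sketched below. It is only an illustration, not the actual smtpserver code: the helper name accept_error_is_fatal() and the choice of which errno values count as fatal are assumptions. The idea is that only errors meaning the listening descriptor itself is unusable end the server, while anything else (including an undocumented ECHILD) is logged and accept() is retried.

```c
#include <errno.h>

/* Hypothetical helper, not taken from the smtpserver source.
 * Returns 1 when an accept() failure means the listening socket
 * itself is unusable, 0 when the caller should log and retry.
 * The list of fatal codes is an assumption made for this sketch. */
int accept_error_is_fatal(int err)
{
    switch (err) {
    case EBADF:     /* not a valid descriptor */
    case ENOTSOCK:  /* descriptor is not a socket */
    case EINVAL:    /* socket is not listening */
    case EFAULT:    /* bad address argument */
        return 1;
    default:        /* ECHILD, EINTR, and other surprises: retry */
        return 0;
    }
}
```

A caller would test the errno left by a failed accept() with this helper and exit only when it returns 1.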
Perhaps the freeze of the smtp client is a similar
Solaris oddity... Hmm.. or..
It is possible that Solaris uses POSIX sigaction()
and REQUIRES the SA_NODEFER flag. The ALRM handler
does, after all, do a longjmp and never returns
from the signal handler as such...
In that case the first timeout works and the second one
jams, presumably because SIGALRM is left blocked after
the jump out of the handler..
> Roy
> rcb@press-gopher.uchicago.edu
/Matti Aarnio <mea@nic.funet.fi>