Re: Fix to use the saved errno in scheduler/transport.c
On Tue, Oct 18, 2005 at 11:15:31AM +0100, Alex Kiernan wrote:
> We've had a long-standing problem of the scheduler aborting very
> occasionally. I went and stared at the relevant code earlier today and
> it looks like it might be the classic problem of errno getting
> overwritten by a later syscall. What's stranger is that the code has
> all the machinery to avoid the problem, it just doesn't actually use
> the saved value!
>
> I see the problem so infrequently I can't really test it, but I'm
> hoping this fixes it.
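For readers following along, the pattern under discussion can be
sketched roughly as below. This is only an illustration and not the
code from scheduler/transport.c: the helper name write_and_report()
and the variable saved_errno are made up for the example, and the
"errno = 0" line merely simulates a later call clobbering errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, NOT taken from scheduler/transport.c.
 * Writes to fd; on failure returns the errno set by write() itself,
 * on success returns 0. */
static int write_and_report(int fd, const void *buf, size_t len)
{
    if (write(fd, buf, len) < 0) {
        int saved_errno = errno;   /* capture before any other call */

        /* Logging may issue its own syscalls and change errno. */
        fprintf(stderr, "write to fd %d failed\n", fd);
        errno = 0;                 /* simulate a later call clobbering errno */

        /* Decide on the saved value, never on errno itself. */
        fprintf(stderr, "reason: %s\n", strerror(saved_errno));
        return saved_errno;
    }
    return 0;
}
```

The bug described above corresponds to saving the value but then
testing errno again afterwards instead of saved_errno.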
Ouch.. Aeons ago I removed a bunch of abort()s from other
subsystems, but I have obviously left them in here..
Use of abort() in code should be reserved for cases of serious
s***t happening, which usually manifests itself as a SIGSEGV anyway..
A syscall returning an odd error code is not truly fatal.
I haven't encountered this error at all, which doesn't really
preclude it from being real. (In smtpserver I had a number of
odd cases where error processing failed hard -- but prolonged
exposure to Solaris as a system environment cured me of most such
false expectations there..)
I think your fix gets at least half-way there, but the true fix is
to make the system get upset only about things that are truly worth
the upset, and otherwise just log and ignore them.
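A rough sketch of what "log and ignore" could look like in place of
abort() follows. Everything in it is an assumption made for the
example: the names err_action, classify_errno() and
handle_syscall_error() are hypothetical, and treating EINTR/EAGAIN
as "retry" is one possible policy, not what the scheduler does today.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy: what the caller should do after a failed syscall. */
enum err_action { ERR_RETRY, ERR_LOG_AND_CONTINUE };

/* Assumed classification: transient errors are retried, everything
 * else is logged and ignored. Nothing here calls abort(). */
static enum err_action classify_errno(int saved_errno)
{
    switch (saved_errno) {
    case EINTR:
    case EAGAIN:
        return ERR_RETRY;
    default:
        return ERR_LOG_AND_CONTINUE;
    }
}

/* Takes the errno value saved right after the failing call, logs
 * unexpected codes to stderr, and tells the caller how to proceed. */
static enum err_action handle_syscall_error(const char *what, int saved_errno)
{
    enum err_action action = classify_errno(saved_errno);

    if (action == ERR_LOG_AND_CONTINUE)
        fprintf(stderr, "%s: unexpected error %d (%s), ignoring\n",
                what, saved_errno, strerror(saved_errno));
    return action;
}
```

A caller would pass in the saved errno, for example
handle_syscall_error("write", saved_errno), and carry on with the
next job instead of dumping core.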
> --
> Alex Kiernan
--
/Matti Aarnio <mea@nic.funet.fi>