Re: [ZMailer] Zmailer crashes
On Wed, Feb 04, 2009 at 03:09:55PM +0000, Ralf Baechle wrote:
> > On Fri, Jan 30, 2009 at 03:32:31PM -0800, Neal Morgan wrote:
> > > > On October 31, 2008 9:03 AM Ralf Baechle wrote
> > > > For quite a while I've been observing these kernel messages on a Linux
> > > > x86_64 system:
> > > >
> > > > sm[3270]: segfault at 3ba7f9f0 ip 79fbc9 sp 7fffe7c48e30 error 6 in libc-2.7.so[72d000+14d000]
> > > > sm[3493] trap stack segment ip:7f0e2a121bc9 sp:7fff3240e4a0 error:0
> > > > sm[3773]: segfault at 3ba7f9f0 ip 79fbc9 sp 7fff55499680 error 6 in libc-2.7.so[72d000+14d000]
> > >
> > > Matti: I've been seeing these across 4 servers:
> > >
> > > kernel: smtpserver[31693]: segfault at 00000000 eip b7c16371 esp bf94b018 error 4
> > >
> > > kernel: router[9934]: segfault at 00000008 eip 0807fa95 esp bfdf5570 error 4
> > >
> > > The interesting thing is it only happens when booted into a 2.6.24
> > > kernel. If I reboot the same box into a 2.6.18 kernel everything runs
> > > fine (and there are no segfaults).
>
> Older kernels don't emit this segfault message. It was added in
> commit abd4f7505bafdd6c5319fe3cb5caf9af6104e17a, which went into 2.6.23.
> Could that be why you didn't notice it earlier?
>
> > I see them too, with a 2.6.26 kernel on the zmailer.org server.
> > A few hits per week according to the kernel dmesg logs.
> >
> > I suspect glibc doing something stupid rather than the program really
> > going over the edge, but these are so rare that debugging them is next
> > to impossible. Previously I have seen them happen after the program
> > has called exit(0).
> >
> > Anyway I have turned on core dumps to be able to see what happens.
>
> I've seen Zmailer stop delivering mail or stop accepting connections
> on port 25. The issue hits relatively infrequently, but I decided to
> follow your example and just turned on core dumps; it is affecting sm,
> smtpserver and router. Lately the frequency of this issue striking
> seems to have increased significantly - I wonder if that's due to me
> checking for it more frequently or due to my extremely inflated mail
> queue with over 1,700,000 stored messages.
>
> Ironically I seem to have gotten another router segfault just seconds
> before I enabled core dumps ...

To close this old case - the issue went away for me after upgrading the
system from Fedora 8 to Fedora 10. So I assume that, as Matti suspected,
there was indeed something toxic in glibc.

Ralf
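
For anyone reading the quoted kernel lines later: the fields can be decoded
mechanically. The following is only a rough sketch, not anything shipped with
ZMailer or the kernel. The names SEGFAULT_RE and decode_segfault are made up
for illustration, and the regex only covers the message variants quoted in
this thread. It relies on the x86 page-fault error code bits (bit 0 = page
was present, bit 1 = write access, bit 2 = user mode) and it assumes the
mapping shown in brackets starts at file offset 0, so that ip minus the
mapping base gives an offset into the library.

```python
import re

# Matches the segfault lines quoted in this thread, e.g.
#   sm[3270]: segfault at 3ba7f9f0 ip 79fbc9 sp 7fffe7c48e30 error 6 in libc-2.7.so[72d000+14d000]
#   kernel: smtpserver[31693]: segfault at 00000000 eip b7c16371 esp bf94b018 error 4
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]:? segfault at (?P<addr>[0-9a-f]+) "
    r"(?:ip|eip|rip) (?P<ip>[0-9a-f]+) (?:sp|esp|rsp) (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return a dict describing one kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    info = {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "page_present": bool(err & 1),               # bit 0
        "access": "write" if err & 2 else "read",    # bit 1
        "mode": "user" if err & 4 else "kernel",     # bit 2
        "object": m.group("obj"),
        "offset": None,
    }
    if m.group("base") is not None:
        # Assumption: the mapping starts at file offset 0 of the object.
        info["offset"] = info["ip"] - int(m.group("base"), 16)
    return info


if __name__ == "__main__":
    line = ("sm[3270]: segfault at 3ba7f9f0 ip 79fbc9 sp 7fffe7c48e30 "
            "error 6 in libc-2.7.so[72d000+14d000]")
    info = decode_segfault(line)
    print(info["prog"], info["access"], info["mode"], hex(info["offset"]))
```

Under those assumptions, "error 6" reads as a user-mode write to a page that
was not mapped, and "error 4" as a user-mode read of an unmapped page. The
smtpserver and router lines are then reads at or near address 0, which looks
like a NULL pointer dereference.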