Re: Router dumping core
Yeah, I too spotted Rayan using Toronto systems from
UUNET Canada, but I didn't know he became its President..
> At 8:49 AM 6/28/94, Marco Hernandez wrote:
> >Greetings all:
> >
> > I'm just wondering if anyone has been experiencing their router
> >process dumping core. I am running version 2.90-940626 on a Sparc LX,
> >sunos 4.1.3_U1 ... Zmailer was compiled with gcc 2.5.8 ..
> >
> > Outside of a few warnings, compilation went fine ... I am running
> >six routerd, ... before I start digging into the dumps, I'd like to
> >know if anyone has had this problem ...
I am seeing it too, but it beats me what is up..
(careless string pointers somewhere?) Maybe I'll look into it tonight..
Maybe a problem with rfc822.ssl -- use the old version ?
> Marco -- there are a couple of problems that will occasionally cause a
> stock Zmailer (that is, the Toronto 2.2 version) to dump core and die. Both
> are rare enough that it takes a while to track down:
>
> 1] A header line that starts with just a colon, no keyword, e.g:
> : Foobar.
>
> This has a fairly easy fix (it's just an off-by-one error), which I believe
> Matti has incorporated into the most recent versions.
It is incorporated.
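
A minimal sketch of the kind of guard that avoids this class of crash. The actual off-by-one in Zmailer is not shown in this thread, so the function name header_keyword_len() and its return convention are purely hypothetical; the only point illustrated is that a line such as ": Foobar." has an empty keyword and should be rejected before any code indexes backwards from the colon.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Zmailer code): return the length of the
 * keyword that precedes the first ':' in a header line, or -1 if the
 * line has no colon or the keyword is empty.
 */
static long header_keyword_len(const char *line)
{
    const char *colon = strchr(line, ':');
    const char *end;

    /* No colon at all, or the colon is the very first character. */
    if (colon == NULL || colon == line)
        return -1;

    /* Trim blanks between the keyword and the colon, never going
     * past the start of the buffer. */
    end = colon;
    while (end > line && (end[-1] == ' ' || end[-1] == '\t'))
        end--;

    /* Only blanks before the colon: still an empty keyword. */
    if (end == line)
        return -1;

    return (long)(end - line);
}
```

A caller would treat -1 as a malformed header and skip or quarantine the line instead of parsing it further.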
> 2] A name-server CNAME loop -- that is, host A has a DNS entry that
> contains a CNAME record pointing to host B, and host B points back to host
> A.
>
> This Shouldn't Happen(tm), but it does, and because ZM-router does its own
> recursion over DNS entries, rather than having the resolver library do it,
> the result is an infinite recursion until the stack explodes. I've solved
> this problem by adding a bit of idiot-proofing in the form of a counter
> that gets set to zero just before a name lookup loop starts, gets bumped by
> one every time it recurses, and will return a tempfail and put the message
> on hold if the counter bumps over 25 (based on the theory that no
> legitimate host entry will ever require one to go 25 hops deep to find an
> address).
Well, my latest code has something similar, but the limit is
4.. (that is, it tolerates 3 recursions)
I had some problems with CZ-landian networks where there
were two CNAMEs pointing to each other, and routers crashing
after spinning a few minutes on recursion.. (There should
be only one CNAME in any resolution.)
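
The depth-counter idea discussed above can be sketched roughly as follows. This is not the Zmailer router code: the toy zone table, toy_lookup(), resolve_name(), and the status names are all assumptions made for illustration, and a real router would query the resolver instead of an in-memory array. The limit of 4 (three recursions tolerated) follows the figure mentioned in this reply; the other fix described in the thread uses 25.

```c
#include <stddef.h>
#include <string.h>

/* Assumed limit: the initial lookup plus at most 3 recursions. */
#define MAX_CNAME_DEPTH 4

enum resolve_status { RESOLVE_OK, RESOLVE_NOTFOUND, RESOLVE_TEMPFAIL };

/* Toy stand-in for DNS data: a name is either an alias or has an address. */
struct dns_entry {
    const char *name;
    const char *cname;    /* non-NULL: alias pointing at another name */
    const char *address;  /* non-NULL: final address */
};

static const struct dns_entry toy_zone[] = {
    { "mail.example", "mx.example", NULL },
    { "mx.example",   NULL,         "192.0.2.1" },
    { "a.example",    "b.example",  NULL },  /* CNAME loop: a -> b -> a */
    { "b.example",    "a.example",  NULL },
};

static const struct dns_entry *toy_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof toy_zone / sizeof toy_zone[0]; i++)
        if (strcmp(toy_zone[i].name, name) == 0)
            return &toy_zone[i];
    return NULL;
}

/*
 * Follow CNAMEs recursively, carrying the depth along. Callers start
 * with depth 0. When the limit is reached, give up with a temporary
 * failure instead of recursing until the stack overflows.
 */
static enum resolve_status
resolve_name(const char *name, int depth, const char **address_out)
{
    const struct dns_entry *e;

    if (depth >= MAX_CNAME_DEPTH)
        return RESOLVE_TEMPFAIL;

    e = toy_lookup(name);
    if (e == NULL)
        return RESOLVE_NOTFOUND;
    if (e->cname != NULL)
        return resolve_name(e->cname, depth + 1, address_out);

    *address_out = e->address;
    return RESOLVE_OK;
}
```

With this table, resolve_name("mail.example", 0, &addr) follows one alias and succeeds, while resolve_name("a.example", 0, &addr) stops with RESOLVE_TEMPFAIL once the depth reaches the limit, which is the point where the message would be put on hold.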
> The core dumps from this turned out to not be directly useful for
> debugging, since they were dumped after the stack was mangled, and thus,
> all of the calling chain was lost.
No, you must be lucky to spot it BEFORE it happens: keep the
router under a debugger, and break into it every now and then..
> Look for old files in postoffice/router (nothing in there should be older
> than an hour or so, so anything that is is probably what caused your core
> dump). Try using gdb over router and run router with '-i' to process them.
>
> I should probably contribute back my fix for #2, shouldn't I...:-)
Well, if you found some new problem :)
> Mikey
>
> --
> Michael Scott Shappe
> CIT Collaboration Systems
> "Me? I'm just a lawnmower. You can tell me by the way I walk."
> -- Genesis, "I Know What I Like", _Selling England by the Pound_
> RIPEM and PGP Public Keys available upon request
/Matti Aarnio <mea@nic.funet.fi>