Re: Router dumping core



Yeah, I too spotted Rayan using Toronto systems from
UUNET Canada, but I didn't know he became its President..

> At 8:49 AM 6/28/94, Marco Hernandez wrote:
> >Greetings all:
> >
> >        I'm just wondering if anyone has been experiencing their router
> >process dumping core.  I am running version 2.90-940626 on a Sparc LX,
> >sunos 4.1.3_U1 ...  Zmailer was compiled with gcc 2.5.8  ..
> >
> >        Outside of a few warnings, compilation went fine ...  I am running
> >six routerd, ... before I start lurking into the dumps, I'd like to
> >know if anyone has had this problem ...

	I am seeing it too, but beats me what is up..
	(careless string-pointers somewhere ?)  Maybe I can look at it tonight..
	Maybe a problem with  rfc822.ssl -- use the old version ?

> Marco -- there are a couple of problems that will occasionally cause a
> stock Zmailer (that is, the Toronto 2.2 version) to dump core and die. Both
> are rare enough that it takes a while to track down:
> 
> 1] A header line that starts with just a colon, no keyword, e.g:
> : Foobar.
> 
> This has a fairly easy fix (it's just an off-by-one error), which I believe
> Matti has incorporated into the most recent versions.

	It is incorporated.
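
For illustration only, here is one way an empty header keyword can turn
into an off-by-one. This is a sketch, not the actual Zmailer code or the
actual fix; the function name header_name_length is made up for the
example. The idea is that trimming blanks before the colon reads
line[len - 1], which is one byte before the buffer when the line starts
with ':' unless the length is checked first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from Zmailer.
 * Returns the length of the header field name in `line` (the text before
 * the first ':' with trailing blanks trimmed), or -1 if the line has no
 * colon or an empty field name such as ": Foobar." */
long header_name_length(const char *line)
{
    const char *colon;
    size_t len;

    assert(line != NULL);

    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;

    len = (size_t)(colon - line);

    /* Test len > 0 BEFORE reading line[len - 1]; without that test a
     * line starting with ':' would index one byte before the buffer. */
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        len--;

    return len == 0 ? -1 : (long)len;
}
```

With this guard, a line like ": Foobar." is reported as malformed (-1)
instead of being parsed with a bogus keyword.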

> 2] A name-server CNAME loop -- that is, host A has a DNS entry that
> contains a CNAME record pointing to host B, and host B points back to host
> A.
> 
> This Shouldn't Happen(tm), but it does, and because ZM-router does its own
> recursion over DNS entries, rather than having the resolver library do it,
> the result is an infinite recursion until the stack explodes. I've solved
> this problem by adding a bit of idiot-proofing in the form of a counter
> that gets set to zero just before a name lookup loop starts, gets bumped by
> one every time it recurses, and will return a tempfail and put the message
> on hold if the counter bumps over 25 (based on the theory that no
> legitimate host entry will ever require one to go 25 hops deep to find an
> address).

	Well, my latest code has something similar, but the limit is
	4.. (that is, it tolerates 3 recursions)

	I had some problems with CZ-landian networks where there
	were two CNAMEs pointing to each other, and routers crashing
	after spinning a few minutes on recursion..  (There should
	be only one CNAME in any resolution.)
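
The counter idea discussed above can be sketched roughly as follows.
This is a minimal sketch, not the router's real resolver code: the
in-memory `zone` table stands in for DNS, and every name in it
(lookup_address, follow, MAX_CNAME_DEPTH, the example.org hosts) is
made up for the example. The limit of 25 is the one from the quoted
description; a stricter build would simply use a smaller constant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DNS data; not Zmailer's data model.
 * An entry is either an alias (cname set) or an address (addr set). */
struct record {
    const char *name;
    const char *cname;   /* non-NULL: alias pointing at another name */
    const char *addr;    /* non-NULL: final address */
};

static const struct record zone[] = {
    { "www.example.org",  "host.example.org", NULL },
    { "host.example.org", NULL,               "192.0.2.10" },
    { "a.example.org",    "b.example.org",    NULL },  /* loop: a -> b */
    { "b.example.org",    "a.example.org",    NULL },  /*       b -> a */
};

enum lookup_status { LOOKUP_OK, LOOKUP_NOTFOUND, LOOKUP_TEMPFAIL };

#define MAX_CNAME_DEPTH 25   /* assumed limit, per the quoted description */

static const struct record *find_record(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof zone / sizeof zone[0]; i++)
        if (strcmp(zone[i].name, name) == 0)
            return &zone[i];
    return NULL;
}

/* Follow CNAMEs recursively, bumping *depth once per recursion. */
static enum lookup_status follow(const char *name, int *depth,
                                 const char **addr)
{
    const struct record *r = find_record(name);

    if (r == NULL)
        return LOOKUP_NOTFOUND;
    if (r->cname != NULL) {
        if (++*depth > MAX_CNAME_DEPTH)
            return LOOKUP_TEMPFAIL;   /* give up instead of blowing the stack */
        return follow(r->cname, depth, addr);
    }
    *addr = r->addr;
    return LOOKUP_OK;
}

/* Entry point: the counter is reset to zero before each lookup starts. */
enum lookup_status lookup_address(const char *name, const char **addr)
{
    int depth = 0;

    assert(name != NULL && addr != NULL);
    *addr = NULL;
    return follow(name, &depth, addr);
}
```

A looping pair such as a.example.org and b.example.org then comes back
as LOOKUP_TEMPFAIL after a bounded number of hops, so the caller can put
the message on hold rather than recursing until the stack is gone.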

> The core dumps from this turned out to not be directly useful for
> debugging, since they were dumped after the stack was mangled, and thus,
> all of the calling chain was lost.

	No, you have to be lucky and spot it BEFORE it happens: keep
	the router under a debugger, and break into it every now and then..

> Look for old files in postoffice/router (nothing in there should be older
> than an hour or so, so anything that is is probably what caused your core
> dump). Try using gdb over router and run router with '-i' to process them.
> 
> I should probably contribute back my fix for #2, shouldn't I...:-)

	Well, if you found some new problem :)

> Mikey
> 
> --
> Michael Scott Shappe
> CIT Collaboration Systems
> "Me? I'm just a lawnmower. You can tell me by the way I walk."
>         -- Genesis, "I Know What I Like", _Selling England by the Pound_
> RIPEM and PGP Public Keys available upon request

	/Matti Aarnio	<mea@nic.funet.fi>