
Re: scheduler log entries



	Marco showed me a weird-looking error log.
	I am quite tired after 20 hours of debugging, but
	let's see if I can make any sense of it..

> Here are some very strange log entries (scheduler)

	Hmm.. Is this a screen capture, or a piece cut from the actual
	file with emacs?  I think it was a screen capture.  CRs do
	create funny effects in that case..

	Anyway, I was able to figure out the problem behind the
	hold/UUUUUU case.  (Well, I think it is behind that one too..)

> os:rom owner-frognet-@list.cren.net 
> deferredafter 3 days, problem was:rom owner-frognet-@list.cren.net 
> deferredafter 3 days, problem was:pa.us./a from owner-frognet-@list.cren.net 

	CRs fold the display.  Look at it with 'less' to REALLY see it :)
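
	If 'less' is not at hand, a few lines of C can make the CRs
	visible too.  This is only a sketch; the function name
	show_cr() is made up for this example and is not part of
	ZMailer:

```c
#include <stddef.h>

/* Copy src into dst, replacing each carriage return with the two
 * characters "^M" so that it no longer folds the terminal display.
 * Returns the number of bytes written, not counting the final NUL.
 * The output is silently truncated if dst is too small. */
size_t show_cr(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;

    if (dstlen == 0)
        return 0;
    for (; *src != '\0'; src++) {
        if (*src == '\r') {
            if (n + 2 >= dstlen)    /* need room for "^M" plus NUL */
                break;
            dst[n++] = '^';
            dst[n++] = 'M';
        } else {
            if (n + 1 >= dstlen)    /* need room for one byte plus NUL */
                break;
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```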

> HELP! Lost 93 bytes (n=0/0, off=84): 'hold: Cannot open control file "hold/11403
> " from "/var/spool/postoffice/scheduler" for "hold/UUUUUUUUUUUUUUUUUUUUUUUUUUUUU
> UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
....
> UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
> UU€'

	Here the malloc()ed buffer holding the target "host" entry has
	already been free()ed, but it is accessed anyway...

	It all relates to the following:
	
> HELP! Lost 93 bytes (n=0/0, off=0): 'hold: Cannot open control file "hold/11794"
>  from "/var/spool/postoffice/scheduler" for "hold/'
> HELP! Lost 93 bytes (n=0/0, off=0): 'hold: Cannot open control file "hold/11365"
>  from "/var/spool/postoffice/scheduler" for "hold/'

	[ The stats report from the hold channel was not complete; its
	  missing trailing newline is what causes the "HELP! Lost ..."
	  message to be reported.. ]

	There was a piece of email with multiple addresses, and it
	was linked into the system under several paths of the style:
		channel/host/idnumber
	However, when the "vertex" containing this particular entry was
	scheduled for transport, the internal scheduling code did not
	pick the same path as the code that links the message into
	the directories.  Instead it picked the first of the multiple
	paths, and THAT one had been free()ed a bit earlier, when
	the hold/"host" entry was processed...
	('U' = 0x55 is the marker that the UofToronto malloc() writes
	 into free()ed memory buffers.)
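
	A long run of 'U' in a log line is therefore a strong hint of
	a use-after-free.  A minimal sketch for spotting such runs;
	the helper name longest_poison_run() is hypothetical, and the
	0x55 fill value is the only thing taken from the above:

```c
#include <stddef.h>

/* Return the length of the longest run of consecutive 0x55 ('U')
 * bytes found in buf[0..len-1]. */
size_t longest_poison_run(const unsigned char *buf, size_t len)
{
    size_t best = 0, cur = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0x55) {
            if (++cur > best)
                best = cur;
        } else {
            cur = 0;
        }
    }
    return best;
}
```

	What counts as a "long" run is a judgment call; a handful of
	'U's can be ordinary text, dozens of them hardly ever are.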

	Therefore ALL transporters were called with a reference to one
	particular instance of the links, and if that one was processed
	AND DELETED before the others were submitted and eventually
	"arrived" for processing, THEN "Cannot open
	... CHANNEL/HOST/IDNUM ..." would appear.
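
	The idea of the fix can be sketched as follows.  This is NOT
	the actual scheduler source; the structs, the function
	link_path() and the host names in the usage note are invented
	for illustration.  The point is only that each vertex must be
	handed the path of its OWN link, never blindly the first one:

```c
#include <stdio.h>

/* Hypothetical model: one spool link per (channel, host) pair. */
struct link {
    const char *channel;
    const char *host;
};

struct message {
    long idnumber;
    const struct link *links;
    size_t nlinks;
};

/* Format the control-file path for link number idx of msg as
 * channel/host/idnumber.  Returns 0 on success, -1 if idx is out
 * of range or the path does not fit into buf. */
int link_path(const struct message *msg, size_t idx,
              char *buf, size_t buflen)
{
    int n;

    if (idx >= msg->nlinks)
        return -1;
    n = snprintf(buf, buflen, "%s/%s/%ld",
                 msg->links[idx].channel,
                 msg->links[idx].host,
                 msg->idnumber);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

	With links { "hold", "hostA" } and { "smtp", "hostB" } on
	message 11403, the smtp vertex asks for index 1 and gets
	smtp/hostB/11403, regardless of what has already happened
	to the hold link.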


	This manifested itself clearly when the scheduler was run
	with "-s" (subsubdirs mode), but even without it things
	did not go quite smoothly, especially when emails went
	to both the local and the smtp channels, for example.

> Misformed diagnostic: ld.so.1: /usr/local/mail/bin/ta/smtp: fatal: unable to pro
> cess PLT entry at 0x16930 0x0: bad entry offset
> Misformed diagnostic: 
> Misformed diagnostic: ld.so.1: /usr/local/mail/bin/ta/smtp: fatal: unable to pro
> cess PLT entry at 0x16930 0x0: bad entry offset

	???? Something is seriously wrong with your machine:
	     starting the SMTP transport binary crashes in
	     dynamic linking/loading (ld.so.1).

	What is your  LD_LIBRARY_PATH  set to when the scheduler starts?


> Misformed diagnostic: 
> deferredafter 3 days, problem was:du.au/cname from owner-aahesgit@list.cren.net 

	Umm.. No idea where that "Misformed diagnostic:" belongs..
	It must be another CR-induced mirage/display mangle, as
	above.

> ANy hints ..

	Yup.  2.99.12 is now at ftp.funet.fi, and has that fix in
	the scheduler.  :-)

	Also, with the scheduler now fixed (? -- will I ever learn
	not to be excessively optimistic?) I am ready to call it time
	for ZMailer 3.0 -- perhaps I will spend a month fixing
	"small" bugs people report, and writing updated documentation.

> /Marco

	/Matti Aarnio	<mea@nic.funet.fi> <mea@utu.fi>