
Re: Router locking mechanism breaks




> 
> > Hi,
> > 
> > ZMailer router 2.99.38 on SunOS 5.5, configured with
> > --with-bundled-libresolv --with-gcc.
> > 
> > Intermittently, we found 2 routers processing the same message file and
> > producing duplicate mail.  In /var/log/mail/router, I found that
> > 
> > <"message ID ....">: file: 102429-17756 <xxx> => ....
> > <"message ID ....">: file: 102429-17757 <xxx> => ....
> > 
> > were logged at almost the same time.  It seems that the multi-router locking
> > mechanism "inode-pid" breaks.  I don't know whether it helps, but most of
> > the incidents involve messages submitted by smtpserver.  The problem was
> > found in 2.99.26, which is why we upgraded to 2.99.38, but it is still
> > not solved.  ;(
> 
> The reason for this breakage may be revealed by the following:
> 	zmailer-src> egrep RENAME config.h
> 
> If it is: "#define HAVE_RENAME 1"  then I am rather mystified.

Yes, HAVE_RENAME is #defined.

I'm mystified by that too.  Two router processes pick up the same message file
and each puts its own control file and message file into the transport/ and
queue/ directories respectively, so duplicate mail is eventually received.

Is rename() broken on SunOS 5.5?  Would it help if I #undef HAVE_RENAME and let
the program fall back to link() and unlink() instead?  I think I'll try that
anyway.
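
For reference, the two ways of claiming a file discussed here can be sketched
as below.  This is a minimal illustration of the general technique and not
ZMailer's actual code: the function names claim_by_rename(), claim_by_link()
and make_claim_name() are hypothetical, and the sketch assumes the message
file is a bare filename in the current directory.  Both variants rely on the
kernel letting only one process succeed in removing the original name.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build the "<inode>-<pid>" name for `path` (hypothetical helper).
 * Returns 0 on success, 1 if `path` is already gone, -1 on other errors. */
static int make_claim_name(const char *path, char *claimed, size_t size)
{
    struct stat st;
    int n;

    if (stat(path, &st) < 0)
        return errno == ENOENT ? 1 : -1;
    n = snprintf(claimed, size, "%lu-%ld",
                 (unsigned long)st.st_ino, (long)getpid());
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Claim with rename(): if two processes race, the loser gets ENOENT.
 * Returns 0 if claimed, 1 if another process got there first, -1 on error. */
int claim_by_rename(const char *path, char *claimed, size_t size)
{
    int rc = make_claim_name(path, claimed, size);

    if (rc != 0)
        return rc;
    if (rename(path, claimed) < 0)
        return errno == ENOENT ? 1 : -1;
    return 0;
}

/* Claim with link() + unlink(): both racers may create their own link,
 * but only one unlink() of the original name can succeed. */
int claim_by_link(const char *path, char *claimed, size_t size)
{
    int rc = make_claim_name(path, claimed, size);
    int saved;

    if (rc != 0)
        return rc;
    if (link(path, claimed) < 0)
        return errno == ENOENT ? 1 : -1;
    if (unlink(path) < 0) {
        saved = errno;       /* keep the reason before cleaning up */
        unlink(claimed);     /* lost the race: drop our extra link */
        return saved == ENOENT ? 1 : -1;
    }
    return 0;
}
```

A caller would treat a return value of 1 as "someone else has this message"
and simply move on to the next file in the directory.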

BTW, it seems the version has been upgraded to 2.99.43b.  Mea, have you
dealt with this problem somewhere between .39 and .43b?  If so, I'd rather try
the new version.  (Actually, I was quite happy with .38's stability
except for this annoying bug! :( )

> 
> > Moreover, the IDENT module in the 2.99.38 smtpserver is not working.  The logfile
> > always shows "connection from UNKNOWN@xxxx ..." even though the connecting
> > hosts have "in.identd" enabled.  In contrast, the 2.99.26 smtpserver has no such
> > problem.  I notice some function names changed from "libauth" to "libident"
> > somewhere between #26 and #38.  Is that the problem?  I also tried the
> > #34 smtpserver and found that it does not work either.
> 
> 	Ah, that was a tough nut to crack.
> 
> 	At  smtpserver there are TWO calls of:
> 		setrfc1413ident(0);
> 	change the SECOND one to:
> 		setrfc1413ident(msgfd);

Problem solved, Thanks!
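
A note on why the descriptor argument could matter: an RFC 1413 query
consists of the two TCP port numbers of the connection being asked about, and
those are normally read from the connected socket itself.  The sketch below
shows that step in isolation.  It is not taken from the ZMailer source, the
names ident_ports_from_fd() and format_ident_query() are hypothetical, and it
only assumes (without having checked) that setrfc1413ident() uses the
descriptor it is given in a similar way.  IPv4 only.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Read the local and remote TCP ports of a connected IPv4 socket.
 * Returns 0 on success, -1 if fd is not a connected IPv4 socket. */
int ident_ports_from_fd(int fd, unsigned *local_port, unsigned *remote_port)
{
    struct sockaddr_in local, remote;
    socklen_t len;

    len = sizeof local;
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return -1;
    len = sizeof remote;
    if (getpeername(fd, (struct sockaddr *)&remote, &len) < 0)
        return -1;
    if (local.sin_family != AF_INET || remote.sin_family != AF_INET)
        return -1;
    *local_port = ntohs(local.sin_port);
    *remote_port = ntohs(remote.sin_port);
    return 0;
}

/* Format the RFC 1413 query line: the port on the host running identd
 * (the remote end, as seen from here) comes first, then our own port.
 * Returns the length written, or -1 if the buffer is too small. */
int format_ident_query(char *buf, size_t size,
                       unsigned remote_port, unsigned local_port)
{
    int n = snprintf(buf, size, "%u , %u\r\n", remote_port, local_port);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

If the descriptor handed in is not the connected socket, the port lookup
fails and no query can be built, which would be consistent with every
connection being logged as UNKNOWN.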

> 
> > =======================================================================
> > Lai Yiu Fai                       |  Tel.:       (852) 2358-6202
> >  & Telecommunications             |  E-mail:     ccyflai@uxmail.ust.hk
> 
> 	/Matti Aarnio <mea@utu.fi> <mea@nic.funet.fi>
>