Re: Strange hanging of zmailer



On Wed, Jan 12, 2000 at 05:58:39PM +0100, Andras Micsik wrote:
> Hi, 
> 
> We are using zmailer 2.99.51-pre2 with Solaris 7. Generally it forwards
> messages almost instantly, but with some hosts (mostly with a NextStep)
> very often mails get stuck for an hour on the server. I found a strange
> situation, which may be the cause of this. zmailer regularly hangs for
> some minutes while sending a message. Here is a network trace of what
> happens:

  This trace shows a quite normal half-duplex exchange.  The amount of
  data in the transmission is fairly small, and it goes out in one write(2).
  The dot that ends the data receives its '250' reply just fine.

> receiver -> sender       SMTP R port=44756
>       sender -> receiver SMTP C port=44723 MAIL From:<micsik@sender
> receiver -> sender       SMTP R port=44723
> receiver -> sender       SMTP R port=44723 250 <micsik@sender
>       sender -> receiver SMTP C port=44723 RCPT To:<micsik@receiver
> receiver -> sender       SMTP R port=44723 250 <micsik@receiver
>       sender -> receiver SMTP C port=44723 DATA\r\n
> receiver -> sender       SMTP R port=44723 354 Enter mail, end
>       sender -> receiver SMTP C port=44723 Received: (from loca
> receiver -> sender       SMTP R port=44723 250 Ok\r\n
>       sender -> receiver SMTP C port=44723
> 
> ... and then nothing happens for several minutes.

  Quite so.  The TA process has moved into the IDLE pool of the thread
  group, and it keeps the connection open for some minutes.  If nothing
  arrives for sending within that time, the connection is closed.  The TA
  process may also be moved to another thread, in which case the TA closes
  the previous connection and opens a new one.
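
  The idle behaviour described above can be sketched roughly as follows.
  This is only an illustrative model in Python, not ZMailer code: the
  class name IdleChannel, its methods, and the 300-second timeout used in
  the usage note are all hypothetical.

```python
import time


class IdleChannel:
    """Hypothetical model of a TA connection that stays open while idle."""

    def __init__(self, host, idle_timeout, now=None):
        self.host = host
        # Seconds to keep an unused connection open (assumed value,
        # not a ZMailer default).
        self.idle_timeout = idle_timeout
        self.last_used = time.monotonic() if now is None else now
        self.is_open = True

    def send(self, host, now):
        """Record a delivery; return True if a new connection was needed."""
        reopened = False
        if not self.is_open or host != self.host:
            # Either the idle timer closed the connection, or the process
            # was reassigned to another destination: drop old, open new.
            self.is_open = True
            self.host = host
            reopened = True
        self.last_used = now
        return reopened

    def tick(self, now):
        """Close the connection once it has been idle for idle_timeout."""
        if self.is_open and now - self.last_used >= self.idle_timeout:
            self.is_open = False
        return self.is_open
```

  With IdleChannel("receiver", 300, now=0), a tick() before 300 idle
  seconds leaves the connection open, and the first send() after the
  timeout reports that a new connection had to be opened.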

> The scheduler logs:
> 
> 20000112172711 DBGdiag: # smtpclient:20935: bytesleft: 1
> 20000112172711 DBGdiag: # smtpclient:20935: Premature EOF in Z/141725-219!
> 20000112172711 DBGdiag: # smtpclient:20935: Truncated or illegal control
> file "Z/141725-219"!
> Resyncing file "Z/141725-219" (ino=141725) (of=1 ho='receiver') ..
> resynced!

  This is something else.  It may be an indication that the POSTOFFICE
  spool area became 100% full at some point, and that a write to the TA
  control file then failed, leaving the file incomplete.
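
  One way to watch for the spool filling up is a small check like the
  sketch below.  It is only an illustration: the function names and the
  95% threshold are assumptions, and the spool path has to be supplied by
  the caller, since the POSTOFFICE location depends on the installation.

```python
import shutil


def percent_used(total, free):
    """Return the used share of a filesystem as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - free) / total


def spool_is_nearly_full(path, threshold=95.0):
    """Report whether the filesystem holding `path` is at or above threshold.

    `path` should point at the spool area; the 95% default is an assumed
    warning level, not a ZMailer setting.
    """
    usage = shutil.disk_usage(path)
    return percent_used(usage.total, usage.free) >= threshold
```

  Run periodically against the spool directory, spool_is_nearly_full()
  gives an early warning before writes start to fail.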

> What can I do against this?
> -----------------------------------------------------------
>   Andras Micsik              micsik@sztaki.hu
>   MTA SZTAKI Hungary         http://www.sztaki.hu/~micsik

-- 
/Matti Aarnio	<mea@nic.funet.fi>