
Re: Too slow ?

On Thu, Mar 13, 2003 at 08:29:23AM +0100, Tomasz Nowak wrote:
> Matti Aarnio( 2003-03-13 00:52 ):
> MA> On Wed, Mar 12, 2003 at 08:25:26PM +0100, Tomasz Nowak wrote:
> >> Hi,
> >>   My problem is:
> >>     1047492376 85723-12310 2212 12 ok3 local/...
> >>   I have too many lines like this in scheduler.perflog.
> >>   How can I find the bottleneck?  Any hints?
> MA>   2212 seconds from message arrival to its routing ?
> MA>   (Key is given in  STATISTICS LOG FORMAT  of  scheduler(8) man-page.)
> MA>   That is definitely quite a lot..
>   Yes ! :-)
>   I made a report covering 24 hours:
>   Number of mails: 173728.

About 13550 messages in that case waited over one minute before
routing.  Quite a lot.
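
A count like that can be reproduced with a few lines of script.  This is
only a rough sketch in Python: it ASSUMES the third whitespace-separated
field of a scheduler.perflog line is the arrival-to-routing time in seconds
(that is how the 2212 in the sample line was read above; the authoritative
key is STATISTICS LOG FORMAT in scheduler(8)).  The function names and the
60-second threshold parameter are made up for the illustration.

```python
def routing_delay_seconds(perflog_line):
    """Return the third whitespace-separated field as an int, or None.

    ASSUMPTION: field 3 is the arrival-to-routing delay in seconds.
    Check the STATISTICS LOG FORMAT section of scheduler(8) before relying on it.
    """
    fields = perflog_line.split()
    if len(fields) < 3:
        return None
    try:
        return int(fields[2])
    except ValueError:
        return None


def count_slow(lines, threshold=60):
    """Return (slow, total): lines over the threshold, and lines that parsed."""
    slow = 0
    total = 0
    for line in lines:
        delay = routing_delay_seconds(line)
        if delay is None:
            continue  # skip lines that do not look like perflog records
        total += 1
        if delay > threshold:
            slow += 1
    return slow, total


# The one sample line quoted in this thread (its last field is elided there).
sample = ["1047492376 85723-12310 2212 12 ok3 local/..."]
slow, total = count_slow(sample)
print(slow, total)
```

Any iterable of lines works, so an open scheduler.perflog file object can be
passed to count_slow() directly.
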
>   I only get the first time from scheduler.perflog, without the
>   delivery time...  This is not good :-(
> MA>   Do you have lots of files in $POSTOFFICE/router/  directory ?
> MA>   ("mailq -ss"  will tell.)
>   No.
>   It's 8:15 now. Why is 288650.1210200 waiting?
>   I have made a small modification to the router: I changed its source
>   directory (a #define in router.c), replacing the directory "router"
>   with "clean", and I do anti-virus checking.
> -rw-r--r--    1 root     root         4676 Mar 13 08:12 288650.1210200
> -rw-r--r--    1 root     root         3903 Mar 13 08:12 288650.1210201
> -rw-r--r--    1 root     root         1411 Mar 13 08:13 288650.1210211
> -rw-r--r--    1 root     root         6097 Mar 13 08:24 288650.1210304
> -rw-r--r--    1 root     root         2481 Mar 13 08:24 288650.1210306
>   It's 8:24 now. Do you see 288650.1210200 ?


  Those file names are not what ZMailer expects.  That can (in fact
  will!) cause considerable internal trouble: with such names there is
  only one job in the queue (and in processing) at any given time, and
  even that one can be wrecked when a second message arrives with the
  same identifier while the first is still being processed.

  The router picks jobs into its internal job queue, and identifies
  each with a long integer obtained by scanning the file name for
  a number.  For all of your files that number is the same:  288650
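
  To illustrate the collision, here is a small Python sketch.  It is NOT
  the router's code; it only mimics "scan the file name for a number" by
  taking the first run of digits, which is an assumption about the
  behaviour described above.  The function name is made up.

```python
import re


def leading_job_id(filename):
    """Return the first run of decimal digits in filename as an int, or None.

    Illustration only: this mimics "scan the file name for a number",
    it is not taken from the ZMailer sources.
    """
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else None


# File names from the directory listing quoted earlier in this message.
names = [
    "288650.1210200",
    "288650.1210201",
    "288650.1210211",
    "288650.1210304",
    "288650.1210306",
]

ids = {name: leading_job_id(name) for name in names}
distinct = set(ids.values())
print(distinct)  # all five names collapse to one identifier
```

  Five distinct files, one identifier: everything after the "." is never
  looked at under this reading.
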

  Read about the Queue:

  If those file names from your virus scanner are stat(2) result
  fields   st_dev "." st_ino    I suggest you modify the output
  to be just the  st_ino.
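
  A minimal sketch of that naming scheme, in Python for brevity.  The
  helper names are hypothetical, and it assumes the scanner can rename
  its output file within the same directory (a rename on the same
  filesystem keeps the inode number on POSIX systems).

```python
import os
import tempfile


def inode_name(path):
    """Return the file's st_ino as a decimal string (no st_dev prefix)."""
    return str(os.stat(path).st_ino)


def rename_to_inode(path):
    """Rename path to its inode number in the same directory; return the new path.

    Hypothetical helper: os.rename() replaces an existing file of the same
    name on POSIX, so this assumes the directory holds only spool files.
    """
    new_path = os.path.join(os.path.dirname(path), inode_name(path))
    os.rename(path, new_path)
    return new_path


if __name__ == "__main__":
    # Demonstrate on two scratch files in a temporary directory.
    with tempfile.TemporaryDirectory() as spool:
        results = []
        for label in ("first", "second"):
            tmp = os.path.join(spool, "scan-" + label + ".tmp")
            with open(tmp, "w") as handle:
                handle.write("message body\n")
            results.append(os.path.basename(rename_to_inode(tmp)))
        print(results)  # two different all-digit names
```

  Two files that exist at the same time on one filesystem cannot share an
  inode number, so each message gets its own all-digit name.
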

  INSTEAD of having a separate input spool from which an anti-virus
  scanner runs the messages, there could be a synchronous anti-virus
  scanner run on command by the ZMailer router processes:

  That way the message names need not be changed, the queue processing
  stays in the ZMailer router, etc.  The only challenge is to have a
  scanner that accepts being run from under another process, instead of
  being a long-lived server itself.

  I don't have any experience with these scanners.  Could you supply
  me with some pointers / descriptions of how you did it?
  Possibly send it as a separate letter to the list, titled
  e.g. "How I integrated a Virus Scanner with ZMailer"?

> [...]
> MA>   In   syslogged  data, you can find items logged by  router,
> MA>   as well as transport-agents.
> MA>   There are   delay=   and   xdelay=  timestamps.
> MA>   If you see  router[...]  produced entries where  delay=  is
> MA>   high, then there is definitely something odd in message feeding
> MA>   to routing.
> Mar 13 07:59:53 ... router[14651]: S80592AbTCMG7w:
> from=<...>, rrelay=STDIN (...), size=2482, nrcpts=3,
> msgid=<20030313065934Z288641-29500+29676@...>
>   I don't have delay= and xdelay=. Why?
>   My zmailer is: zmailer-2.99.55

  Possibly because those were not there in 2.99.55?
  This bit was added in 2.99.55-patch1:

revision 1.9
date: 2001/02/23 22:15:46;  author: mea;  state: Exp;  lines: +13 -10
Router syslog() got  delay=  and  xdelay=  time markers
listexpand routine got tweaked a lot.. (for vger.kernel.org uses)

  (CVS helps, when I remember to write descriptive log messages...)

>   I tried putting the line "trace bind resolv" into router.cf, but it
>   doesn't work... Why?

  That command is meant for interactive use.  There is no sane reason
  to use it in running production scripts.  I think it writes to
  stderr, and I am not quite sure whether that gets collected into
  the router "console" log file.
  ...  having read the relevant code:  It should work.

> Pozdrawiam.
> -- 
> Tomasz Nowak     TRIGER - Systemy Komputerowe   http://www.triger.com.pl

/Matti Aarnio	<mea@nic.funet.fi>