Re: Reading envelope in process.cf



On Tue, May 06, 2003 at 11:38:07PM +0200, Marek Kowal wrote:
> Hi there,
> 
> I've been playing around with zmsh, namely I tried to read in the envelope
> part of the message within process() code (to find out all recipients of
> the message). So I typed in the following code in process.cf:

  I have had problems with this particular kind of construct myself.
  For some odd reason, if the whole processing (while read ...; do ...; done)
  is wrapped into a subroutine, and that routine is called with I/O
  redirection, those problems disappear.

  Also, do remember to declare local variables ;-)
  (And you could do it all with one  ssift  instead of  case.)

    dom=$(readenvelopes < $file)
 
    sub readenvelopes {
        local line, dom, rcptuser, rcpthost

	dom=()

>       while read line; do
>             case "$line" in
>               env-end)
> #               echo "The end!"
>                 break
>                 ;;
>               "to <"*)
> #               echo "T: $line";
>                 ssift "$line" in
>                 (.+)@(.+)>
>                   rcptuser="\1"
>                   rcpthost="\2"
> #                 echo "host: $rcpthost"
>                   lappend dom $rcpthost
>                   ;;
>                 tfiss
>                 ;;
>               *)
> #               echo "L: $line";
>                 ;;
>              esac
        done 
	returns $dom
    }
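
For readers without a zmsh at hand, the intent of the subroutine above can be
sketched in Python. This is only an illustration of the logic, not ZMailer
code: the function name read_envelope_hosts and the sample envelope lines are
made up, and the regular expression merely mirrors the ssift pattern.

```python
import io
import re

# Mirrors the ssift pattern "(.+)@(.+)>" used in the zmsh code above.
RCPT_RE = re.compile(r"(.+)@(.+)>")

def read_envelope_hosts(stream):
    """Collect recipient hosts from 'to <user@host>' lines, stopping at env-end.

    Hypothetical helper for illustration; it reads one line at a time,
    like "while read line", and stops as soon as env-end is seen.
    """
    hosts = []
    for line in stream:
        line = line.rstrip("\n")
        if line == "env-end":
            break                        # end of envelope, ignore the rest
        if line.startswith("to <"):
            m = RCPT_RE.search(line)
            if m:
                hosts.append(m.group(2)) # host part, like rcpthost="\2"
    return hosts

# Made-up sample envelope, only to show the call.
sample = io.StringIO(
    "from <sender@example.org>\n"
    "to <alice@example.com>\n"
    "to <bob@example.net>\n"
    "env-end\n"
    "to <ignored@example.invalid>\n"
)
print(read_envelope_hosts(sample))
```

Wrapping the loop in a function and handing it an already opened stream
corresponds to calling the zmsh subroutine with  < $file  redirection.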
 
> This code reads the file with name in $file and for each line it tries to
> locate strings starting with "to <......>". For each one it adds the
> recipient to the recipients' list ($dom) with lappend command. It breaks
> at the end of the envelope (env-end marker).
> 
> Unfortunately, after processing several thousand messages this code broke.
> All router processes spit out regularly the following message: 
> 
> [19654] router[19654] realloc(4282384384): virtual memory exceeded, sleeping
> 
> 
> and of course the router queue is not processed. I've played with gdb
> and found out that the "read line" part of the code is responsible (within
> sh_read() function). The "read" tries to allocate about 4GB of RAM, no
> wonder it fails to do so. So there is probably some bug in zmsh with respect
> to this code, since the message (even whole) is obviously much smaller
> (max 4MB, actually). 

bc
4282384384-2^32
-12582912
scale=3
(4282384384-2^32)/1024/1024
-12.000

Odd, negative 12 megabytes .. 

I have reason to believe that the initial allocation size has been 8192-24
bytes, and that the buffer size has since then been doubled...  which means:

4282384384/(8192-24)       
524288.000

524288/2^19
1.000

It has been doubled 19 times.  Last two sizes have thus been:

1070596096  (~ 1G)  (this may or may not succeed in most systems)
2141192192  (~ 2G)  (amazing that this size has succeeded !)
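
The bc arithmetic above can be re-checked with a short Python sketch. It rests
on the same guesses as the text: an initial size of 8192-24 bytes that is
doubled each time. Reading the reported size as a signed 32-bit number is an
additional assumption about the platform; the helper names are made up.

```python
INITIAL = 8192 - 24        # guessed initial buffer size, from the text above
REPORTED = 4282384384      # size printed in the realloc() error message

def doubling_sizes(initial, count):
    """Return the buffer sizes after 1..count doublings of `initial`."""
    return [initial << n for n in range(1, count + 1)]

def as_signed32(value):
    """Reinterpret `value` as a signed 32-bit integer (two's complement)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value

sizes = doubling_sizes(INITIAL, 19)
print(sizes[-3:])                               # 17th, 18th and 19th doubling
print(sizes[-1] == REPORTED)                    # does the guess match the log?
print(as_signed32(REPORTED))                    # reported size as signed 32-bit
print(as_signed32(REPORTED) // (1024 * 1024))   # the same in megabytes
```

If the guess holds, the 19th doubling lands exactly on the reported size, and
the same bit pattern read as a signed 32-bit integer is the negative 12
megabytes computed above.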

Something is evidently wrong in  libsh/builtins.c: sh_read()
It is also rather difficult code to understand, which speaks for the
need to rewrite it in an understandable and bug-free manner..

> Haven't been able to locate the message in question, though. After
> restarting the router everything went on smoothly, and router processed all
> queued messages without interruption. Tricky.
> 
> Anyway, it occurred to me that maybe such a loop is not an effective way to
> read in all the recipients from the envelope. Despite the fact that it
> breaks, I guess such construct "while read line; do ...; done < $file" is
> not a very efficient way of handling the file IO, it probably reads the
> whole file and allocates a lot of memory to read the file in one go, even
> though I call "break" immediately after the envelope's end. 

I am not sure of that.  Possibly not.
There are layers upon layers of I/O redirections, and wrappings of
familiar-looking functions.  Sometimes I think that is a Bad Thing,
other times a Good Thing...

> Is there any other way to get the recipients' addresses _before_ I call
> rfc822() function? This code is executed in process() before calling the
> rfc822(), so all I have is $file with file name to process. Since it is
> called for every message, no forks can be involved, so calling an external
> program is out of the question. I could write my own built-in function, but
> maybe there is some easier way...

No.  router/rfc822.c: rfc822() -> sequencer()  is what opens the file
and reads it.  It then processes the source and recipient addresses
one at a time.

Look for this in the source file:

		for (a = h->h_contents.a; a != NULL; a = a->a_next) {
/*ROUTER*/		l = router(a, def_uid, "recipient", senderstr);
			if (l == NULL)
				continue;

(That router() function is defined in  router/shliaise.c  file.)

That call will in the end use   standard.cf's  router()  script
function.

You may, possibly, want to create some pre-route entry point, which
in your case does the actual work, but in most setups is just a no-op.


> Any ideas?
> 
> Thanks,
> Marek

-- 
/Matti Aarnio	<mea@nic.funet.fi>