
Re: ldap problem



Hi there...

No one answered this, and I'm in real trouble :-(

Despite the strange setup, there are two problems:

1) Whenever the ldap server fails, the next search router attempts 
fails too, and router _STAYS_ that way... that is, it doesn't seem to try 
to disconnect and reconnect.

As you may already know, I'm rather code-blind, so I can't quite follow 
the logic behind router/libdb/ldap.c ... I'd like to modify it so that, 
when a search fails, it frees the resources and, the next time a search 
is requested, it tries to connect and bind from scratch...

The only solution so far has been to manually restart router, and I don't 
like to rely on that.

Can anyone shed a bit of light? (I'm running the latest CVS, from late 
November).
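
To make the behaviour I'm after concrete, here is a minimal sketch of the 
logic, written in Python only as a model. It is NOT ZMailer code and does 
not use any real ldap library: LazyLdapLookup, FakeServer and FakeConn are 
hypothetical names invented for the illustration, and "connect" stands for 
whatever opens the connection and binds. The idea is simply: connect on 
first use, and on any search failure free the handle so that the next 
request connects and binds from scratch.

```python
class LazyLdapLookup:
    """Connect on first use; after a failed search, forget the handle
    so the next search reconnects and rebinds from scratch."""

    def __init__(self, connect):
        # `connect` is a hypothetical callable that opens AND binds a
        # connection, returning an object with search(key) and close().
        self._connect = connect
        self._conn = None

    def search(self, key):
        if self._conn is None:
            # May raise if the server is still down; _conn stays None.
            self._conn = self._connect()
        try:
            return self._conn.search(key)
        except Exception:
            # Free the resources and drop the stale handle, then let the
            # caller see the error (so the message is deferred this time).
            conn, self._conn = self._conn, None
            try:
                conn.close()
            except Exception:
                pass
            raise


class FakeConn:
    """Stand-in for a bound connection (illustration only)."""

    def __init__(self, server):
        self.server = server
        self.closed = False

    def search(self, key):
        if not self.server.up:
            raise ConnectionError("search failed: server down")
        return [key]

    def close(self):
        self.closed = True


class FakeServer:
    """Stand-in for the ldap server; counts successful binds."""

    def __init__(self):
        self.up = True
        self.binds = 0

    def connect(self):
        if not self.up:
            raise ConnectionError("connect failed: server down")
        self.binds += 1
        return FakeConn(self)
```

With this shape, a lookup made while the server is down still fails (and 
the message would still be deferred), but the first lookup after the 
server comes back succeeds without restarting anything.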

2) The second problem is related to the hold channel. When router fails 
(e.g. because ldap is unavailable), the messages passing thru all go to 
the 'hold/io:error' channel... even when the ldap service comes back on 
line, as router can't seem to reconnect, it keeps feeding the 
hold/io:error channel... and it never drains.

But even if I _do_ restart router, the messages never seem to even try to 
get out of hold/io:error

I tried running
manual-rerouter -c hold -h io:error
and also restarting the scheduler, but everything stays stuck in there.

Why does this happen? How can I prevent it? Or at least remedy it once it 
happens?
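
For reference, 'man hold' (quoted in my original message below) says "io 
succeeds 10% of the time". Here is a small worked check of what that would 
imply, in Python. It rests on two assumptions of mine: that the 10% is an 
independent chance on every attempt, and that the scheduler really retries 
once per interval (30 seconds in my hold/* entry). The function names are 
just for this illustration.

```python
def attempts_in(seconds, interval=30):
    """Number of retries that fit in `seconds` at one retry per interval."""
    return seconds // interval


def chance_released(attempts, p_success=0.10):
    """Probability that at least one of `attempts` independent tries
    succeeds, each with probability p_success."""
    return 1.0 - (1.0 - p_success) ** attempts


# 3 minutes at 30 s per retry is 6 attempts.
print(attempts_in(180), round(chance_released(attempts_in(180)), 3))

# The last syslog line below shows delay=01:09:40, i.e. 4180 seconds.
print(attempts_in(4180), chance_released(attempts_in(4180)) > 0.9999)
```

So, under those assumptions, 3 minutes alone gives a bit under an even 
chance (about 47%) of at least one release, but after more than an hour 
of 30-second retries it should be practically certain. That makes me 
think it is not just bad luck.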

I'd really appreciate any hint about this.

Regards.


On 12 Feb 2004 at 20:15, Mariano Absatz wrote:

> Hi Matti,
> 
> I have a problem that I think is related to the way router handles ldap 
> database relations.
> 
> I created a new router 'protocol' to handle all local deliveries thru 
> ldap. The point is that local 'usernames' include the '@', so I use ldap 
> to know if the domain is local, and, if it is, then hand the message to 
> maildrop ( http://www.flounder.net/~mrsam/maildrop/ ) using the full 
> address so that it, in turn, finds out where to store the message using a 
> second ldap query.
> 
> The complete circuit, when everything is fine, works OK. The problem 
> arises when the ldap server goes down.
> 
> I'll show you how I configured things: I'm using ZMailer 2.99.56 
> patch1pre taken from CVS on 2003-11-30 (I think there were no changes 
> since then), running on RedHat Linux ES 2.1 (similar to 7.2). Maildrop is 
> version 1.6.3.
> 
> First I create an ldap database creating a file $MAILVAR/db/maildrop.ldap 
> with the following content:
> ##########################################################################
> #
>  #$MAILVAR/db/maildrop.ldap
>  base     ou=Servicio mail domain,ou=Servicios,o=Pert Consultores
>  ldaphost ldap.pert.com.ar
>  ldapport 389
>  binddn   cn=admin,ou=Servicios,o=Pert Consultores
>  passwd   secret
>  filter   (&(objectclass=pertMailDomain)(pertMailDomainRouting=local)(associatedDomain=%s))
>  attr     associatedDomain
>  scope    sub
> ##########################################################################
> #
> Note: the filter is all in one line.
> 
> Then I create a maildrop initialization file in $MAILVAR/cf/i-maildrop.cf 
> creating the relation:
> ##########################################################################
> #
>  provide maildropldap
>  if [ -f $MAILVAR/db/maildropldap.zmsh ]; then
>      . $MAILVAR/db/maildropldap.zmsh
>  else
>      if [ -f $MAILVAR/db/maildrop.ldap ]; then
>      relation -lmt ldap -s 2000 -e 20 \
>           -f $MAILVAR/db/maildrop.ldap maildropldap
>      else
>      fqdnaliasesldap () { return 1 }
>      fi
>  fi
> ##########################################################################
> #
> 
> And then, the maildrop processing file in $MAILVAR/cf/p-maildrop.cf
> ##########################################################################
> #
>  require maildropldap crossbar
>  provide maildrop
>  maildrop_neighbour (domain, address, A) {
>      local associateddomain
>      associateddomain=$(maildropldap "$domain") &&
>          return (((maildrop "$domain" "$address" $A)))
>      return 1
>  }
> ##########################################################################
> #
> (I don't actually _use_ the associatedDomain I get, I only use the _fact_ 
> that I found it so I can return the 'maildrop' channel and avoid further 
> domain processing).
> 
> Now I create the maildrop channel in $MAILVAR/scheduler.conf like:
> ##########################################################################
> #
>  maildrop/*
>         interval=5m
>         idlemax=9m
>         expiry=3d
>         maxchannel=15
>         maxring=5
>         command="sm -8c $channel maildrop"
> ##########################################################################
> #
> 
> and an entry in $MAILVAR/sm.conf to actually call maildrop:
> ##########################################################################
> #
>  maildrop  SsPO  /app/maildrop/bin/maildrop   maildrop -d $u -w 90
> ##########################################################################
> #
> (the -d option followed by the username sets maildrop into delivery mode, 
> it is configured to search the username in the ldap directory).
> 
> I finally edit the protocols line in $MAILVAR/router.cf like:
> ##########################################################################
> #
>  protocols=('maildrop' 'routes' 'smtp')
> ##########################################################################
> #
> 
> Now, everything works fine, router does its ldap query and if the host 
> part is in there, it sets the channel to maildrop, and maildrop delivers 
> the message OK.
> 
> But... if, after a while, I stop the ldap server, when a new message 
> arrives, it is deferred... forever. It stays in the 'hold/io:error' 
> channel and never seems to try to get out of there.
> 
> Here's a normal delivery in the router log:
> ##########################################################################
> #
> [14247] <S1387216AbUBLV4d/20040212215633Z+1@mx2.pert.com.ar>: file: 
> 1387216 <baby@baby.com.ar> =>  <test1@pert.com.ar>
> [14247] <S1387216AbUBLV4d/20040212215633Z+1@mx2.pert.com.ar>: address: 
> test1@pert.com.ar
> [14247] <S1387216AbUBLV4d/20040212215633Z+1@mx2.pert.com.ar>: infopert: 
> 2004:02:12:18:56:33 
> FROM:baby@baby.com.ar,TO:test1@pert.com.ar,DELIVER:[[maildrop,pert.com.ar,
> test1@pert.com.ar]],Sz:72,BdSz:37,TS:1076622993,Dy:0,Rcv:fe1.pert.com.ar 
> ([10.6.1.110]:54483 "helo che")
> ##########################################################################
> #
> 
> (the 'infopert:' line was added by me in crossbar.cf)
> 
> syslog shows the following:
> ##########################################################################
> #
> Feb 12 18:56:33 fe2 router[14247]: S1387216AbUBLV4d: 
> from=<baby@baby.com.ar>, rrelay=fe1.pert.com.ar ([10.6.1.110]:54483 "helo 
> che"), size=443, nrcpts=1, 
> msgid=<S1387216AbUBLV4d/20040212215633Z+1@mx2.pert.com.ar>, 
> delay=00:00:00, xdelay=00:00:00
> Feb 12 18:56:33 fe2 sm[14254]: S1387216AbUBLV4d: to=<test1@pert.com.ar>, 
> delay=00:00:00, xdelay=00:00:00, mailer=maildrop, stat=ok3 
> ##########################################################################
> #
> 
> now I stop ldap server and then send another message (to the same 
> address), here's the router log:
> ##########################################################################
> #
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: address: 
> baby@baby.com.ar
> [14247] search_ldap: ldap_search_s error!
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: deferred: 
> IO:error: baby@baby.com.ar
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: file: 
> 1387216 <baby@baby.com.ar> =>  <test1@pert.com.ar>
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: address: 
> test1@pert.com.ar
> [14247] search_ldap: ldap_search_s error!
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: deferred: 
> IO:error: test1@pert.com.ar
> [14247] <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>: infopert: 
> 2004:02:12:18:59:12 FROM:baby@baby.com.ar,TO:test1@mill
> ic.com.ar,DELIVER:[[hold,IO:error,test1@pert.com.ar]],Sz:120,BdSz:80,TS:10
> 76623152,Dy:0,Rcv:fe1.pert.com.ar ([10.6.1.110]:54739 "helo che")
> ##########################################################################
> #
> 
> and the syslog:
> ##########################################################################
> #
> Feb 12 18:59:12 fe2 router[14247]: S1387216AbUBLV7M: 
> from=<baby@baby.com.ar>, rrelay=fe1.pert.com.ar ([10.6.1.110]:54739 "helo 
> che"), size=491, nrcpts=1, 
> msgid=<S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>, 
> delay=00:00:00, xdelay=00:00:00
> Feb 12 18:59:12 fe2 hold[14296]: S1387216AbUBLV7M: to=<deferred>, 
> delay=00:00:00, xdelay=00:00:00, mailer=hold, stat=deferred deferred
> ##########################################################################
> #
> 
> 
> router gets an error tryin' to read the directory:
>  search_ldap: ldap_search_s error!
> and defers the message thru the 'hold:IO:error' channel.
> 
> But the message stays on that channel forever, I see no activity there. I 
> shortened the retries in hold to see what happens, but nothing ever 
> changes.
> 
> I restarted router & scheduler to no avail. I stopped zmailer completely 
> and started it again and everything is the same... here's a couple more 
> syslog lines an hour after:
> ##########################################################################
> #
> Feb 12 20:08:22 fe2 hold[14714]: S1387216AbUBLV7M: to=<deferred>, 
> delay=01:09:10, xdelay=00:00:00, mailer=hold, stat=deferred deferred
> Feb 12 20:08:52 fe2 hold[14714]: S1387216AbUBLV7M: to=<deferred>, 
> delay=01:09:40, xdelay=00:00:00, mailer=hold, stat=deferred deferred
> ##########################################################################
> #
> 
> FYI, the hold channel is configured in $MAILVAR/scheduler.conf like:
> ##########################################################################
> #
> hold/*
>     #interval=5m
>     interval=30s
>     retries="1 1"
>     maxchannel=1
>     command=hold
> ##########################################################################
> #
> 
> and the message's transport control file looks like:
> ##########################################################################
> #
> @ 0x00000007
> i 1387216
> o 411
> l <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>
> e <baby@baby.com.ar>
> e baby@baby.com.ar
> s local baby@baby.com.ar baby@baby.com.ar 2
> r           hold IO:error test1@pert.com.ar 99
> N NOTIFY=FAILURE,DELAY ORCPT=rfc822;test1@pert.com.ar
>       INRCPT=rfc822;test1@pert.com.ar INFROM=rfc822;baby@baby.com.ar
> m
> Received: from fe1.pert.com.ar ([10.6.1.110]:54739 "helo che")
>         by mx2.pert.com.ar with SMTP id S1387216AbUBLV7M;
>         Thu, 12 Feb 2004 18:59:12 -0300
> From:   <baby@baby.com.ar>
> Subject: baje el ldap (entra en loop?)
> Message-Id: <S1387216AbUBLV7M/20040212215912Z+2@mx2.pert.com.ar>
> To:     unlisted-recipients:; (no To-header on input)
> Date:   Thu, 12 Feb 2004 18:59:12 -0300
> ##########################################################################
> #
> 
> Why does it never get out of 'hold'? According to 'man hold', "io succeeds 
> 10% of the time, to allow retry of temporary I/O failures"... but if it 
> is retrying every 30 secs, in 3 minutes it should have been retried.
> 
> I first thought that maybe router wasn't retrying the ldap connect/bind 
> after the failure, but completely restarting zmailer should have taken 
> care of that.
> 
> Is this an error in my configuration?
> 
> Any thoughts?
> 
> TIA

--
Mariano Absatz
El Baby
----------------------------------------------------------
Why should I care about posterity?
What's posterity ever done for me?
      -- Groucho Marx

