
Re: spamassassin



Hi Mariano,

Ah, I see your predicament. Extra disk IO is not desired. I suppose the
only time you'd want to invoke a tagger would be in a pipe as the message
comes in on disk (i.e. $POSTOFFICE/public) for the first time, long before
the current contentfilter would be invoked.

[ what happens in my filter is that bad messages get rewritten to another
directory and then the original is deleted - not good for high volume... ]

In my experience, spamassassin typically takes about 1 sec/message,
so a single CPU would be down to about 86400 messages/day instead of 2M -
you'd need about 24 CPUs to handle that kind of transactional load. On
really fast machines you might get that cut in half, so you'd still need on
the order of 10 or so CPUs. [ I guess you'd need a "fast" version of
spamassassin ]
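
[ For anyone who wants to redo that back-of-envelope estimate with their own
numbers, here is a rough sketch. It assumes each CPU scans one message at a
time and that traffic is spread evenly over the day (real traffic has peaks,
so treat the result as a lower bound). The function name cpus_needed is just
for illustration. ]

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def cpus_needed(messages_per_day, seconds_per_message):
    """Estimate CPUs needed to scan a daily volume.

    Assumption: one scan at a time per CPU, load spread evenly over the day.
    """
    # How many messages one CPU can scan in a day at this speed.
    per_cpu_per_day = SECONDS_PER_DAY / seconds_per_message
    # Round up: a fraction of a CPU still means one more CPU.
    return math.ceil(messages_per_day / per_cpu_per_day)


if __name__ == "__main__":
    print(cpus_needed(2_000_000, 1.0))  # about 1 sec/message
    print(cpus_needed(2_000_000, 0.5))  # scan time cut in half
```

[ At 1 sec/message one CPU tops out at 86400 messages/day, which is where the
figure of about 24 CPUs for 2M messages/day comes from. ]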

-Jim

On Fri, 7 Nov 2003, Mariano Absatz wrote:

> Yes, I can "return ok;", but how & where do I tag?... the message passed 
> thru contentfilter, AFAIK, doesn't get back to the spool, it's just a 
> copy, or not?
> 
> On 6 Nov 2003 at 14:59, James MacKinnon wrote:
> 
> > 
> > well, it would be easy to modify Eugene's contentfilter.c to
> > always exit with:
> > 
> > 	return ok;
> > 
> > 
> > ( rather than return begone; )
> > 
> > It sounds as if you just want a stub program there to call spamd
> > and rewrite the headers with the 'X-Spam' info, and exit. If you always
> > exit(0) out of your stub, you won't be rejecting anything :-)
> > 
> > Cheers,
> > -Jim
> > 
> > On Thu, 6 Nov 2003, Mariano Absatz wrote:
> > 
> > > Yes... I thought about using Eugene's newer zmscanner (which I'll 
> > > probably use for antivirus w/clam-av), but, in this ISP environment I 
> > > can't afford to reject legit messages which show up as SpamAssassin false 
> > > positives, that is why I have to tag and deliver.
> > > 
> > > smtpserver's contentfilter doesn't let me do that.
> > > 
> > > OTOH, I'm talking about 2M messages/day, rather than 100k msgs/week... I 
> > > can (and will) throw hardware at it, but not 10 dual xeon... :-)
> > > 
> > > On 6 Nov 2003 at 14:20, James MacKinnon wrote:
> > > 
> > > > Hi Mariano,
> > > > 
> > > > I wouldn't try to do it via process.cf
> > > > 
> > > > Eugene at one time wrote a little insert for smtpserver called
> > > > 'lean-mean-contentfilter' (in contrib/ in the source).
> > > > 
> > > > I modified that a bit and use it directly in the smtpserver
> > > > (so spamd is invoked inline during initial message receipt, and the
> > > > result flags the contentfilter mechanism to accept/reject during
> > > > the SMTP chat). 
> > > > 
> > > > zmailer already has hooks set up for a 'contentfilter'
> > > > 
> > > > I have, in smtpserver.conf:
> > > > 
> > > > # External program for received message content analysis:
> > > > # my custom contentfilter is a hook to $MAILBIN/spamc.sh
> > > > PARAM  contentfilter    $MAILBIN/contentfilter
> > > > 
> > > > 
> > > > contentfilter really just passes a filename off to a shell script
> > > > which then invokes spamd. The script's exit status carries the result
> > > > back to contentfilter (which expects exit values 0 and -1).
> > > > 
> > > > Typical smtpserver log entry on a spam transaction:
> > > > 
> > > > ...
> > > > NFww30942r      MAIL From:<y5clzkuow@hotmail.com> SIZE=2867
> > > > NFww30942w      250 2.1.0 Sender syntax Ok
> > > > NFww30942#      -- pipeline input exists 6 bytes
> > > > NFww30942r      RCPT To:<vf889r6@phys.ualberta.ca>
> > > > NFww30942#      test-rcpt-dns-rbl test; rblmsg='<none>'
> > > > NFww30942w      250 2.1.5 Ok; can accomodate 2867 byte message for 
> > > >                 <vf889r6@phys.ualberta.ca>
> > > > NFww30942r      DATA
> > > > NFww30942w      354 Start mail input; end with <CRLF>.<CRLF>
> > > > NFww30942#      policyprogram said: -1 550 5.7.1 Content Policy rejection 
> > > >                 - not acceptable content
> > > > NFww30942#      Content-policy analysis ordered message rejection. 
> > > >                 (code=-1); msg='550 5.7.1 Content Policy rejection - not 
> > > >                 acceptable content'
> > > > NFww30942w      550 5.7.1 Content Policy rejection - not acceptable 
> > > >                 content
> > > > NFww30942r      QUIT
> > > > NFww30942w      221 2.0.0 relay.phys.ualberta.ca Out
> > > > 
> > > > 
> > > > It's been working very well now for just over a year here. My total 
> > > > smtp transaction volume is on the order of 100000 per week (I guess
> > > > that might be considered small), but the machine rarely sees a load
> > > > average greater than 0.5
> > > > 
> > > > You could very easily put your idea into place using the
> > > > existing contentfilter mechanism, and set it up to just tag
> > > > rather than reject.
> > > > 
> > > > Cheers,
> > > > -Jim
> > > > 
> > > > 
> > > > On Thu, 6 Nov 2003, Mariano Absatz wrote:
> > > > 
> > > > > I know, I know... every 2 months someone (like me) comes to the list 
> > > > > asking how to integrate spamassassin with zmailer...
> > > > > 
> > > > > I also know what Eugene will say: "spamassassin is waaaaay too slow to 
> > > > > handle any real traffic" :-)
> > > > > 
> > > > > However, I'm being asked to do AntiSpam tagging (not deleting) for a 
> > > > > relatively high volume ISP, and the only open tool I know is 
> > > > > spamassassin...
> > > > > 
> > > > > Situation is, I'm on a border smtp gateway with no users in it, just 
> > > > > accept, tag, and deliver.
> > > > > 
> > > > > I don't like the procmail approach... Eugene once said he did it in 
> > > > > cf/process.cf ( http://www.zmailer.org/mhalist/2003/msg00166.html ).
> > > > > 
> > > > > How would that be done?
> > > > > 
> > > > > I don't know how to zmsh, but I could write a small C filter that reads a 
> > > > > queue file from stdin, calls spamd using libspamc passing the original 
> > > > > message (sans envelope) and, based on what spamd answers, adds a couple 
> > > > > of headers before writing it (with envelope) to standard output...
> > > > > 
> > > > > That should be invoked (if I understand correctly) just before calling 
> > > > > the rfc822 function... but can this be done without an intermediate file?
> > > > > 
> > > > > Otherwise, could it look like this (on cf/process.cf)?
> > > > > ========================<CUT>=============================
> > > > >         case "$file" in
> > > > > #       [0-9]*.x400)    x400 "$file" ;;
> > > > > #       [0-9]*.uucp)    uucpfilter "$file" > /tmp/X.$$
> > > > > #                       cat /tmp/X.$$ > "$file"
> > > > > #                       rfc822 "$file" ;;
> > > > >         [0-9]*)         /usr/local/bin/CheckSpam "$file" > chkspm."$file"
> > > > >                         /bin/rm -f "$file"
> > > > >                         rfc822 chkspm."$file" ;;
> > > > >         core*)          /bin/mv "$file" ../$file.router.$$
> > > > >                         return
> > > > >                         ;;
> > > > >         *)              /bin/mv "$file" ../postman/rtr."$file".$$
> > > > >                         return
> > > > >                         ;;
> > > > >         esac
> > > > > ========================<CUT>=============================
> > > > > 
> > > > > or maybe:
> > > > > ========================<CUT>=============================
> > > > >         case "$file" in
> > > > > #       [0-9]*.x400)    x400 "$file" ;;
> > > > > #       [0-9]*.uucp)    uucpfilter "$file" > /tmp/X.$$
> > > > > #                       cat /tmp/X.$$ > "$file"
> > > > > #                       rfc822 "$file" ;;
> > > > >         [0-9]*)         /usr/local/bin/CheckSpam "$file" > chkspm."$file"
> > > > >                         /bin/mv chkspm."$file" "$file"
> > > > >                         rfc822 "$file" ;;
> > > > >         core*)          /bin/mv "$file" ../$file.router.$$
> > > > >                         return
> > > > >                         ;;
> > > > >         *)              /bin/mv "$file" ../postman/rtr."$file".$$
> > > > >                         return
> > > > >                         ;;
> > > > >         esac
> > > > > ========================<CUT>=============================
> > > > > 
> > > > > is this correct?
> > > > > is it less inefficient than other methods?
> > > > > 
> > > > > TIA.
> > > > > 
> > > > > --
> > > > > Mariano Absatz
> > > > > El Baby
> > > > > ----------------------------------------------------------
> > > > > There's too much blood in my caffeine system.
> > > > > 
> > > > > 
> > > > > -
> > > > > To unsubscribe from this list: send the line "unsubscribe zmailer" in
> > > > > the body of a message to majordomo@nic.funet.fi
> > > > > 
> > > > 
> > > > -- 
> > > > James S. MacKinnon           Office: P-139 Avadh-Bhatia Physics Lab
> > > > Team Physics                 Voice : (780) 492-8226 [old AC 403]
> > > > University of Alberta        email : Jim.MacKinnon@Phys.UAlberta.CA
> > > > Edmonton, Canada T6G 2N5     WWW   : http://www.phys.ualberta.ca/
> > > > 
> > > > char*f="char*f=%c%s%c;main(){printf(f,34,f,34,10);}%c";main(){printf(f,34,f,34,10);}
> > > > for all that we know the universe could cease to exist at any mo
> > > 
> > > 
> > > --
> > > Mariano Absatz
> > > El Baby
> > > ----------------------------------------------------------
> > > Too much of a good thing can be wonderful.
> > >       -- Mae West
> > > 
> > > 
> > 
> 
> 
> --
> Mariano Absatz
> El Baby
> ----------------------------------------------------------
> Beware of programmers with screwdrivers.
> 
> 

-- 
James S. MacKinnon           Office: P-139 Avadh-Bhatia Physics Lab
Team Physics                 Voice : (780) 492-8226 [old AC 403]
University of Alberta        email : Jim.MacKinnon@Phys.UAlberta.CA
Edmonton, Canada T6G 2N5     WWW   : http://www.phys.ualberta.ca/

char*f="char*f=%c%s%c;main(){printf(f,34,f,34,10);}%c";main(){printf(f,34,f,34,10);}
for all that we know the universe could cease to exist at any mo