
zmailcheck patch

I recently had a situation where our ZMailer 2.99.55 server would
repeatedly stop delivering messages after a time and need a restart to
continue.  I eventually tracked the problem down to a stuck message in
the router.  After removing the message, which was spam anyway, the
router went back to working reliably as usual.  I imagine this bug is
probably already fixed in a later ZMailer version, so this note is
not really about the message that crashed the router.

This situation did motivate me to deploy the "zmailcheck" script.
However, I found a couple of shortcomings which had to be fixed before
it could be useful here.  The first is that the script simply checks
that the .pid.router file exists and that a process is running with
the PID in the file and, if both of these are true, assumes that all
is well with the router.  I added a check to make sure that at least
2 router processes are running.  The second problem I noticed
is that, although the script verifies that the router and the
scheduler are running, it does not attempt to fix the situation if
they are not.  With the patch, the script restarts whichever of the
two is not running.  After all, if one of them is not running,
then the "ZMailer Alert" message describing the problem does not get
delivered.  The patch is attached.
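
As a rough illustration of the extra router check described above, here
is a minimal sketch in shell.  The function names count_router_procs
and router_ok are hypothetical and are not part of zmailcheck; the
minimum process count is passed as a parameter rather than hard-coded,
and $MAILBIN is assumed to point at the ZMailer binaries as it does in
the script.

```shell
#!/bin/sh
# Sketch only: count_router_procs and router_ok are hypothetical helpers,
# not functions from the real zmailcheck script.

# Count the lines in a saved "ps ax" listing that mention the router binary.
# $1 = path of the router binary, $2 = file holding the ps output
count_router_procs() {
    grep -c -F -- "$1" "$2"
}

# Succeed when at least $3 router processes appear in the listing.
router_ok() {
    [ "$(count_router_procs "$1" "$2")" -ge "$3" ]
}

# Possible use (assumes MAILBIN is set as in zmailcheck):
#   ps ax >/tmp/ps.$$
#   router_ok "$MAILBIN/router" /tmp/ps.$$ 2 || "$MAILBIN/zmailer" router
#   rm -f /tmp/ps.$$
```

Because the process listing is saved to a file before grep runs, the
grep process itself never shows up in the count.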

Roy Bixler <rcb@ucp.uchicago.edu>
The University of Chicago Press
--- zmailcheck.orig	1998-02-10 15:01:44.000000000 -0600
+++ zmailcheck	2004-08-26 09:15:21.000000000 -0500
@@ -15,14 +15,20 @@
     for i in router scheduler; do
 	if [ ! -f $POSTOFFICE/.pid.$i ]; then
 		REASON="$i dead. Pidfile $POSTOFFICE/.pid.$i missing"
+		$MAILBIN/zmailer $i
 		X=`cat $POSTOFFICE/.pid.$i`
 		if [ ! -d /proc/$X ]; then
 			REASON="$i dead. Process $X not running"
+			$MAILBIN/zmailer $i
     ps ax >/tmp/$$
+    if [ `grep $MAILBIN/router /tmp/$$ |wc -l` -le 2 ]; then
+		REASON="Router only partially running"
+		$MAILBIN/zmailer router
+    fi
     if ! grep -q smtpserver /tmp/$$; then
 		REASON="No smtpserver running"