
Re: Reworking on smtpserver subsystems..

To: Matti Aarnio <mea@nic.funet.fi>
Subject: Re: Reworking on smtpserver subsystems..
From: Eugene Crosser <crosser@rol.ru>
Date: Mon, 05 Apr 2004 12:46:15 +0400
Cc: zmailer@nic.funet.fi
In-Reply-To: <20040402230422.Z29421@nic.funet.fi>
Organization: Sovintel
Original-Recipient: rfc822;zmailer-log@nic.funet.fi
References: <20040401200023.W29421@nic.funet.fi> <1080900412.6647.49.camel@ariel.sovam.com> <20040402230422.Z29421@nic.funet.fi>
Sender: zmailer-owner@nic.funet.fi

On Sat, 2004-04-03 at 00:04, Matti Aarnio wrote:
> On Fri, Apr 02, 2004 at 02:06:53PM +0400, Eugene Crosser wrote:
> > My thoughts on this matter are as follows:
> > 
> > - Forking model should be Apache-like, with preforked processes that can

> Transport agents are almost like that, there is no upper bound in

First, a couple of notes: my sentiments in this message, as well as in
the previous one, are more about "how I would like to see it in an ideal
world" than "how I suggest to do it" ;-)  Then, some of them exactly
match the current state of things.  That's why I like Zmailer, after
all :-)

What we have now are managed server farms that look for jobs on their own
(router farm), "one session - one process" farms that accept jobs on
sockets (smtpserver), and farms where process management and job
management are performed by a single entity (the scheduler managing the
transport agent farm).

What I think would make sense is a uniform model for server farms
(Apache-like), all of them getting job requests on (tcp or unix domain)
sockets.  There may be a router farm, a contentfilter farm, and possibly
a number of transport agent farms (e.g. one farm per channel).  Thus the
scheduler (which probably has to be a single process) will not have to
manage process starting and terminating, and can concentrate on queue
management.  On the other hand, farm managers could be very simple
multiplexors, in most cases doing nothing but wait() and occasionally
fork() (because on most systems even multiplexing is not necessary: the
children just accept() on the same listening socket and the OS
dispatches connections to them as appropriate).
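
A minimal sketch of such a farm, in Python for brevity rather than the C
that ZMailer is written in.  The names start_farm, submit_job and
reap_farm, the socket path and the reply format are all hypothetical;
the only point is that the manager does nothing but fork() and wait(),
while the children accept() on one shared listening socket.

```python
import os
import socket
import tempfile


def start_farm(path, workers=2):
    """Manager side: bind one listening socket, then prefork the workers."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(16)
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: the listening socket is inherited across fork(); the
            # kernel hands each incoming connection to one waiting child.
            try:
                conn, _ = listener.accept()
                with conn:
                    job = conn.recv(1024)  # assumes a small, single-send job
                    conn.sendall(b"%d %s" % (os.getpid(), job))
            finally:
                os._exit(0)  # this sketch serves one job per worker
        pids.append(pid)
    return listener, pids


def submit_job(path, job):
    """Client side: connect, send one job, read the reply until EOF."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(job)
        reply = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                return reply
            reply += chunk


def reap_farm(pids):
    """Manager side: wait() for every child that has exited."""
    return [os.waitpid(pid, 0)[0] for pid in pids]


def demo():
    path = os.path.join(tempfile.mkdtemp(), "farm.sock")
    listener, pids = start_farm(path, workers=2)
    replies = [submit_job(path, b"job-%d" % i) for i in range(2)]
    reaped = reap_farm(pids)
    listener.close()
    return replies, pids, reaped
```

A real farm manager would of course fork a replacement whenever wait()
returns, and a real worker would loop over many jobs before exiting.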

Now, smtpserver will inject jobs to the router farm over a socket, the
router passes them to the scheduler, again over another socket, and the
scheduler passes them to the transport agent much like it does now *but*
over a socket rather than a pipe.
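
The hand-off between stages can be sketched as follows, again in Python
and only as an illustration.  The newline-terminated handle format and
the helper names send_job, recv_job and relay_one are assumptions made
for this example, not an existing ZMailer protocol.

```python
import socket


def send_job(sock, handle):
    """Send one queue object handle, terminated by a newline."""
    sock.sendall(handle.encode("ascii") + b"\n")


def recv_job(sock):
    """Read one newline-terminated handle (byte by byte, for simplicity)."""
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise EOFError("peer closed before a full handle arrived")
        buf += chunk
    return buf[:-1].decode("ascii")


def relay_one(upstream, downstream):
    """One stage: take a handle from upstream, pass it on downstream."""
    handle = recv_job(upstream)
    # A real router or scheduler would do its work on the message here.
    send_job(downstream, handle)
    return handle


def demo(handle="S12345"):
    # Two socket pairs stand in for smtpserver->router and router->scheduler.
    smtp_out, router_in = socket.socketpair()
    router_out, scheduler_in = socket.socketpair()
    try:
        send_job(smtp_out, handle)
        relay_one(router_in, router_out)
        return recv_job(scheduler_in)
    finally:
        for s in (smtp_out, router_in, router_out, scheduler_in):
            s.close()
```

Only the handle travels over the socket; the message itself stays in
the queue, so every stage uses the same small protocol.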

On reboot/restart, a separate ("run once") process can traverse the
queue directories and inject old jobs, emulating smtpserver for the
router, and another one can do the same emulating the router for the
scheduler.

You are quite right that there is an issue with permissions handling of
unix domain sockets on some systems (all SysVs maybe?), but there is an
easy workaround: create them in a directory with appropriate
permissions.
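
A sketch of that workaround, in Python for brevity.  The function name
bind_private_socket, the socket name router.sock and the 0700 mode are
illustrative assumptions; the idea is only that access is controlled by
the directory, not by the socket file itself.

```python
import os
import socket
import stat
import tempfile


def bind_private_socket(directory, name="router.sock", mode=0o700):
    """Create a unix domain socket inside a directory with tight permissions."""
    os.makedirs(directory, exist_ok=True)
    # Set the mode explicitly: makedirs() is subject to the umask and does
    # not touch a directory that already exists.
    os.chmod(directory, mode)
    path = os.path.join(directory, name)
    try:
        os.unlink(path)  # remove a stale socket left by a previous run
    except FileNotFoundError:
        pass
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(16)
    return sock, path


def demo():
    directory = os.path.join(tempfile.mkdtemp(), "sockets")
    sock, path = bind_private_socket(directory)
    dir_mode = stat.S_IMODE(os.stat(directory).st_mode)
    is_socket = stat.S_ISSOCK(os.stat(path).st_mode)
    sock.close()
    return dir_mode, is_socket
```

Whatever the system does with the mode bits of the socket itself, a
process that cannot search the directory cannot connect to it.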

This approach also minimizes the "fd space pollution".  The only
bottleneck continues to be the scheduler, but even that will need half
as many descriptors as it uses now.

The smtpserver is somewhat special because it accepts real data on the
socket, but it could still benefit from the farm approach.  Even if the
current code does not allow process reuse (due to lack of object
destructors and otherwise unclean memory management), it *may* still
make sense to keep some preforked processes (or may not...)

> Present "1 message, 1 file" (or two files after router) has benefits,
> but commercial systems use also approaches, where the queued messages
> are inside a database..

If sockets are used to pass queue object handles, then the current file
approach is quite easily extended to a database approach: only the
interface functions will change, not the "business logic" code.  Having
queue objects in a database would (theoretically) allow keeping the
queue on one physical server, smtpserver on another, router on a third,
etc.  All of them would communicate over tcp sockets and access the
actual message data via the particular database API.
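
To illustrate the "only interface functions change" point, here is a
small Python sketch.  The QueueStore interface, both implementations
and the count_recipients function are hypothetical, and sqlite merely
stands in for whatever database might be used.

```python
import os
import sqlite3
import tempfile
from abc import ABC, abstractmethod


class QueueStore(ABC):
    """Interface functions: the only part that knows where messages live."""

    @abstractmethod
    def put(self, data: bytes) -> str: ...

    @abstractmethod
    def get(self, handle: str) -> bytes: ...


class FileQueueStore(QueueStore):
    """One message, one file; the handle is the file name."""

    def __init__(self, directory):
        self.directory = directory
        self.counter = 0

    def put(self, data):
        self.counter += 1
        handle = "msg-%d" % self.counter
        with open(os.path.join(self.directory, handle), "wb") as f:
            f.write(data)
        return handle

    def get(self, handle):
        with open(os.path.join(self.directory, handle), "rb") as f:
            return f.read()


class SqliteQueueStore(QueueStore):
    """Messages as rows; the handle is the row id."""

    def __init__(self, dbpath=":memory:"):
        self.db = sqlite3.connect(dbpath)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, body BLOB)")

    def put(self, data):
        cur = self.db.execute("INSERT INTO queue (body) VALUES (?)", (data,))
        self.db.commit()
        return str(cur.lastrowid)

    def get(self, handle):
        row = self.db.execute(
            "SELECT body FROM queue WHERE id = ?", (int(handle),)).fetchone()
        return bytes(row[0])


def count_recipients(store, handle):
    """Stand-in for "business logic": it sees only the store and a handle."""
    body = store.get(handle)
    return sum(1 for line in body.splitlines()
               if line.lower().startswith(b"rcpt to:"))


def demo():
    message = b"MAIL FROM:<a@example.org>\nRCPT TO:<b@example.org>\nRCPT TO:<c@example.org>\n"
    stores = [FileQueueStore(tempfile.mkdtemp()), SqliteQueueStore()]
    return [count_recipients(s, s.put(message)) for s in stores]
```

The same count_recipients runs unchanged against either store, which is
all that the handle-passing design needs from the storage layer.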

> Anyway, such is quite far from "minimum technology needed" approach that
> ZMailer has used so far.

To me, it even looks like a simplification - use one technology for all
communication...  Of course sockets are one step "higher tech" than
pipes, but who has socket-less systems nowadays?

> socketpair(AF_xxx, SOCK_STREAM, ..),  or SysV bidirectional pipes ?
> Or just two pipe(2)s ?
> 
> They sure are the most portable, but it isn't the first time that
> we do things "under the hood" in different ways in some situations.
> See for example  scheduler/pipes.c   :-)

I vote for named sockets, possibly upgradable to tcp sockets in the
future (and separation of process creation from passing job requests,
see above).

Eugene
