
Fwd: ccNotes SMTP problem explained...

A war story from the trenches...

  We installed a new software version, and somewhere along the way from
  2.99.50-s5 to 2.99.51-patch1pre the buffering of TCP fragment
  writeouts got changed -- well, probably when I introduced customizable
  220 headers a while back..   When that change happened, the outgoing
  initial greeting began to go out in multiple fragments on Solaris
  2.5.1, and all hell broke loose...

The patch is in my sources, and I guess I have to publish it properly
before heading out to .HU country for the next week.
(There is this astronomical event on 11th of August, and the trip was

/Matti Aarnio


The symptom was that after the software version change at Sonera's
major SMTP relays, the ccNotes gateway no longer considered the server
fully operational.

The problem didn't get solved until I got back to the office and ran a
network snooper while the ccNotes gateway "test" function executed;
(apparently) only I had a sufficiently crooked mindset to guess
correctly what was going on...

Here is what the network traffic looked like:

(Customer <<->> Sonera: TCP SYN, etc. normal stream open, detailed
	IP/TCP packet acknowledgement traffic omitted for brevity..)

(This is the ORDER in which the packets appeared at the network snooper,
 with some guesses at the reasons why they were treated as they were..)

Sonera ->> Customer:   "220 smtp.tele.fi "

       ccNotes-GW: "Great, 220 reply, I send my greeting there"

       Do note that this packet fragment DOES NOT contain CRLF at
       the end, thus a proper receiver must read more from the stream.
       More data is coming, but not in this same packet!

       It was probably written with several  write()  calls to the
       socket in the UNIX application, and the TCP Nagle algorithm
       didn't delay the first fragment long enough to get the rest of
       them into the same packet!

Customer ->> Sonera:   "HELO smtpgate.Customer\r\n"

       ... this was sent a bit too early ...

       ... of course, many systems try to speed things up by writing
       that greeting to the socket stream right after the connect of
       the socket succeeds.  However, they usually also handle the
       rules of reading protocol responses correctly.

Sonera ->> Customer:   "rest of the initial greeting line\r\n"

       This is the slightly delayed end fragment.

       ccNotes-GW: "Err, that doesn't look right response :-("

Sonera ->> Customer:   "250 smtp.tele.fi Hello smtpgate.Customer\r\n"

The end result of the initially wrong method of reading incoming SMTP
protocol lines was that  ccNotes-GW  got out of sync with the real
SMTP protocol session.

These mistakes are seen every now and then when programmers test only
in environments which always deliver each message in a single TCP frame.

The Windows WinSock API in particular is extremely vulnerable to that
type of mistake!

I would very much like to advocate reusing one single function
for *all* protocol response reading: a function which does a proper
job of handling multiline responses and multiple TCP fragments.
( See RFC 821, appendix E: Theory of Reply Codes )
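
A minimal sketch of such a shared reply reader follows. It is written
in Python purely for illustration; it is not the ccNotes or ZMailer
code, and the names ReplyReader and read_reply are made up here. It
assumes only the RFC 821 reply format: a three-digit code, then "-" on
continuation lines or a space on the last line, each line ending in
CRLF. Data is buffered until a full line is available, so it does not
matter how TCP happened to split the bytes.

```python
import re

# Three-digit code, then "-" (more lines follow) or " " (last line), then text.
_REPLY_LINE = re.compile(rb"^(\d{3})([ -]?)(.*)$")


class ReplyReader:
    """Reads whole SMTP replies from a byte stream, however it is fragmented."""

    def __init__(self, recv):
        # recv: callable taking a max size and returning bytes (b"" at EOF),
        # for example a connected socket's recv method.
        self._recv = recv
        self._buf = b""

    def _read_line(self):
        # Keep reading until a complete CRLF-terminated line is buffered.
        while b"\r\n" not in self._buf:
            chunk = self._recv(4096)
            if not chunk:
                raise ConnectionError("connection closed in mid-reply")
            self._buf += chunk
        line, self._buf = self._buf.split(b"\r\n", 1)
        return line

    def read_reply(self):
        """Return (code, [text lines]) for one complete, possibly multiline reply."""
        texts = []
        while True:
            match = _REPLY_LINE.match(self._read_line())
            if match is None:
                raise ValueError("malformed SMTP reply line")
            code, sep, text = match.groups()
            texts.append(text.decode("ascii", "replace"))
            if sep != b"-":          # a space (or nothing) marks the last line
                return int(code), texts
```

With a real connection this would be used as  ReplyReader(sock.recv) ,
calling  read_reply()  once for the initial greeting and once after
every command sent, so that a greeting arriving in two packets is still
seen as one single 220 reply.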

We "cured" this by getting a quick patch from the software vendor[*];
the patch calls UNIX libc   setvbuf(outstream, NULL, _IOFBF, 8192)
on that protocol response stream at the start of the server process.
The patch aims at sending fewer fragments, and also makes split
fragments far rarer than before.  ( But it *will not* always eliminate
them -- well, you don't use SMTP PIPELINING, thus for
ccNotes-GW clients it will likely eliminate them. )

Oh yes, ccNotes-GW's anti-relay facilities fall short with the
following recipient envelope address:

	RCPT TO:<anybody%anywhere@local.host.name>

which relays through just fine...
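
One way a gateway could catch this form is to look for routing
characters left in the local part once the final domain has been split
off. The sketch below is only an illustration in Python; the function
name is_source_routed is made up, it is not how ccNotes-GW or ZMailer
do their checks, and it deliberately ignores quoted local parts, where
such characters can be legitimate.

```python
def is_source_routed(rcpt):
    """Return True if the address carries routing characters ('%', '!', '@')
    in the part before the final '@'."""
    address = rcpt.strip().lstrip("<").rstrip(">")
    local, sep, _domain = address.rpartition("@")
    if not sep:
        # No '@' at all: the whole string is the local part.
        local = address
    return any(ch in local for ch in "%!@")


print(is_source_routed("<anybody%anywhere@local.host.name>"))
print(is_source_routed("<anybody@local.host.name>"))
```

A recipient flagged this way would then be refused, or at least put
through the same relay policy as its real destination.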

/Matti Aarnio <matti.aarnio@sonera.fi>

   [*] The MTA in question is ZMailer, see:  www.zmailer.org