Re: more from: rewrite lossage



> Zmailer is doing some Bad Things with long addresses in the From: line --
> to wit, I'm seeing lines like:
> 
> From:   User Name Here <very.very#long#address_with.weird.characters@host.
>         novell.com>
> 
> The line is being split in the middle of the address, clearly for cosmetic
> reasons.  The problem is, when someone replies to that address in Berkmail,
> the lines get joined with a space in between -- as in
> 
> very.very#long#address_with.weird.characters@host. novell.com
> 
> That address bounces quicker than you can say RFC822.

  No, it takes 10-20 seconds before the router processes it -- assuming
you use Zmailer, and not sendfail..

> I'll probably have to fix our local Berkmail (and possibly a slew of other
> user agents), but does it occur to anyone else that such rewriting is
> probably not a great idea?  Is it even legal, in strict RFC822?

  It is legal -- in strict RFC822! -- but not recommended.
Hmm.. No, RFC-822 contradicts itself here a bit, see below:

---------- RFC 822 ------------
     3.1.1.  LONG HEADER FIELDS

        Each header field can be viewed as a single, logical  line  of
        ASCII  characters,  comprising  a field-name and a field-body.
        For convenience, the field-body  portion  of  this  conceptual
        entity  can be split into a multiple-line representation; this
        is called "folding".  The general rule is that wherever  there
        may  be  linear-white-space  (NOT  simply  LWSP-chars), a CRLF
        immediately followed by AT LEAST one LWSP-char may instead  be
        inserted.  Thus, the single line

            To:  "Joe & J. Harvey" <ddd @Org>, JJV @ BBN

        can be represented as:

            To:  "Joe & J. Harvey" <ddd @ Org>,
                    JJV@BBN

        and

            To:  "Joe & J. Harvey"
                            <ddd@ Org>, JJV
             @BBN

        and

            To:  "Joe &
             J. Harvey" <ddd @ Org>, JJV @ BBN

             The process of moving  from  this  folded   multiple-line
        representation  of a header field to its single line represen-
        tation is called "unfolding".  Unfolding  is  accomplished  by
        regarding   CRLF   immediately  followed  by  a  LWSP-char  as
        equivalent to the LWSP-char.

        Note:  While the standard  permits  folding  wherever  linear-
               white-space is permitted, it is recommended that struc-
               tured fields, such as those containing addresses, limit
               folding  to higher-level syntactic breaks.  For address
               fields, it  is  recommended  that  such  folding  occur
               between addresses, after the separating comma.
---------- and onwards ... --------
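
To make the unfolding rule in 3.1.1 concrete, here is a minimal sketch in
Python. The function name `unfold` is made up for illustration, and
accepting a bare LF as well as CRLF is an assumption about what a user
agent sees on disk; none of this is Zmailer or Berkmail code.

```python
import re

def unfold(header: str) -> str:
    """Unfold one header field as described in RFC 822, section 3.1.1.

    A line break immediately followed by a LWSP-char (space or tab) is
    treated as equivalent to that LWSP-char, so only the line break is
    dropped and the white space after it stays in place.
    """
    # Strict RFC 822 says CRLF; a bare LF is tolerated here as an assumption.
    return re.sub(r"\r?\n(?=[ \t])", "", header)
```

Note that the white space after the break survives unfolding. For the
header quoted at the top, the unfolded line still has blanks between
"host." and "novell.com", which is what a replying user agent then has
to cope with.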


> (That is, is zmailer non-compliant here, or is Berkmail?  The latter is
>  much more likely, but..)

  Section 3.1.4 tells you more about why it is d*n difficult to write
a parser that does it right...

-----------
       So, for example, the folded body of an address field

            ":sysmail"@  Some-Group. Some-Org,
            Muhammed.(I am  the greatest) Ali @(the)Vegas.WBA
...
        The canonical representations for the data in these  addresses
        are the following strings:

                        ":sysmail"@Some-Group.Some-Org
        and
                            Muhammed.Ali@Vegas.WBA
-----------
however:
-----------
        Note:  For purposes of display, and when passing  such  struc-
               tured information to other systems, such as mail proto-
               col  services,  there  must  be  NO  linear-white-space
               between  <word>s  that are separated by period (".") or
               at-sign ("@") and exactly one SPACE between  all  other
               <word>s.  Also, headers should be in a folded form.
-----------
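
As a rough illustration of what the 3.1.4 note asks for, the sketch
below reduces a single, already unfolded address (local-part@domain) to
its canonical form: comments in parentheses are dropped, quoted strings
are kept as they are, and all other white space is removed. The name
`squeeze_addr` is hypothetical. It assumes the input is one bare address
with no display phrase and no angle brackets, so every remaining <word>
is separated by "." or "@"; it is not a full RFC 822 parser and not the
router's code.

```python
def squeeze_addr(addr: str) -> str:
    """Canonicalize one bare RFC 822 address (no phrase, no <...>)."""
    out = []
    depth = 0                      # nesting level of (comments)
    i, n = 0, len(addr)
    while i < n:
        c = addr[i]
        if depth:                  # inside a comment: discard everything
            if c == "\\":
                i += 1             # skip the backslash-quoted character
            elif c == "(":
                depth += 1
            elif c == ")":
                depth -= 1
        elif c == "(":
            depth = 1
        elif c == '"':             # copy a quoted string verbatim
            j = i + 1
            while j < n and addr[j] != '"':
                j += 2 if addr[j] == "\\" else 1
            out.append(addr[i:j + 1])
            i = j
        elif c not in " \t\r\n":   # drop white space between words
            out.append(c)
        i += 1
    return "".join(out)
```

Applied to the two RFC examples above, this gives the canonical strings
the RFC lists, and applied to the joined address from the original
complaint it removes the stray space after "host.".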

So I think it is time to look deeply into the router code...

	/Matti Aarnio <mea@utu.fi>