R H O D E S U N I V E R S I T Y
---------------------------------
C O M P U T I N G C E N T R E
-------------------------------
On the Issue of What to Do to Message Headers
---------------------------------------------
There is some confusion about how to process message headers. The
problem lies in the program on the RURES Cyber that processes
RSCS relay mail for forwarding to Fidonet. Just how exactly
should the following message be forwarded? What must it look
like? What are the issues involved?
Assume that we are dealing with a message like this:-
<-- Contents of message --------------------....->
1 Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)
2 From: <CCFJ@RUPHYS>
3 To: Jim%MIT@rures
4 Subject: TEST
5
6 TEST
7
Should it be
Option 1:-
--------
<-- Contents of message --------------------....->
1 To: Jim@MIT
2 From: <CCFJ@RUPHYS>
3 Subj: TEST
4
5 Return-path: <CCFJ@RUPLA>
6 Received: Wed, 04 Jan 89 09:23:24 +0200 (SAST)
7 Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)
8 From: <CCFJ@RUPLA>
9 To: Jim%MIT@rures
10 Subject: TEST
11
12 TEST
13
or
Option 2:-
--------
<-- Contents of message --------------------....->
1 Return-path: <CCFJ@RUPLA>
2 Received: Wed, 04 Jan 89 09:23:24 +0200 (SAST)
3 Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)
4 From: <CCFJ@RUPLA>
5 To: Jim@MIT
6 Subject: TEST
7
8 TEST
9
or what?
What does RFC 822 specify?
Extract from RFC 822
--------------------
The following subset of syntax rules has been extracted from RFC
822, Appendix D. It has been re-sorted into a better(?) or
clearer(?) order, rather than just alphabetical.
message     = fields *( CRLF *text )        ; Everything after
                                            ; first null line
                                            ; is message body

fields      = dates                         ; Creation time,
              source                        ; author id & one
              1*destination                 ; address required
              *optional-field               ; others optional

dates       = orig-date                     ; Original
              [ resent-date ]               ; Forwarded

source      = [ trace ]                     ; net traversals
              originator                    ; original mail
              [ resent ]                    ; forwarded

trace       = return                        ; path to sender
              1*received                    ; receipt tags

return      = "Return-path" ":" route-addr  ; return address

received    = "Received" ":"                ; one per relay
              ["from" domain]               ; sending host
              ["by" domain]                 ; receiving host
              ["via" atom]                  ; physical path
              *("with" atom)                ; link/mail protocol
              ["id" msg-id]                 ; receiver msg id
              ["for" addr-spec]             ; initial form
              ";" date-time                 ; time received

originator  = authentic                     ; authenticated addr
              [ "Reply-To" ":" 1#address ]

authentic   = "From" ":" mailbox            ; Single author
            / ( "Sender" ":" mailbox        ; Actual submittor
                "From" ":" 1#mailbox)       ; Multiple authors
                                            ; or not sender

There are several things of note.
The Term 'header'
-----------------
The term 'header' is not mentioned; what we loosely call a
header is in fact 'fields'. The syntax of 'fields' is given
above. However, because the term 'header' is so ingrained in my
thinking, I will continue to use it, but it is an exact synonym
for 'fields'.
Expanded 'message' Syntax
-------------------------
Expanding the syntax for 'fields' (ie the header), and leaving
out some definitions of minor interest solely for the purpose of
clarity, we get:-
message     = orig-date
              [ return 1*received ]
              originator
              1*destination
              *optional-field
              *( CRLF *text )
Interpreting this syntax, we get:-
"A message starts with a mandatory date-timestamp line showing
when it was created. There shall be exactly one such line.
"This may be followed by a single return path and any number of
received lines, but if a return path is present then there must
be at least one received line. Also, given the presence of at
least one received line, there shall be exactly one return path.
"There shall be exactly one From field, possibly preceded by a
Sender field as well.
"There shall be at least one destination.
"There may be any or no optional fields.
"There shall then be a blank line, followed by any or no
text."
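The interpretation above can be turned into a small checker. What
follows is only a minimal sketch in Python, not part of any RURES
software: the names split_message and check_fields are made up for
illustration, lines are assumed to end in a bare newline rather than
CRLF, and only To, Cc and Bcc are treated as destinations (the
Resent- variants are ignored).

```python
# Field names counted as destinations in this sketch (an assumption;
# RFC 822 also defines Resent-To and friends, which are ignored here).
DESTINATION_FIELDS = ("to", "cc", "bcc")


def split_message(raw):
    """Split raw text into ([(name, value), ...], body) at the first blank line."""
    header, _, body = raw.partition("\n\n")
    fields = []
    for line in header.split("\n"):
        if not line:
            continue
        if line[0] in " \t" and fields:
            # Continuation line: fold it into the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            fields.append((name.strip().lower(), value.strip()))
    return fields, body


def check_fields(fields):
    """Return a list of rule violations; an empty list means none were found."""
    count = {}
    for name, _ in fields:
        count[name] = count.get(name, 0) + 1
    n_return = count.get("return-path", 0)
    n_received = count.get("received", 0)

    problems = []
    if count.get("date", 0) != 1:
        problems.append("need exactly one Date field")
    if n_return > 1:
        problems.append("more than one Return-path field")
    if n_return >= 1 and n_received == 0:
        problems.append("Return-path without any Received field")
    if n_received >= 1 and n_return == 0:
        problems.append("Received without a Return-path field")
    if count.get("from", 0) != 1:
        problems.append("need exactly one From field")
    if not any(count.get(name, 0) for name in DESTINATION_FIELDS):
        problems.append("need at least one destination field")
    return problems


sample = (
    "Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)\n"
    "From: <CCFJ@RUPHYS>\n"
    "To: Jim%MIT@rures\n"
    "Subject: TEST\n"
    "\n"
    "TEST\n"
)
fields, body = split_message(sample)
print(check_fields(fields))
```

The checker only counts fields; it does not verify the order of the
fields or the syntax of their values.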
What is RFC 822 Defining?
-------------------------
As with most RFCs, there is no attempt to force any internal
operation or structure onto any computer system. Here is an
extract from page 1 of RFC 822:-
(NB. My capitals, also my reformatting, but the concept is from
RFC 822)
(NB further. The CONTENTS are, in fact, the message as defined
in the syntax above)
"... messages are viewed as having an ENVELOPE and CONTENTS.
The ENVELOPE contains whatever information is needed to
accomplish transmission and delivery. The CONTENTS compose the
object to be delivered to the recipient. THIS STANDARD APPLIES
ONLY TO THE FORMAT AND SOME OF THE SEMANTICS OF MESSAGE
CONTENTS. It contains no specification of the information in
the ENVELOPE."
My interpretation of this, in terms of any protocol that moves
mail, is that the envelope will vary drastically with the
protocol, but the contents will be very much the same. Sure,
details of Received: and Return-path: will differ, but
precious little else. No extra header lines will appear simply
because a differing protocol moves the mail.
Now let's look at the next paragraph that follows the one quoted
above:-
"However, some message systems may use information from the
CONTENTS to create the ENVELOPE. It is intended that this
standard facilitate the acquisition of such information by
programs."
Given that the envelope is not part of the message (for message
syntax, see above) and given that the contents are the same (or
very nearly so) regardless of the transport protocol, it seems
clear that any program that feeds a message to the protocol
prior to transmission must form the envelope, and it can do this
by looking at the contents.
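As an illustration of forming the envelope by looking at the
contents, here is a minimal sketch in Python. It is not how the
Cyber program works; the function name make_envelope and the shape
of the returned dictionary are assumptions made for this example,
and the standard library email package does the field parsing.

```python
from email import message_from_string
from email.utils import getaddresses


def make_envelope(contents):
    """Derive a simple envelope (sender, recipients) from message contents."""
    msg = message_from_string(contents)
    senders = getaddresses(msg.get_all("From", []))
    recipients = getaddresses(
        msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Bcc", [])
    )
    return {
        # First From address, or None if the field is missing.
        "sender": senders[0][1] if senders else None,
        # Bare addresses only; display names are dropped.
        "recipients": [addr for _, addr in recipients],
    }


contents = (
    "Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)\n"
    "From: <CCFJ@RUPHYS>\n"
    "To: Jim%MIT@rures\n"
    "Subject: TEST\n"
    "\n"
    "TEST\n"
)
envelope = make_envelope(contents)
print(envelope)
```

The contents are left untouched; the envelope is a separate object
that the transport can use and then discard.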
In the case of our Cyber mail generator, we could feasibly
generate the NJROUTE command without looking at the contents of
the message, but in principle the single To: response serves
the purpose of generating the envelope as well as going directly
into the contents, as part of the header, so this is
splitting hairs.
Regarding the difference between how a message is stored, and
how it is transmitted, RFC 822 follows the above paragraph with
this one:-
"Some message systems may store messages in formats that differ
from the one specified in this standard. This specification is
intended strictly as a definition of what message CONTENT FORMAT
is to be passed BETWEEN hosts."
Note carefully. Store messages in any form that is acceptable
to the host computer. Display them to users in any way that you
like. Destroy them, sort them, hash them, that is a site
decision. BUT TRANSMIT THEM IN THE FORMAT DEFINED BY RFC 822.
To bring this point home, RFC 822 follows on with:-
"Note: This standard is NOT intended to dictate the internal
formats used by sites, the specific message system features that
they are expected to support, or any of the characteristics of
user interface programs that create or read messages."
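The distinction between the stored form and the interchange form can
be sketched as follows. This is an illustration only, in Python: the
internal record layout (a dictionary with "fields" and "body") and
the name to_wire are invented for the example, and a site could
store its messages in any other way it likes.

```python
CRLF = "\r\n"


def to_wire(record):
    """Serialize an internal record into header fields, a blank line, and body."""
    field_lines = ["%s: %s" % (name, value) for name, value in record["fields"]]
    body_lines = record["body"].splitlines()
    # The empty string produces the null line that separates header from body.
    return CRLF.join(field_lines + [""] + body_lines) + CRLF


# Hypothetical internal form: field order and body are kept separately.
record = {
    "fields": [
        ("Date", "Wed, 04 Jan 89 09:22:20 +0200 (SAST)"),
        ("From", "<CCFJ@RUPHYS>"),
        ("To", "Jim@MIT"),
        ("Subject", "TEST"),
    ],
    "body": "TEST\n",
}
wire = to_wire(record)
print(repr(wire))
```

Only the output of to_wire is meant to cross between hosts; the
record itself never leaves the site.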
Conclusion
----------
To get back to the problem, what to do with message headers, it
should be clear that the problem of what is displayed and the
problem of what message is interchanged are two separate
problems. The internal (ie within the host computer) form may
be anything that the host site decides. The interchange form is
clearly spelt out in RFC 822.
Hence, it seems to be sensible to
a) display and/or deliver the message according to option 1, or
in any other way that we will find acceptable to the users of
the Cyber. Add what we choose, remove what we choose, mess up
what we choose, but make sure that users find the results
acceptable,
b) forward it as per option 2.
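To make b) concrete, here is a minimal sketch in Python of producing
the option 2 form from the incoming message. It is not the Cyber
program: the names strip_relay and forward are made up, the relay
name "rures" is taken from the example message, the To: field is
assumed to hold a single unfolded address, and the Received: value
is inserted exactly as supplied by the caller.

```python
def strip_relay(address, relay="rures"):
    """Turn 'user%host@relay' into 'user@host'; leave other addresses alone."""
    local, at, domain = address.rpartition("@")
    if at and domain.lower() == relay.lower() and "%" in local:
        user, _, host = local.rpartition("%")
        return user + "@" + host
    return address


def forward(contents, return_path, received_stamp, relay="rures"):
    """Prepend trace fields and rewrite To: for onward transmission."""
    header, sep, body = contents.partition("\n\n")
    out = ["Return-path: <%s>" % return_path, "Received: " + received_stamp]
    for line in header.split("\n"):
        name, colon, value = line.partition(":")
        if colon and name.strip().lower() == "to":
            line = "To: " + strip_relay(value.strip(), relay)
        out.append(line)
    return "\n".join(out) + sep + body


incoming = (
    "Date: Wed, 04 Jan 89 09:22:20 +0200 (SAST)\n"
    "From: <CCFJ@RUPHYS>\n"
    "To: Jim%MIT@rures\n"
    "Subject: TEST\n"
    "\n"
    "TEST\n"
)
print(forward(incoming, "CCFJ@RUPLA", "Wed, 04 Jan 89 09:23:24 +0200 (SAST)"))
```

The sketch leaves every other field, including From:, exactly as it
arrived.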
Mike Lawrie
12 Jan 1989.