CHARSET-CONVERSION mapping table

From Messaging Server Technical Reference Wiki

The CHARSET-CONVERSION mapping table specifies what sorts of channel-to-channel character set conversions and message reformatting should be done. As suggested by the mapping name, character set conversion is its primary purpose.

The CHARSET-CONVERSION mapping can also be used to alter the format of messages. Facilities are provided to convert a number of non-MIME formats into MIME. Changes to MIME encodings and structure are also possible. These options are used when messages are being relayed to systems that support only MIME or some subset of MIME. Finally, conversion from MIME into non-MIME formats is provided in a small number of cases.

As of MS 8.0.2.1, enabling charset conversions also checks for and attempts to mitigate the various so-called MailSploit attacks. More specifically, if charset conversions are enabled for a given destination, the MTA will:

  1. Check the content of encoded-words during submission and remove unnecessary ones. In particular, encoded-words consisting of nothing but an atom or domain will be decoded. Note that this happens only when using SUBMIT, which should always occur before DKIM or similar protection mechanisms are applied.
  2. Remove control characters other than those needed for MIME-compatible charsets. This includes, but is not limited to, NUL, CR, and LF.
  3. Remove any empty encoded-words that result from (2).
  4. Recognize and process encoded-words outside the contexts where they normally appear. (This behavior is appropriate for an MTA seeking to mitigate attacks that may depend on broken clients that recognize encoded-words outside the contexts where they normally occur.)

Note that the removal of NULs and similar CTLs is justified from a standards perspective, since RFC 2047 requires that encoded-words represent printable material in a MIME text/US-ASCII compatible charset. As such, this material should not be present in standards-compliant encoded-words.
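Steps (2) and (3) of the list above can be illustrated with a short Python sketch. This is not the MTA's implementation; it is a minimal illustration under stated assumptions. The function name sanitize_encoded_words is hypothetical, the set of characters removed (all C0 controls plus DEL, applied to the decoded text) is an assumption, and cleaned words are always re-encoded with the B encoding for simplicity. Steps (1) and (4) are not covered.

```python
import base64
import quopri
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")

# Assumption: strip every C0 control character and DEL from the decoded text.
# This covers NUL, CR and LF; the MTA's real list is not given in the source.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_encoded_words(header_value):
    """Hypothetical helper: clean control characters out of encoded-words."""

    def _clean(match):
        charset = match.group(1)
        encoding = match.group(2).upper()
        payload = match.group(3)
        try:
            if encoding == "B":
                raw = base64.b64decode(payload, validate=True)
            else:
                # header=True makes "_" decode to a space, as the Q encoding requires.
                raw = quopri.decodestring(payload.encode("ascii"), header=True)
            text = raw.decode(charset)
        except (LookupError, ValueError):
            # Unknown charset or malformed payload: leave the word untouched.
            return match.group(0)

        cleaned = CONTROL_CHARS.sub("", text)
        if not cleaned:
            return ""  # step (3): drop encoded-words that end up empty
        if cleaned == text:
            return match.group(0)  # nothing removed, keep the original word
        encoded = base64.b64encode(cleaned.encode(charset)).decode("ascii")
        return "=?" + charset + "?B?" + encoded + "?="

    return ENCODED_WORD.sub(_clean, header_value)
```

For example, an encoded-word whose payload decodes to "a", NUL, "b" is rewritten so that it decodes to "ab", while an encoded-word containing only a NUL is removed entirely.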

The MTA will probe the CHARSET-CONVERSION mapping table in two different ways. The first probe is used to determine whether or not the MTA should reformat (or at least process) the message and if so, what formatting options should be used. (If no reformatting is specified, then the MTA does not bother to check for specific character set conversions.) The input string for this first probe has (by default) the general form:


IN-CHAN=in-channel;OUT-CHAN=out-channel;CONVERT 

Here in-channel is the name of the source channel (where the message comes from) and out-channel is the name of the destination channel (where the message is going). New in MS 6.3, setting bit 0 (value 1) of the include_conversiontag MTA option causes this first probe to instead have the form:


IN-CHAN=in-channel;OUT-CHAN=out-channel;TAG=tag-list;CONVERT 

where tag-list is a comma-separated list of any conversion tags present on the message.
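The two probe forms can be sketched in Python as follows. The function build_convert_probe and its parameters are hypothetical names used only for illustration, and the channel names in the usage note are example values. The behavior when bit 0 is set but the message carries no conversion tags is not stated here; the sketch assumes an empty TAG= field in that case.

```python
def build_convert_probe(in_channel, out_channel, tags=(), include_conversiontag=0):
    """Hypothetical helper: build the first CHARSET-CONVERSION probe string."""
    parts = ["IN-CHAN=" + in_channel, "OUT-CHAN=" + out_channel]
    # Bit 0 (value 1) of include_conversiontag adds the TAG field to the probe.
    if include_conversiontag & 1:
        # Assumption: an empty tag list produces an empty "TAG=" field.
        parts.append("TAG=" + ",".join(tags))
    parts.append("CONVERT")
    return ";".join(parts)
```

With the default option value, build_convert_probe("tcp_local", "ims-ms") yields IN-CHAN=tcp_local;OUT-CHAN=ims-ms;CONVERT. With include_conversiontag=1 and tags ("a", "b"), the result is IN-CHAN=tcp_local;OUT-CHAN=ims-ms;TAG=a,b;CONVERT.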

If the probe matches the pattern (left-hand side) of a CHARSET-CONVERSION mapping table entry, then the resulting string (right-hand side of the mapping entry) should be a comma-separated list of keywords. The following keywords are provided:

CHARSET-CONVERSION mapping keywords

Keyword           Action
----------------  ------------------------------------------------------------
Always            Enable conversion
Appledouble       Convert other MacMIME formats to Appledouble format
Applesingle       Convert other MacMIME formats to Applesingle format
BASE64            Switch MIME encodings to BASE64
Binhex            Convert other MacMIME formats, or parts including Macintosh type and Mac creator information, to Binhex format
Block             Extract just the data fork from MacMIME format parts
Bottom            "Flatten" any message/rfc822 body part (forwarded message) into a message content part and a header part
Delete            "Flatten" any message/rfc822 body part (forwarded message) into a message content part, deleting the forwarded headers
Level             Remove redundant multipart levels from message
Macbinary         Convert other MacMIME formats, or parts including Macintosh type and Macintosh creator information, to Macbinary format
No                Disable conversion
Pathworks         Convert message to Pathworks Mail format
QUOTED-PRINTABLE  Switch MIME encodings to QUOTED-PRINTABLE
Record,Text       Line wrap text/plain parts at 80 characters
Record,Text=n     Line wrap text/plain parts at n characters
RFC1154           Convert message to RFC 1154 format
Thurman           Convert some non-standard "attachments" to MIME format
Top               "Flatten" any message/rfc822 body part (forwarded message) into a header part and a message content part
UUENCODE          Switch MIME encodings to X-UUENCODE
Yes               Enable conversion

If a match is found, the MTA will perform any requested message reformatting, discussed further in Message reformatting, and for text parts, also check whether charset conversion is desired, as discussed further in Character set conversion. A No is assumed if no match occurs.
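The lookup described above, including the implicit No when nothing matches, can be sketched in Python. This is a simplified model, not the MTA's mapping engine. It assumes that entries are tried in order with the first match winning, that * matches any run of characters and % matches exactly one character, and that matching is case-sensitive. Other mapping metacharacters and template substitutions are ignored. The names pattern_to_regex and lookup_keywords, and the sample entries in the usage note, are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified mapping pattern into a compiled regex."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append(".*")  # assumption: * matches zero or more characters
        elif ch == "%":
            pieces.append(".")   # assumption: % matches exactly one character
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


def lookup_keywords(entries, probe):
    """Return the keyword list for the first matching entry, or ["No"]."""
    for pattern, template in entries:
        if pattern_to_regex(pattern).fullmatch(probe):
            # The right-hand side is a comma-separated list of keywords.
            return [kw.strip() for kw in template.split(",") if kw.strip()]
    return ["No"]  # a No is assumed if no match occurs
```

Given the entries [("IN-CHAN=tcp_local;OUT-CHAN=ims-ms;CONVERT", "Yes,Record,Text=72"), ("IN-CHAN=*;OUT-CHAN=tcp_*;CONVERT", "No")], a probe for tcp_local to ims-ms returns ["Yes", "Record", "Text=72"], and a probe that matches neither entry returns ["No"].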

If the message has a conversion tag set, the "T" flag will be set. When the probe matches the pattern (left-hand side) of an entry, this flag can be tested with a $:T test in the template (right-hand side) output string. Such tests are more commonly used in the CONVERSIONS mapping table, but may occasionally be useful in the CHARSET-CONVERSION mapping table as well.


See also: