Charset7, charset8, charsetesc Channel Options

Automatic character set labelling
The MIME specification provides a mechanism to label the charset used in plain text messages: A "charset=" parameter can be specified as part of the Content-type: header line. Various charset names are defined in MIME, including US-ASCII (the default), ISO-8859-1, ISO-8859-2, and so on, and many more have been registered with the Internet Assigned Numbers Authority (IANA).

Some existing systems and user agents, however, do not provide any mechanism for generating these charset labels. The charset7, charset8, and charsetesc channel options, when placed on a source channel, provide a mechanism to specify charset names to be inserted into message headers. Each option requires an argument giving a charset name. The names are not checked for validity. Note, however, that charset conversion can only be done on charsets specified in the MTA's charset definition file. The names defined in this file should be used if possible.

The charset7 name is used if the message contains only seven bit characters; the charset8 name is used if eight bit data is found in the message; the charsetesc name is used if a message containing only seven bit data happens to contain one or more escape characters. If the appropriate option is not specified, no character set name is inserted during MIME processing into Content-type: header lines for text parts that lack an existing charset label.
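The selection rule described above can be sketched in Python. This is an illustration of the documented behavior only, not the MTA's implementation: the function name `pick_charset_label` and its parameters are hypothetical, and the sketch assumes that "escape character" means the ASCII ESC byte (0x1B).

```python
from typing import Optional

ESC = 0x1B  # ASCII escape character (assumed meaning of "escape character")


def pick_charset_label(body: bytes,
                       charset7: Optional[str] = None,
                       charset8: Optional[str] = None,
                       charsetesc: Optional[str] = None) -> Optional[str]:
    """Return the charset name to insert, or None to leave the part unlabelled.

    Hypothetical helper: the three keyword arguments stand in for the values
    of the charset7, charset8, and charsetesc channel options.
    """
    # Any byte above 0x7F means the text contains eight bit data.
    if any(b > 0x7F for b in body):
        return charset8
    # Seven bit only, but containing at least one escape character.
    if ESC in body:
        return charsetesc
    # Plain seven bit text.
    return charset7
```

Note that when the option matching the message content is unset, the sketch returns None rather than falling back to one of the other options, which mirrors the statement that no charset name is inserted in that case.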

When the presence of a charset7, charset8, or charsetesc channel option on a channel causes a MIME "charset" parameter clause to be added to an incoming message, the message also gets the more fundamental MIME-version: and Content-type: header lines added, if they are not already present.

New in Messaging Server 7.4-18.01, the charset7, charset8, and charsetesc channel options also cause labelling (with the specified charset) of incoming illegal, unlabelled parameter values on MIME Content-type: or Content-disposition: header lines, when such parameters must be encoded due to their content. That is, while the MTA has always encoded such parameter values (to make them syntactically legal, if not necessarily usable), the new feature is that charset labelling is inserted as well, making them more usable.

Note that the charset8 option also controls the MIME encoding of eight bit characters found in message headers (where such eight bit data is unconditionally illegal). The MTA normally MIME encodes any such (illegal) eight bit data encountered in message headers, labelling it as the UNKNOWN charset if no charset8 value has been specified on the current source channel. (Actual addresses are a special case. In the actual address, that is, in the RFC 822 addr-spec, where eight bit data categorically must not appear, the MTA replaces any eight bit data with the asterisk character, *. Note, however, that an RFC 822 phrase, or "personal name", is subject to the MIME encoding of illegal eight bit data described above, using the charset8 charset name.)
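As a rough illustration of the header handling described above, the following Python sketch encodes an eight bit phrase as a MIME encoded-word and replaces eight bit data in an addr-spec with asterisks. The function names are hypothetical, and the use of the "B" (base64) encoding is an assumption made for the example; the text does not say which encoding the MTA chooses, and the sketch ignores encoded-word length limits.

```python
import base64
from typing import Optional


def encode_phrase(raw: bytes, charset8: Optional[str] = None) -> str:
    """Encode a header phrase (personal name) that may hold eight bit data.

    Hypothetical helper: charset8 stands in for the channel option value.
    """
    # Pure seven bit phrases need no encoding.
    if all(b < 0x80 for b in raw):
        return raw.decode("ascii")
    # Without a charset8 value the data is labelled as UNKNOWN.
    label = charset8 or "UNKNOWN"
    # Assumption: base64 ("B") encoded-word form.
    payload = base64.b64encode(raw).decode("ascii")
    return f"=?{label}?B?{payload}?="


def sanitize_addr_spec(raw: bytes) -> str:
    """Replace each eight bit byte in an addr-spec with an asterisk."""
    return "".join(chr(b) if b < 0x80 else "*" for b in raw)
```

For example, under these assumptions `encode_phrase(b"Ren\xe9", "ISO-8859-1")` yields an encoded-word labelled ISO-8859-1, while the same call without a charset name yields one labelled UNKNOWN.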

These charset specifications never override existing labels; that is, they have no effect if a message already has a charset label or is of a type other than text.
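The "never override" rule, together with the addition of the basic MIME header lines, can be sketched with Python's standard `email` package. This is a minimal sketch for a single-part message, not the MTA's actual processing; the function name `maybe_label` is hypothetical.

```python
from email import message_from_string
from email.message import Message
from typing import Optional


def maybe_label(msg: Message, charset: Optional[str]) -> None:
    """Add a charset label only when the message is text and has none.

    Hypothetical helper: charset is the name already chosen from the
    channel options, or None if the relevant option is unset.
    """
    if charset is None:
        return  # no applicable option, so nothing is inserted
    if msg.get_content_maintype() != "text":
        return  # non-text types are never labelled
    if msg.get_param("charset") is not None:
        return  # an existing label is never overridden
    # set_param creates a text/plain Content-Type header if none exists.
    msg.set_param("charset", charset)
    if "MIME-Version" not in msg:
        msg["MIME-Version"] = "1.0"


# Usage: an unlabelled message gains a label; a labelled one is untouched.
plain = message_from_string("Subject: hi\n\nbody\n")
maybe_label(plain, "ISO-8859-1")
labelled = message_from_string(
    "Content-Type: text/plain; charset=utf-8\n\nbody\n")
maybe_label(labelled, "ISO-8859-1")
```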

The charsetesc option tends to be particularly useful on channels that receive unlabelled messages using Japanese or Korean character sets that contain the escape character (for example, iso-2022-jp or iso-2022-kr).

See also:
 * Channel options
 * Character set conversion
 * headerset7 Option
 * Character sets and eight bit data channel options