1. Computing & Technology

Discuss in my forum

Recognize Coding

By , About.com Guide

Specify Coding).

You can insert any possible character into any Emacs buffer, but most coding systems can only handle some of the possible characters. This means that it is possible for you to insert characters that cannot be encoded with the coding system that will be used to save the buffer. For example, you could start with an ASCII file and insert a few Latin-1 characters into it, or you could edit a text file in Polish encoded in iso-8859-2 and add some Russian words to it. When you save the buffer, Emacs cannot use the current value of buffer-file-coding-system, because the characters you added cannot be encoded by that coding system.

When that happens, Emacs tries the most-preferred coding system (set by M-x prefer-coding-system or M-x set-language-environment), and if that coding system can safely encode all of the characters in the buffer, Emacs uses it, and stores its value in buffer-file-coding-system. Otherwise, Emacs displays a list of coding systems suitable for encoding the buffer's contents, and asks you to choose one of those coding systems.

If you insert the unsuitable characters in a mail message, Emacs behaves a bit differently. It additionally checks whether the most-preferred coding system is recommended for use in MIME messages; if not, Emacs tells you that the most-preferred coding system is not recommended and prompts you for another coding system. This is so you won't inadvertently send a message encoded in a way that your recipient's mail software will have difficulty decoding. (If you do want to use the most-preferred coding system, you can still type its name in response to the question.)

When you send a message with Mail mode (see Sending Mail), Emacs has four different ways to determine the coding system to use for encoding the message text. It tries the buffer's own value of buffer-file-coding-system, if that is non-nil. Otherwise, it uses the value of sendmail-coding-system, if that is non-nil. The third way is to use the default coding system for new files, which is controlled by your choice of language environment, if that is non-nil. If all of these three values are nil, Emacs encodes outgoing mail using the Latin-1 coding system.

When you get new mail in Rmail, each message is translated automatically from the coding system it is written in, as if it were a separate file. This uses the priority list of coding systems that you have specified. If a MIME message specifies a character set, Rmail obeys that specification, unless rmail-decode-mime-charset is nil.

For reading and saving Rmail files themselves, Emacs uses the coding system specified by the variable rmail-file-coding-system. The default value is nil, which means that Rmail files are not translated (they are read and written in the Emacs internal character code).

©2012 About.com. All rights reserved.

A part of The New York Times Company.