[Emacs-diffs] emacs/doc/lispref nonascii.texi
From: Eli Zaretskii
Subject: [Emacs-diffs] emacs/doc/lispref nonascii.texi
Date: Sat, 22 Nov 2008 18:22:36 +0000
CVSROOT: /cvsroot/emacs
Module name: emacs
Changes by: Eli Zaretskii <eliz> 08/11/22 18:22:36
Modified files:
doc/lispref : nonascii.texi
Log message:
(Character Codes, Character Sets)
(Scanning Charsets, Translation of Characters): Update for Emacs 23.
(Chars and Bytes, Splitting Characters): Sections removed.
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/emacs/doc/lispref/nonascii.texi?cvsroot=emacs&r1=1.9&r2=1.10
Patches:
Index: nonascii.texi
===================================================================
RCS file: /cvsroot/emacs/emacs/doc/lispref/nonascii.texi,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -b -r1.9 -r1.10
--- nonascii.texi 1 Nov 2008 16:31:47 -0000 1.9
+++ nonascii.texi 22 Nov 2008 18:22:36 -0000 1.10
@@ -21,8 +21,6 @@
codes of individual characters.
* Character Sets:: The space of possible character codes
is divided into various character sets.
-* Chars and Bytes:: More information about multibyte encodings.
-* Splitting Characters:: Converting a character to its byte sequence.
* Scanning Charsets:: Which character sets are used in a buffer?
* Translation of Characters:: Translation tables are used for conversion.
* Coding Systems:: Coding systems are conversions for saving files.
@@ -47,10 +45,11 @@
unique number, called a @dfn{codepoint}, to each and every character.
The range of codepoints defined by Unicode, or the Unicode
@dfn{codespace}, is @code{0..10FFFF} (in hex) inclusive. Emacs
-extends this range with codepoints in the range @code{3FFF80..3FFFFF},
-which it uses for representing raw 8-bit bytes that cannot be
-interpreted as characters. Thus, a character codepoint in Emacs is a
-22-bit integer number.
+extends this range with codepoints in the range @code{110000..3FFFFF},
+which it uses for representing characters that are not unified with
+Unicode and raw 8-bit bytes that cannot be interpreted as characters
+(the latter occupy the range @code{3FFF80..3FFFFF}). Thus, a
+character codepoint in Emacs is a 22-bit integer number.
@cindex internal representation of characters
@cindex characters, representation in buffers and strings
@@ -76,10 +75,10 @@
writes text to a disk file or passes it to some other process.
Occasionally, Emacs needs to hold and manipulate encoded text or
-binary non-text data in its buffer or string. For example, when Emacs
-visits a file, it first reads the file's text verbatim into a buffer,
-and only then converts it to the internal representation. Before the
-conversion, the buffer holds encoded text.
+binary non-text data in its buffers or strings. For example, when
+Emacs visits a file, it first reads the file's text verbatim into a
+buffer, and only then converts it to the internal representation.
+Before the conversion, the buffer holds encoded text.
@cindex unibyte text
Encoded text is not really text, as far as Emacs is concerned, but
@@ -125,9 +124,15 @@
@end defun
@defun byte-to-position byte-position
-Return the buffer position, in character units, corresponding to
-byte-position @var{byte-position} in the current buffer. If
address@hidden is out of range, the value is @code{nil}.
+Return the buffer position, in character units, corresponding to the given
address@hidden in the current buffer. If @var{byte-position} is
+out of range, the value is @code{nil}. In a multibyte buffer, an
+arbitrary value of @var{byte-position} might not be at a character
+boundary, but inside a multibyte sequence representing a single
+character; in this case, this function returns the buffer position of
+the character whose multibyte sequence includes @var{byte-position}.
+In other words, the value does not change for all byte positions that
+belong to the same character.
@end defun
@defun multibyte-string-p string
@@ -151,10 +156,11 @@
@section Converting Text Representations
Emacs can convert unibyte text to multibyte; it can also convert
-multibyte text to unibyte, though this conversion loses information. In
-general these conversions happen when inserting text into a buffer, or
-when putting text from several strings together in one string. You can
-also explicitly convert a string's contents to either representation.
+multibyte text to unibyte, provided that the multibyte text contains
+only @acronym{ASCII} and 8-bit characters. In general, these
+conversions happen when inserting text into a buffer, or when putting
+text from several strings together in one string. You can also
+explicitly convert a string's contents to either representation.
Emacs chooses the representation for a string based on the text that
it is constructed from. The general rule is to convert unibyte text to
@@ -173,89 +179,40 @@
user that cannot be overridden automatically.
Converting unibyte text to multibyte text leaves @acronym{ASCII} characters
-unchanged, and likewise character codes 128 through 159. It converts
-the address@hidden codes 160 through 255 by adding the value
address@hidden to each character code. By setting this
-variable, you specify which character set the unibyte characters
-correspond to (@pxref{Character Sets}). For example, if
address@hidden is 2048, which is @code{(- (make-char
-'latin-iso8859-1) 128)}, then the unibyte address@hidden characters
-correspond to Latin 1. If it is 2688, which is @code{(- (make-char
-'greek-iso8859-7) 128)}, then they correspond to Greek letters.
-
- Converting multibyte text to unibyte is simpler: it discards all but
-the low 8 bits of each character code. If @code{nonascii-insert-offset}
-has a reasonable value, corresponding to the beginning of some character
-set, this conversion is the inverse of the other: converting unibyte
-text to multibyte and back to unibyte reproduces the original unibyte
-text.
-
address@hidden nonascii-insert-offset
-This variable specifies the amount to add to a address@hidden character
-when converting unibyte text to multibyte. It also applies when
address@hidden inserts a character in the unibyte
address@hidden range, 128 through 255. However, the functions
address@hidden and @code{insert-char} do not perform this conversion.
-
-The right value to use to select character set @var{cs} is @code{(-
-(make-char @var{cs}) 128)}. If the value of
address@hidden is zero, then conversion actually uses the
-value for the Latin 1 character set, rather than zero.
address@hidden defvar
-
address@hidden nonascii-translation-table
-This variable provides a more general alternative to
address@hidden You can use it to specify independently
-how to translate each code in the range of 128 through 255 into a
-multibyte character. The value should be a char-table, or @code{nil}.
-If this is address@hidden, it overrides @code{nonascii-insert-offset}.
address@hidden defvar
+unchanged, and converts bytes with codes 128 through 159 to the
+multibyte representation of raw eight-bit bytes.
-The next three functions either return the argument @var{string}, or a
-newly created string with no text properties.
+ Converting multibyte text to unibyte converts all @acronym{ASCII}
+and eight-bit characters to their single-byte form, but loses
+information for address@hidden characters by discarding all but
+the low 8 bits of each character's codepoint. Converting unibyte text
+to multibyte and back to unibyte reproduces the original unibyte text.
address@hidden string-make-unibyte string
-This function converts the text of @var{string} to unibyte
-representation, if it isn't already, and returns the result. If
address@hidden is a unibyte string, it is returned unchanged. Multibyte
-character codes are converted to unibyte according to
address@hidden or, if that is @code{nil}, using
address@hidden If the lookup in the translation table
-fails, this function takes just the low 8 bits of each character.
address@hidden defun
-
address@hidden string-make-multibyte string
-This function converts the text of @var{string} to multibyte
-representation, if it isn't already, and returns the result. If
address@hidden is a multibyte string or consists entirely of
address@hidden characters, it is returned unchanged. In particular,
-if @var{string} is unibyte and entirely @acronym{ASCII}, the returned
-string is unibyte. (When the characters are all @acronym{ASCII},
-Emacs primitives will treat the string the same way whether it is
-unibyte or multibyte.) If @var{string} is unibyte and contains
address@hidden characters, the function
address@hidden is used to convert each unibyte
-character to a multibyte character.
address@hidden defun
+The next two functions either return the argument @var{string}, or a
+newly created string with no text properties.
@defun string-to-multibyte string
This function returns a multibyte string containing the same sequence
-of character codes as @var{string}. Unlike
address@hidden, this function unconditionally returns a
-multibyte string. If @var{string} is a multibyte string, it is
-returned unchanged.
+of characters as @var{string}. If @var{string} is a multibyte string,
+it is returned unchanged.
address@hidden defun
+
address@hidden string-to-unibyte string
+This function returns a unibyte string containing the same sequence of
+characters as @var{string}. It signals an error if @var{string}
+contains a address@hidden character. If @var{string} is a
+unibyte string, it is returned unchanged.
@end defun
@defun multibyte-char-to-unibyte char
This convert the multibyte character @var{char} to a unibyte
-character, based on @code{nonascii-translation-table} and
address@hidden
+character. If @var{char} is a address@hidden character, the
+value is -1.
@end defun
@defun unibyte-char-to-multibyte char
This convert the unibyte character @var{char} to a multibyte
-character, based on @code{nonascii-translation-table} and
address@hidden
+character.
@end defun
@node Selecting a Representation
@@ -270,13 +227,13 @@
is @code{nil}, the buffer becomes unibyte.
This function leaves the buffer contents unchanged when viewed as a
-sequence of bytes. As a consequence, it can change the contents viewed
-as characters; a sequence of two bytes which is treated as one character
-in multibyte representation will count as two characters in unibyte
-representation. Character codes 128 through 159 are an exception. They
-are represented by one byte in a unibyte buffer, but when the buffer is
-set to multibyte, they are converted to two-byte sequences, and vice
-versa.
+sequence of bytes. As a consequence, it can change the contents
+viewed as characters; a sequence of three bytes which is treated as
+one character in multibyte representation will count as three
+characters in unibyte representation. Eight-bit characters
+representing raw bytes are an exception. They are represented by one
+byte in a unibyte buffer, but when the buffer is set to multibyte,
+they are converted to two-byte sequences, and vice versa.
This function sets @code{enable-multibyte-characters} to record which
representation is in use. It also adjusts various data in the buffer
@@ -291,26 +248,26 @@
@defun string-as-unibyte string
This function returns a string with the same bytes as @var{string} but
treating each byte as a character. This means that the value may have
-more characters than @var{string} has.
+more characters than @var{string} has. Eight-bit characters
+representing raw bytes are an exception: each one of them is converted
+to a single byte.
If @var{string} is already a unibyte string, then the value is
@var{string} itself. Otherwise it is a newly created string, with no
-text properties. If @var{string} is multibyte, any characters it
-contains of charset @code{eight-bit-control} or @code{eight-bit-graphic}
-are converted to the corresponding single byte.
+text properties.
@end defun
@defun string-as-multibyte string
This function returns a string with the same bytes as @var{string} but
-treating each multibyte sequence as one character. This means that the
-value may have fewer characters than @var{string} has.
+treating each multibyte sequence as one character. This means that
+the value may have fewer characters than @var{string} has. If a byte
+sequence in @var{string} is invalid as a multibyte representation of a
+single character, each byte in the sequence is treated as a raw 8-bit
+byte.
If @var{string} is already a multibyte string, then the value is
@var{string} itself. Otherwise it is a newly created string, with no
-text properties. If @var{string} is unibyte and contains any individual
-8-bit bytes (i.e.@: not part of a multibyte form), they are converted to
-the corresponding multibyte character of charset @code{eight-bit-control}
-or @code{eight-bit-graphic}.
+text properties.
@end defun
@node Character Codes
@@ -320,13 +277,13 @@
The unibyte and multibyte text representations use different
character codes. The valid character codes for unibyte representation
range from 0 to 255---the values that can fit in one byte. The valid
-character codes for multibyte representation range from 0 to 4194303,
-but not all values in that range are valid. The values 128 through
-255 do not usually show up in multibyte text, but they can occur if
-you do explicit encoding and decoding (@pxref{Explicit Encoding}).
-Some other character codes cannot occur at all in multibyte text.
-Only the @acronym{ASCII} codes 0 through 127 are completely legitimate
-in both representations.
+character codes for multibyte representation range from 0 to 4194303
+(#x3FFFFF). In this code space, values 0 through 127 are for
address@hidden characters, and values 128 through 4194175 (#x3FFF7F)
+are for address@hidden characters. Values 0 through 1114111
+(#x10FFFF) correspond to Unicode characters of the same codepoint,
+while values 4194176 (#x3FFF80) through 4194303 (#x3FFFFF) are for
+representing eight-bit raw bytes.
@defun characterp charcode
This returns @code{t} if @var{charcode} is a valid character, and
@@ -335,8 +292,6 @@
@example
(characterp 65)
@result{} t
-(characterp 256)
- @result{} nil
(characterp 4194303)
@result{} t
(characterp 4194304)
@@ -344,27 +299,45 @@
@end example
@end defun
address@hidden get-byte pos &optional string
+This function returns the byte at current buffer's character position
address@hidden If the current buffer is unibyte, this is literally the
+byte at that position. If the buffer is multibyte, byte values of
address@hidden characters are the same as character codepoints,
+whereas eight-bit raw bytes are converted to their 8-bit codes. The
+function signals an error if the character at @var{pos} is
address@hidden
+
+The optional argument @var{string} means to get a byte value from that
+string instead of the current buffer.
address@hidden defun
+
@node Character Sets
@section Character Sets
@cindex character sets
- Emacs classifies characters into various @dfn{character sets}, each of
-which has a name which is a symbol. Each character belongs to one and
-only one character set.
-
- In general, there is one character set for each distinct script. For
-example, @code{latin-iso8859-1} is one character set,
address@hidden is another, and @code{ascii} is another. An
-Emacs character set can hold at most 9025 characters; therefore, in some
-cases, characters that would logically be grouped together are split
-into several character sets. For example, one set of Chinese
-characters, generally known as Big 5, is divided into two Emacs
-character sets, @code{chinese-big5-1} and @code{chinese-big5-2}.
-
- @acronym{ASCII} characters are in character set @code{ascii}. The
address@hidden characters 128 through 159 are in character set
address@hidden, and codes 160 through 255 are in character set
address@hidden
address@hidden charset
address@hidden coded character set
+An Emacs @dfn{character set}, or @dfn{charset}, is a set of characters
+in which each character is assigned a numeric code point. (The
+Unicode standard calls this a @dfn{coded character set}.) Each
+charset has a name which is a symbol. A single character can belong
+to any number of different character sets, but it will generally have
+a different code point in each charset. Examples of character sets
+include @code{ascii}, @code{iso-8859-1}, @code{greek-iso8859-7}, and
address@hidden The code point assigned to a character in a
+charset is usually different from its code point used in Emacs buffers
+and strings.
+
address@hidden @code{emacs}, a charset
address@hidden @code{unicode}, a charset
address@hidden @code{eight-bit}, a charset
+ Emacs defines several special character sets. The character set
address@hidden includes all the characters whose Emacs code points are
+in the range @code{0..10FFFF}. The character set @code{emacs}
+includes all @acronym{ASCII} and address@hidden characters.
+Finally, the @code{eight-bit} charset includes the 8-bit raw bytes;
+Emacs uses it to represent raw bytes encountered in text.
@defun charsetp object
Returns @code{t} if @var{object} is a symbol that names a character set,
@@ -375,22 +348,38 @@
The value is a list of all defined character set names.
@end defvar
address@hidden charset-list
-This function returns the value of @code{charset-list}. It is only
-provided for backward compatibility.
address@hidden charset-priority-list &optional highestp
+This function returns a list of all defined character sets ordered by
+their priority. If @var{highestp} is address@hidden, the function
+returns a single character set of the highest priority.
address@hidden defun
+
address@hidden set-charset-priority &rest charsets
+This function makes @var{charsets} the highest priority character sets.
@end defun
@defun char-charset character
-This function returns the name of the character set that @var{character}
-belongs to, or the symbol @code{unknown} if @var{character} is not a
-valid character.
+This function returns the name of the character set of highest
+priority that @var{character} belongs to. @acronym{ASCII} characters
+are an exception: for them, this function always returns @code{ascii}.
@end defun
@defun charset-plist charset
-This function returns the charset property list of the character set
address@hidden Although @var{charset} is a symbol, this is not the same
-as the property list of that symbol. Charset properties are used for
-special purposes within Emacs.
+This function returns the property list of the character set
address@hidden Although @var{charset} is a symbol, this is not the
+same as the property list of that symbol. Charset properties include
+important information about the charset, such as its documentation
+string, short name, etc.
address@hidden defun
+
address@hidden put-charset-property charset propname value
+This function sets the @var{propname} property of @var{charset} to the
+given @var{value}.
address@hidden defun
+
address@hidden get-charset-property charset propname
+This function returns the value of @var{charset}'s property
address@hidden
@end defun
@deffn Command list-charset-chars charset
@@ -398,87 +387,21 @@
@var{charset}.
@end deffn
address@hidden Chars and Bytes
address@hidden Characters and Bytes
address@hidden bytes and characters
-
address@hidden introduction sequence (of character)
address@hidden dimension (of character set)
- In multibyte representation, each character occupies one or more
-bytes. Each character set has an @dfn{introduction sequence}, which is
-normally one or two bytes long. (Exception: the @code{ascii} character
-set and the @code{eight-bit-graphic} character set have a zero-length
-introduction sequence.) The introduction sequence is the beginning of
-the byte sequence for any character in the character set. The rest of
-the character's bytes distinguish it from the other characters in the
-same character set. Depending on the character set, there are either
-one or two distinguishing bytes; the number of such bytes is called the
address@hidden of the character set.
-
address@hidden charset-dimension charset
-This function returns the dimension of @var{charset}; at present, the
-dimension is always 1 or 2.
address@hidden defun
-
address@hidden charset-bytes charset
-This function returns the number of bytes used to represent a character
-in character set @var{charset}.
address@hidden decode-char charset code-point
+This function decodes a character that is assigned a @var{code-point}
+in @var{charset}, to the corresponding Emacs character, and returns
+that character. If @var{charset} doesn't contain a character of that
+code point, the value is @code{nil}. If @var{code-point} doesn't fit
+in a Lisp integer (@pxref{Integer Basics, most-positive-fixnum}), it
+can be specified as a cons cell @code{(@var{high} . @var{low})}, where
address@hidden are the lower 16 bits of the value and @var{high} are the
+high 16 bits.
@end defun
- This is the simplest way to determine the byte length of a character
-set's introduction sequence:
-
address@hidden
-(- (charset-bytes @var{charset})
- (charset-dimension @var{charset}))
address@hidden example
-
address@hidden Splitting Characters
address@hidden Splitting Characters
address@hidden character as bytes
-
- The functions in this section convert between characters and the byte
-values used to represent them. For most purposes, there is no need to
-be concerned with the sequence of bytes used to represent a character,
-because Emacs translates automatically when necessary.
-
address@hidden split-char character
-Return a list containing the name of the character set of
address@hidden, followed by one or two byte values (integers) which
-identify @var{character} within that character set. The number of byte
-values is the character set's dimension.
-
-If @var{character} is invalid as a character code, @code{split-char}
-returns a list consisting of the symbol @code{unknown} and @var{character}.
-
address@hidden
-(split-char 2248)
- @result{} (latin-iso8859-1 72)
-(split-char 65)
- @result{} (ascii 65)
-(split-char 128)
- @result{} (eight-bit-control 128)
address@hidden example
address@hidden defun
-
address@hidden FIXME: update split-char and make-char
address@hidden generate characters in charsets
address@hidden make-char charset &optional code1 code2
-This function returns the character in character set @var{charset} whose
-position codes are @var{code1} and @var{code2}. This is roughly the
-inverse of @code{split-char}. Normally, you should specify either one
-or both of @var{code1} and @var{code2} according to the dimension of
address@hidden For example,
-
address@hidden
-(make-char 'latin-iso8859-1 72)
- @result{} 2248
address@hidden example
-
-Actually, the eighth bit of both @var{code1} and @var{code2} is zeroed
-before they are used to index @var{charset}. Thus you may use, for
-instance, an ISO 8859 character code rather than subtracting 128, as
-is necessary to index the corresponding Emacs charset.
address@hidden encode-char char charset
+This function returns the code point assigned to the character
address@hidden in @var{charset}. If @var{charset} doesn't contain
address@hidden, the value is @code{nil}.
@end defun
@node Scanning Charsets
@@ -490,15 +413,16 @@
of the text in question.
@defun charset-after &optional pos
-This function return the charset of a character in the current buffer
-at position @var{pos}. If @var{pos} is omitted or @code{nil}, it
-defaults to the current value of point. If @var{pos} is out of range,
-the value is @code{nil}.
+This function returns the charset of highest priority containing the
+character in the current buffer at position @var{pos}. If @var{pos}
+is omitted or @code{nil}, it defaults to the current value of point.
+If @var{pos} is out of range, the value is @code{nil}.
@end defun
@defun find-charset-region beg end &optional translation
-This function returns a list of the character sets that appear in the
-current buffer between positions @var{beg} and @var{end}.
+This function returns a list of the character sets of highest priority
+that contain characters in the current buffer between positions
address@hidden and @var{end}.
The optional argument @var{translation} specifies a translation table to
be used in scanning the text (@pxref{Translation of Characters}). If it
@@ -508,10 +432,10 @@
@end defun
@defun find-charset-string string &optional translation
-This function returns a list of the character sets that appear in the
-string @var{string}. It is just like @code{find-charset-region}, except
-that it applies to the contents of @var{string} instead of part of the
-current buffer.
+This function returns a list of the character sets of highest priority
+that contain characters in @var{string}. It is just like
address@hidden, except that it applies to the contents of
address@hidden instead of part of the current buffer.
@end defun
@node Translation of Characters
@@ -519,19 +443,17 @@
@cindex character translation tables
@cindex translation tables
- A @dfn{translation table} is a char-table that specifies a mapping
-of characters into characters. These tables are used in encoding and
-decoding, and for other purposes. Some coding systems specify their
-own particular translation tables; there are also default translation
-tables which apply to all other coding systems.
-
- For instance, the coding-system @code{utf-8} has a translation table
-that maps characters of various charsets (e.g.,
address@hidden@var{x}}) into Unicode character sets. This way,
-it can encode Latin-2 characters into UTF-8. Meanwhile,
address@hidden operates by specifying
address@hidden to translate
address@hidden characters into corresponding Unicode characters.
+ A @dfn{translation table} is a char-table (@pxref{Char-Tables}) that
+specifies a mapping of characters into characters. These tables are
+used in encoding and decoding, and for other purposes. Some coding
+systems specify their own particular translation tables; there are
+also default translation tables which apply to all other coding
+systems.
+
+ A translation table has two extra slots. The first is either
address@hidden or a translation table that performs the reverse
+translation; the second is the maximum number of characters to look up
+for translation.
@defun make-translation-table &rest translations
This function returns a translation table based on the argument
@@ -545,33 +467,65 @@
@var{to-alt}.
@end defun
- In decoding, the translation table's translations are applied to the
-characters that result from ordinary decoding. If a coding system has
-property @code{translation-table-for-decode}, that specifies the
-translation table to use. (This is a property of the coding system,
-as returned by @code{coding-system-get}, not a property of the symbol
-that is the coding system's name. @xref{Coding System Basics,, Basic
-Concepts of Coding Systems}.) Otherwise, if
address@hidden is address@hidden,
-decoding uses that table.
-
- In encoding, the translation table's translations are applied to the
-characters in the buffer, and the result of translation is actually
-encoded. If a coding system has property
address@hidden, that specifies the translation
-table to use. Otherwise the variable
address@hidden specifies the translation
-table.
+ During decoding, the translation table's translations are applied to
+the characters that result from ordinary decoding. If a coding system
+has property @code{:decode-translation-table}, that specifies the
+translation table to use, or a list of translation tables to apply in
+sequence. (This is a property of the coding system, as returned by
address@hidden, not a property of the symbol that is the
+coding system's name. @xref{Coding System Basics,, Basic Concepts of
+Coding Systems}.) Finally, if
address@hidden is address@hidden, the
+resulting characters are translated by that table.
+
+ During encoding, the translation table's translations are applied to
+the characters in the buffer, and the result of translation is
+actually encoded. If a coding system has property
address@hidden:encode-translation-table}, that specifies the translation table
+to use, or a list of translation tables to apply in sequence. In
+addition, if the variable @code{standard-translation-table-for-encode}
+is address@hidden, it specifies the translation table to use for
+translating the result.
@defvar standard-translation-table-for-decode
-This is the default translation table for decoding, for
-coding systems that don't specify any other translation table.
+This is the default translation table for decoding. If a coding
+system specifies its own translation tables, the table that is the
+value of this variable, if address@hidden, is applied after them.
@end defvar
@defvar standard-translation-table-for-encode
-This is the default translation table for encoding, for
-coding systems that don't specify any other translation table.
address@hidden defvar
+This is the default translation table for encoding. If a coding
+system specifies its own translation tables, the table that is the
+value of this variable, if address@hidden, is applied after them.
address@hidden defvar
+
address@hidden make-translation-table-from-vector vec
+This function returns a translation table made from @var{vec} that is
+an array of 256 elements to map byte values 0 through 255 to
+characters. Elements may be @code{nil} for untranslated bytes. The
+returned table has a translation table for reverse mapping in the
+first extra slot.
+
+This function provides an easy way to make a private coding system
+that maps each byte to a specific character. You can specify the
+returned table and the reverse translation table using the properties
address@hidden:decode-translation-table} and @code{:encode-translation-table}
+respectively in the @var{props} argument to
address@hidden
address@hidden defun
+
address@hidden make-translation-table-from-alist alist
+This function is similar to @code{make-translation-table} but returns
+a complex translation table rather than a simple one-to-one mapping.
+Each element of @var{alist} is of the form @code{(@var{from}
+. @var{to})}, where @var{from} and @var{to} are either a character or
+a vector specifying a sequence of characters. If @var{from} is a
+character, that character is translated to @var{to} (i.e.@: to a
+character or a character sequence). If @var{from} is a vector of
+characters, that sequence is translated to @var{to}. The returned
+table has a translation table for reverse mapping in the first extra
+slot.
address@hidden defun
@node Coding Systems
@section Coding Systems
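
As a rough illustration of the character-code layout and the conversion
functions described in the patch above, here is a minimal Emacs Lisp
sketch. It assumes Emacs 23 or later. The helper `my-char-code-class'
is a hypothetical name invented for this example and is not part of
Emacs; the other functions are the ones the patch documents.

```elisp
;; Sketch only: classify a character code by the ranges given in the
;; patch text.  `my-char-code-class' is a hypothetical helper.
(defun my-char-code-class (code)
  "Return a symbol naming the range that integer CODE falls in."
  (cond ((not (characterp code)) 'invalid)   ; outside 0..#x3FFFFF
        ((< code #x80) 'ascii)               ; 0..127
        ((<= code #x10FFFF) 'unicode)        ; rest of the Unicode codespace
        ((<= code #x3FFF7F) 'non-unified)    ; not unified with Unicode
        (t 'raw-byte)))                      ; #x3FFF80..#x3FFFFF

(my-char-code-class ?A)        ; ascii
(my-char-code-class #x3B1)     ; unicode
(my-char-code-class #x3FFFC8)  ; raw-byte
(my-char-code-class 4194304)   ; invalid

;; A non-ASCII unibyte value maps to a raw-byte character and back.
(unibyte-char-to-multibyte 200)       ; 4194248, i.e. #x3FFFC8
(multibyte-char-to-unibyte #x3FFFC8)  ; 200
;; A character that is neither ASCII nor a raw byte yields -1.
(multibyte-char-to-unibyte #x3B1)     ; -1

;; Moving between a charset's code points and Emacs characters.
(decode-char 'unicode #x3B1)   ; 945
(encode-char ?A 'ascii)        ; 65
(encode-char #x3B1 'ascii)     ; nil, since ascii lacks this character
(char-charset ?A)              ; ascii
```

The classifier tests the ranges from low to high, so each clause only
needs an upper bound; `characterp' rejects everything outside the
22-bit code space first.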