Re: decode-coding-string on invalid UTF-8 string isn't rejected
From: Kenichi Handa
Subject: Re: decode-coding-string on invalid UTF-8 string isn't rejected
Date: Wed, 12 Mar 2003 09:51:19 +0900 (JST)
In article <address@hidden>, Simon Josefsson <address@hidden> writes:
> I'm trying to use decode-coding-string to "guess" charsets, and
> noticed this:
> (decode-coding-string "r\xe4k" 'latin-1)
> => "räk"
> (decode-coding-string "r\xe4k" 'utf-8)
> => "r"
> Wouldn't it be more appropriate if it returned nil (like
> `decode-char') or "rk"?
I've just fixed it to return "r\xe4k"; that is, invalid
8-bit bytes are now decoded into eight-bit-control or
eight-bit-graphic characters, as in the other cases.
Please try the latest CVS HEAD.
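To make the before/after contrast concrete, here is a minimal sketch of the two calls from the original report, with the second result being the fixed behavior described above (the byte \xe4 is valid Latin-1 but an invalid UTF-8 sequence):

```elisp
;; Valid Latin-1: the byte 0xE4 decodes to a-umlaut.
(decode-coding-string "r\xe4k" 'latin-1)
;; => "räk"

;; Invalid UTF-8: after the fix, the stray byte is kept as an
;; eight-bit character instead of being silently dropped.
(decode-coding-string "r\xe4k" 'utf-8)
;; => "r\xe4k"
```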
> Perhaps I'm looking in the wrong place though. Is there an Elisp
> function that takes a unibyte string and decodes it using whatever the
> default (process) coding system priorities may be? I.e., for me that
> runs in a UTF-8 locale, first try decoding as utf-8, if it fails,
> continue with Latin-1, etc.
(decode-coding-string UNIBYTE_STRING 'undecided) should do
what you want.
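A short sketch of that approach (the actual detection order is not fixed here; it follows the priority list of the user's language environment, which can be adjusted as shown):

```elisp
;; `undecided' asks Emacs to detect the coding system using the
;; current coding-system priority list.
(decode-coding-string "r\xe4k" 'undecided)

;; To make UTF-8 the first candidate in detection, raise its
;; priority beforehand:
(prefer-coding-system 'utf-8)
```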
---
Ken'ichi HANDA
address@hidden