bug-gnu-libiconv

Re: [bug-gnu-libiconv] iconv Bug report


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] iconv Bug report
Date: Tue, 19 Jun 2007 00:35:17 +0200
User-agent: KMail/1.5.4

Dear 刘,

>          I find a bug at libiconv..if convert GBK to UTF-8 or UCS-2 with
> libiconv, probably will get error text.
> 
>          Example: a GBK encoding text “0xa3 0xa0 0xb0 0xa1”

The byte sequence 0xa3 0xa0 is not valid GBK.

To find out which encodings accept this byte sequence, you can unpack a
libiconv distribution; in the tests/ directory you will find the conversion
tables for most supported character sets. When I do

   $ cd tests
   $ grep ^0xA3A0 *.TXT

I obtain the result:

   CP949.TXT:0xA3A0        0xC9DB
   GB18030-BMP.TXT:0xA3A0  0xE5E5

This means that 0xa3 0xa0 is valid in CP949 - but this is Korean, hence
not your case - and valid in GB18030. So, if you specify "GB18030" instead
of "GBK", it should work.

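As a rough sketch of that suggestion, assuming an iconv command-line program
is installed, one can try both encoding names on the four bytes from the
report. The helper name decode_as is made up for illustration. Whether the
GBK attempt fails depends on the iconv implementation in use; with GNU
libiconv it should, for the reason given above.

```shell
#!/bin/sh
# decode_as ENCODING: convert stdin from ENCODING to UTF-8.
# (Hypothetical helper name; it is just a thin wrapper around iconv.)
decode_as() {
    iconv -f "$1" -t UTF-8
}

# The bytes 0xa3 0xa0 0xb0 0xa1 from the report, written as octal escapes
# because printf '\xHH' is not portable across POSIX shells.
for enc in GBK GB18030; do
    if out=$(printf '\243\240\260\241' | decode_as "$enc" 2>/dev/null); then
        # Show the resulting UTF-8 bytes in hex rather than as glyphs.
        printf '%s: converted, UTF-8 bytes:' "$enc"
        printf '%s' "$out" | od -An -tx1
    else
        printf '%s: conversion failed\n' "$enc"
    fi
done
```

The output is dumped in hex because the first character comes out in the
Private Use Area (0xE5E5 in the table above) and most fonts will not display
it.
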
For more details about Chinese character sets, see
     http://www.haible.de/bruno/charsets/conversion-tables/Chinese.html
For advice regarding labelling of text, see
     http://www.haible.de/bruno/charsets/advice.html

>          I’m from Chinese and poor English,so I can’t write detailed. 

Your English is understandable, no problem. Maybe a dictionary, or a
translation tool like
     http://babelfish.altavista.com/
     http://www.google.de/language_tools
can help you become more expressive in English.

Bruno




