Re: [bug-gnu-libiconv] iconv Bug report
From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] iconv Bug report
Date: Tue, 19 Jun 2007 00:35:17 +0200
User-agent: KMail/1.5.4

Dear 刘,
> I find a bug at libiconv..if convert GBK to UTF-8 or UCS-2 with
> libiconv, probably will get error text.
>
> Example: a GBK encoding text “0xa3 0xa0 0xb0 0xa1”
The byte sequence 0xa3 0xa0 is not valid GBK.
To find out the encoding of this byte sequence, you can unpack a libiconv
distribution; its tests/ directory contains the conversion tables
for most supported character sets. When I do
$ cd tests
$ grep ^0xA3A0 *.TXT
I obtain the result:
CP949.TXT:0xA3A0 0xC9DB
GB18030-BMP.TXT:0xA3A0 0xE5E5
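
The same lookup can be sketched in Python, for cases where grep is not at hand. This is only an illustration: the function name find_tables is hypothetical, and it assumes that each table row begins with the multibyte code written as "0x" followed by uppercase hex digits, as in the two result lines above.

```python
from pathlib import Path


def find_tables(tests_dir, byte_seq):
    """Return (file name, line) pairs for table rows that start with byte_seq.

    Mirrors `grep ^0xA3A0 *.TXT`: a prefix match at the start of each line,
    so longer codes that begin with the same bytes are reported as well.
    """
    # Assumption: rows start with the code as "0x" + uppercase hex digits.
    prefix = "0x" + bytes(byte_seq).hex().upper()
    hits = []
    for table in sorted(Path(tests_dir).glob("*.TXT")):
        # errors="replace" so that stray non-ASCII comment bytes do not abort the scan.
        with table.open(encoding="ascii", errors="replace") as fh:
            for line in fh:
                if line.startswith(prefix):
                    hits.append((table.name, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Run from the top of an unpacked libiconv distribution.
    for name, line in find_tables("tests", [0xA3, 0xA0]):
        print(f"{name}:{line}")
```

Run against the tables of the distribution used above, it would be expected to report the same two files as the grep command.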
This means that 0xa3 0xa0 is valid in CP949 - but this is Korean, hence
not your case - and valid in GB18030. So, if you specify "GB18030" instead of
"GBK", it should work.
For more details about Chinese character sets, see
http://www.haible.de/bruno/charsets/conversion-tables/Chinese.html
For advice regarding labelling of text, see
http://www.haible.de/bruno/charsets/advice.html
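
To check quickly which encoding label accepts a given byte sequence, one can try each candidate in turn. The sketch below uses Python's built-in codecs rather than libiconv, so it only illustrates the idea: the two implementations use separate tables and may not agree on every byte sequence. The helper name try_decodings is hypothetical.

```python
def try_decodings(data, encodings):
    """Map each encoding label to the decoded text, or None if decoding fails."""
    results = {}
    for label in encodings:
        try:
            results[label] = data.decode(label)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: bytes invalid in this encoding.
            # LookupError: Python does not know this encoding label.
            results[label] = None
    return results


if __name__ == "__main__":
    sample = bytes([0xA3, 0xA0, 0xB0, 0xA1])  # the bytes from the report
    for label, text in try_decodings(sample, ["gbk", "gb18030"]).items():
        print(label, "->", "invalid" if text is None else repr(text))
```

A label that yields None here is a hint that the text is mislabelled, which is the situation described above for "GBK".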
> I’m from Chinese and poor English,so I can’t write detailed.
Your English is understandable, no problem. Maybe a dictionary, or a
translation tool like
http://babelfish.altavista.com/
http://www.google.de/language_tools
can help you become more expressive in English.
Bruno