bug-gnu-libiconv

[bug-gnu-libiconv] iconv considers invalid UTF-8 sequence as valid


From: Ary Borenszweig
Subject: [bug-gnu-libiconv] iconv considers invalid UTF-8 sequence as valid
Date: Mon, 26 Sep 2016 13:45:22 -0300

Steps to reproduce:

1. Create a file containing exactly these 4 bytes: 247, 178, 187, 190 (0xF7 0xB2 0xBB 0xBE)

You can use this Ruby script for this:

~~~
# Open in binary mode so the 4 raw bytes are written unchanged
File.open("invalid.txt", "wb") do |file|
  file << "\xf7\xb2\xbb\xbe"
end
~~~

2. Execute `iconv -f UTF-8 -t UTF-8 invalid.txt`

Expected: iconv should say "cannot convert"
Actual: the conversion succeeds and the output contains the same bytes as the input (you can verify this by redirecting iconv's output to another file)

The first byte, 247 (0xF7), is not valid in UTF-8: the maximum allowed value for a lead byte is 0xF4 (244). Any larger lead byte is invalid because it would encode a value above the maximum code point, 0x10FFFF.
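
To make the arithmetic concrete, here is a minimal sketch that decodes the four bytes by bit manipulation alone, the way a decoder that skips range checks might. The helper `decode_four_byte_sequence` and the constant `MAX_CODE_POINT` are hypothetical names introduced for illustration; this is not how iconv is implemented.

```ruby
# Hypothetical helper (not part of iconv or Ruby's standard library):
# decodes a 4-byte UTF-8-style sequence by bit arithmetic only,
# without validating the lead byte or the resulting range.
def decode_four_byte_sequence(bytes)
  lead, *continuation = bytes
  # A 4-byte lead byte has the form 11110xxx; keep its low 3 bits.
  code_point = lead & 0x07
  # Each continuation byte has the form 10xxxxxx; append its low 6 bits.
  continuation.each { |byte| code_point = (code_point << 6) | (byte & 0x3F) }
  code_point
end

MAX_CODE_POINT = 0x10FFFF # highest code point UTF-8 is allowed to encode

code_point = decode_four_byte_sequence([247, 178, 187, 190])
puts format("decoded value: 0x%X", code_point)                 # decoded value: 0x1F2EFE
puts "exceeds 0x10FFFF: #{code_point > MAX_CODE_POINT}"        # exceeds 0x10FFFF: true
```

The decoded value 0x1F2EFE lies well above 0x10FFFF, which is why a strict UTF-8 decoder should reject the sequence instead of passing it through.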

For reference, we found this bug when using iconv in Crystal: https://github.com/crystal-lang/crystal/issues/3342#issuecomment-249452738
