[bug-gnu-libiconv] Support question: libiconv on system with glibc?


From: Russell McOrmond
Subject: [bug-gnu-libiconv] Support question: libiconv on system with glibc?
Date: Wed, 4 Feb 2009 12:55:21 -0500 (EST)
User-agent: Alpine 2.00 (LFD 1167 2008-08-23)


I have an environment where I would like to move as much of our application as possible into a chroot() environment. We figured that using the separate libiconv would help, so that we would not need to bring all of glibc's conversion support (i.e. /usr/lib/gconv, etc.) into the chroot() environment.

I have been having a problem getting libiconv to work in this environment. This is a Red Hat Enterprise Linux 4 machine (glibc 2.3.4), and I am trying to compile libiconv 1.12.

I've tried linking the application ( http://mapserver.org/ ) against libiconv, and I get different characters than I expect. To isolate the issue I compiled the application against the glibc iconv and then tried preloading preloadable_libiconv.so, which was built using a simple `./configure ; make ; make check` (the check indicates all is fine).

checking build system type... i686-pc-linux-gnu
...
checking byte ordering... little endian



We have data that is encoded in UTF-16, which we are outputting in UTF-8 (a very simple transcode) and inserting into an HTML template.

The relevant part should be output in UTF-8 as:

<td>Chernozémique</td>

(Note the accented e)

Here is the test using 'od' to show the UTF-8 encoding when using the glibc version of the iconv functions.

-bash-3.00$ sh ~/test-mapserv.sh | od -c -j247 -N23
0000367   <   t   d   >   C   h   e   r   n   o   z 303 251   m   i   q
0000407   u   e   <   /   t   d   >
0000416
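
As a sanity check on the dump above, here is a small sketch in Python (not part of mapserver or libiconv; the variable names are only illustrative). It encodes the expected string as UTF-8 and prints the bytes roughly the way `od -c` does, with non-ASCII bytes in three-digit octal.

```python
# Expected HTML fragment, taken from the message above.
expected = "<td>Chernozémique</td>"
utf8 = expected.encode("utf-8")

# Mimic od -c: ASCII bytes as characters, other bytes as 3-digit octal.
od_like = [chr(b) if b < 0x80 else format(b, "03o") for b in utf8]

print(len(utf8))          # number of bytes, to compare with the -N value
print(" ".join(od_like))  # the accented e should appear as two octal bytes
```

The byte count matches the `-N23` used above, and the accented e comes out as the pair 303 251, so the glibc output is the correct UTF-8.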

And here is what happens when I use the libiconv version.

-bash-3.00$ export LD_PRELOAD=/server/downloads/src/libiconv-1.12/lib/preloadable_libiconv.so
-bash-3.00$ sh ~/test-mapserv.sh | od -c -j247 -N48
0000367   <   t   d   > 344 214 200 346 240 200 346 224 200 347 210 200
0000407 346 270 200 346 274 200 347 250 200 356 244 200 346 264 200 346
0000427 244 200 347 204 200 347 224 200 346 224 200   <   /   t   d   >
0000447
-bash-3.00$


Does this type of problem seem familiar? Does the 3-byte octal sequence 346 224 200 representing an 'e' look familiar? (If I group the bytes into threes, I see the two e's in the third and last groups.) Does an encoding that uses 3 bytes per character, always ending in octal 200 (decimal 128), seem familiar? Is something byte-swapped the wrong way?
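
One way to test the byte-swap suspicion is to decode the bytes from the libiconv dump by hand. The following is a rough sketch in Python, with illustrative variable names. It treats the bytes between `<td>` and `</td>` as UTF-8, then swaps the two bytes of each resulting 16-bit code point.

```python
# Octal bytes copied from the od output above (between <td> and </td>).
garbled_octal = """
344 214 200 346 240 200 346 224 200 347 210 200
346 270 200 346 274 200 347 250 200 356 244 200 346 264 200 346
244 200 347 204 200 347 224 200 346 224 200
"""
garbled = bytes(int(tok, 8) for tok in garbled_octal.split())

# Decode as UTF-8 to get the code points that were actually emitted.
code_points = [ord(ch) for ch in garbled.decode("utf-8")]

# Swap the high and low byte of each 16-bit code point.
swapped = "".join(chr(((cp & 0xFF) << 8) | (cp >> 8)) for cp in code_points)

print([hex(cp) for cp in code_points[:3]])
print(swapped)
```

Each 3-byte group decodes to a code point of the form 0xNN00, which is why every group ends in octal 200. Swapping the bytes recovers the expected word, which is consistent with the UTF-16 input being read in the opposite byte order.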

Is there something special I need to do when building libiconv to ensure various character encodings are enabled? Is there a directory equivalent to gconv that I need to be installing and pointing to with some configuration variable/file?


Is there some difference between the glibc and libiconv functions that could expose a bug in the application (mapserver) with one library, but not the other?


In case anyone is curious how iconv is being called, the relevant code is here: http://trac.osgeo.org/mapserver/browser/trunk/mapserver/mapstring.c#L1504

The variable 'encoding' on input is set to "UTF-16", so this is a simple conversion from UTF-16 to UTF-8.
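
For illustration only, here is a Python sketch of how a UTF-16 to UTF-8 conversion can yield both of the outputs shown earlier from the same input bytes. It assumes (this is not confirmed anywhere in this message) that the stored data is little-endian UTF-16 without a byte order mark. It uses Python's own codecs, not iconv, so it shows only the encoding arithmetic and not the behaviour of either library.

```python
text = "Chernozémique"

# Assumption: the data on disk is UTF-16 little-endian with no BOM.
raw = text.encode("utf-16-le")

# Same bytes, interpreted with each explicit byte order, then sent out as UTF-8.
as_le = raw.decode("utf-16-le").encode("utf-8")
as_be = raw.decode("utf-16-be").encode("utf-8")

# With a BOM in front, a generic "utf-16" decoder has no ambiguity to resolve.
with_bom = b"\xff\xfe" + raw
from_bom = with_bom.decode("utf-16").encode("utf-8")

print(as_le)
print(" ".join(format(b, "03o") for b in as_be[:6]))
print(from_bom == as_le)
```

If a difference in default byte order is indeed the cause, naming the byte order explicitly (for example "UTF-16LE") in the iconv_open() call, or making sure the data carries a byte order mark, would be one thing to try.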

--
 Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
 Please help us tell the Canadian Parliament to protect our property
 rights as owners of Information Technology. Sign the petition!
 http://digital-copyright.ca/petition/ict/     http://KillBillC61.ca

 "The government, lobbied by legacy copyright holders and hardware
  manufacturers, can pry control over my camcorder, computer,
  home theatre, or portable media player from my cold dead hands!"
