From: Ian Eure
Subject: Re: replace placeholders
Date: Sun, 1 Mar 2009 09:55:22 -0800
On Feb 28, 2009, at 11:18 PM, henry atting wrote:
I have a file which was converted from DOS to Unix and from Latin-1 to UTF-8. Now it is speckled with all these placeholders (\226) for non-presentable signs.
It sounds like either your transcoding to UTF-8 is broken, or you're viewing the file with the wrong encoding.
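Read as decimal, \226 is 0xE2, which is LATIN SMALL LETTER A WITH CIRCUMFLEX in ISO-8859-1, so if that was present in the input it should have been converted to the two bytes 0xC3 0xA2. (If your editor is showing an octal escape, \226 is instead byte 0x96, which is EN DASH in Windows-1252 and a control character in ISO-8859-1; that would suggest the source was not really Latin-1.)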
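To check the 0xE2 case yourself, here is a minimal sketch in Python (my choice of language, nothing from your setup). It only assumes the byte really is 0xE2 and that the source encoding really is ISO-8859-1:

```python
import unicodedata

# Assumption: the original file contained the single byte 0xE2
# and was encoded as ISO-8859-1 (Latin-1).
latin1_byte = bytes([0xE2])

# Decode it with the source encoding, then re-encode as UTF-8.
char = latin1_byte.decode("iso-8859-1")
utf8_bytes = char.encode("utf-8")

print(unicodedata.name(char))
print(" ".join(f"0x{b:02X}" for b in utf8_bytes))
```

A correct Latin-1 to UTF-8 conversion turns that one byte into two, so a lone 0xE2 left in the output would mean the conversion did not do its job.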
Alternatively, it could be the start of a UTF-8-encoded code point from the General Punctuation block (e.g. curly quotes), all of which encode to three bytes starting with 0xE2. This would point to your editor reading the file with the wrong encoding.
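The second possibility can be illustrated the same way. This is only a sketch in Python; the sample characters are ones I picked for illustration, not ones known to be in your file:

```python
import unicodedata

# A few code points from the General Punctuation block (U+2000..U+206F),
# chosen for illustration only.
samples = ["\u2018", "\u2019", "\u201c", "\u201d", "\u2013"]

for ch in samples:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"0x{b:02X}" for b in encoded)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {hex_bytes}")

# Check the whole block: every code point encodes to three bytes,
# and the first byte is always 0xE2.
block_starts_with_e2 = all(
    len(chr(cp).encode("utf-8")) == 3 and chr(cp).encode("utf-8")[0] == 0xE2
    for cp in range(0x2000, 0x2070)
)
print(block_starts_with_e2)

# What a reader sees when correct UTF-8 is decoded as ISO-8859-1:
# the lead byte 0xE2 shows up as its own character, followed by
# two characters for the continuation bytes.
misread = "\u2019".encode("utf-8").decode("iso-8859-1")
print([f"U+{ord(c):04X}" for c in misread])
```

In this case the file itself is fine, and the stray characters disappear once the file is read as UTF-8.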
Either way, I don't think simply removing the characters is the correct solution.
- Ian