Re: [h-e-w] Processing chars above \200
From: John J. Xenakis
Subject: Re: [h-e-w] Processing chars above \200
Date: Sun, 23 Sep 2018 15:26:10 -0400
Hi Eli,
> Then there's still something not right, because you shouldn't be
> having any of these problems with files that are consistently
> encoded.
> It shouldn't and it doesn't. Depending on what exactly is in your
> files, something that is still a bit of a mystery for me, Emacs
> could sometimes err if you don't tell it enough.
The particular file that triggered the original message was created
over a period of several months. During that period, text was typed,
and quotes were copied and pasted from various sources. And who
knows? Maybe one day I accidentally copied and pasted some errant
problem character. I assume that's what you're getting at.
Since Microsoft is only supplying updates once a month these days for
Windows 7, that usually means that Emacs is kept open for a month.
That means that the problem file, even if it contains an errant
character, still works fine. But when I reboot the system and reload
the text file, then that's when the problem arises. I think that's
what happened this time.
Since I could have inserted the errant character at any time in the
previous month, I have no memory of exactly what operation might
have caused the problem.
That's why I keep looking for the right regex that will find such characters
for me.
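For reference, the kind of search I mean can be sketched in Emacs Lisp
roughly as follows. This is only a sketch: the command name
`my-find-next-non-ascii` is made up for illustration, and it assumes that
any character outside the 7-bit ASCII range is a candidate. It flags every
8-bit character, so it does not by itself single out the errant one.

```elisp
;; Hypothetical helper, not an existing Emacs command.
;; Assumption: every character outside 7-bit ASCII is worth inspecting.
(defun my-find-next-non-ascii ()
  "Move point just past the next non-ASCII character after point."
  (interactive)
  ;; [:ascii:] is a standard Emacs regexp character class; negating it
  ;; matches any character that is not 7-bit ASCII.
  (if (re-search-forward "[^[:ascii:]]" nil t)
      (message "Found %S at position %d" (char-before) (1- (point)))
    (message "No non-ASCII characters after point")))
```

Running `M-x my-find-next-non-ascii` repeatedly steps through the
candidates one at a time, starting from point.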
> But in any case, there are commands to fix those errors right
> away, as soon as you realize something like that happens. We will
> get to that, once I understand more about the problem.
Could you tell me what those commands are?
> Is it possible that the file is encoded in UTF-16 or UTF-8? What
> happens if you visit the file like this:
> C-x RET c utf-8 RET C-x C-f FILENAME RET
> and similarly for utf-16? Does this fix the problem?
No, that makes no difference. This is definitely a 7/8-bit
ASCII/extended ASCII file.
By the way, how do I encode that keyboard string in Lisp? How does
one use "(universal-coding-system-argument CODING-SYSTEM)" in a macro?
> And how were those files created in the first place? I understood
> from your previous explanations that you created those files by
> copy-pasting from other applications, is that right?
As I described above.
> Can you post one such file, please? It is important that you post
> a file as a binary attachment, and it is also important to verify
> that the trick with Notepad and copy/paste works with the file you
> post.
> I'm quite sure this is caused by something very simple, because
> Notepad is certainly not smarter than Emacs wrt encodings.
OK, you can download the following:
http://jxenakis.com/gdgraphics/irbk-eeee-180923.zip
The enclosed .txt file causes all the issues that I've described.
I've replaced all the 7-bit letters with "e", because I don't
want to make the text public.
To make it easy for you to find some of the 8-bit characters causing
the problems, I inserted the string ">>>" in front of four lines
containing them. Just search for that string.
Thanks.
John