dired-do-find-regexp failure with latin-1 encoding
From: Stephen Berman
Subject: dired-do-find-regexp failure with latin-1 encoding
Date: Sat, 28 Nov 2020 19:03:17 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
My system's locale is en_US.UTF-8, but I have many files
encoded as iso-8859-1 (latin-1) and containing a mix of ASCII and
non-ASCII characters. When I use dired-do-find-regexp on such files,
there are no matches in the *xref* buffer for lines containing both the
search string and a non-ASCII character. If the file is encoded as
utf-8, then dired-do-find-regexp does find such lines. Here's a minimal
reproducer:
0. echo aä > /tmp/test
1. emacs -Q /tmp/test ; the file encoding is utf-8
2. Type `C-x d RET', mark the file 'test', type `A a RET'
=> *xref* displays the line 'aä'
3. In buffer 'test' type `C-x RET f iso-8859-1 RET' and then `C-x C-s'
4. Repeat step 2
=> user-error: No matches for: a
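The difference can also be observed outside Emacs by running grep directly on the two encodings. This is a minimal sketch, assuming GNU grep and a UTF-8 locale; the temporary directory and the file names test-utf8 and test-latin1 are made up for illustration and are not part of the recipe above.

```shell
# Scratch directory and file names are hypothetical.
dir=$(mktemp -d)

# Write the line 'aä' once in each encoding, using explicit octal bytes
# so the result does not depend on the terminal's encoding.
printf 'a\303\244\n' > "$dir/test-utf8"     # ä as UTF-8: bytes C3 A4
printf 'a\344\n'     > "$dir/test-latin1"   # ä as ISO-8859-1: byte E4

# In a UTF-8 locale, GNU grep prints the matching line of the first file.
grep -n a "$dir/test-utf8"

# The second file contains a byte sequence that is invalid in UTF-8, so
# GNU grep typically classifies it as binary and reports only that the
# file matches, without printing the line itself.
grep -n a "$dir/test-latin1"
```

If the second command prints a "Binary file ... matches" notice instead of the line, that would be consistent with the empty *xref* result in step 4.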
dired-do-find-regexp calls xref-matches-in-files and that calls grep,
and that's where the failure happens, so strictly speaking this isn't an
Emacs bug, but it is a problem for users of dired-do-find-regexp
(dired-do-search and occur, for example, don't have this problem). One
workaround is to add the -a option to the grep invocation in
xref-matches-in-files; then the search succeeds and the *xref* buffer
displays 'a\344'. But this doesn't work if 'ä' is the search term. For
the latter, I can get the correct output from grep by piping the output
of 'iconv -f ISO-8859-1 -t UTF-8' through to it, and indeed, prepending
'iconv -f ISO-8859-1 -t UTF-8 | ' to the grep invocation in
xref-matches-in-files does give the correct output in both cases. But
this won't work if the file has a different non-utf-8 encoding, assuming
the issue isn't specific to latin-1. Is there another alternative
(aside from "Someone™ can implement it in Emacs Lisp")?
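For reference, the two workarounds described above can be sketched at the shell level. This assumes GNU grep and iconv are installed; grep_recoded is a hypothetical helper written for this illustration, not something xref-matches-in-files provides, and it requires the caller to already know the file's encoding.

```shell
# Hypothetical helper: recode FILE from ENCODING to UTF-8, then search it.
# It does not detect the encoding; the caller has to supply it.
grep_recoded() {
    encoding=$1
    pattern=$2
    file=$3
    iconv -f "$encoding" -t UTF-8 "$file" | grep -n -e "$pattern"
}

sample=$(mktemp)
printf 'a\344\n' > "$sample"    # 'aä' as ISO-8859-1 bytes

# Workaround 1: -a makes grep treat the file as text, so 'a' matches,
# but the byte 0xE4 is passed through undecoded.
grep -a -n a "$sample"

# Workaround 2: recode to UTF-8 first; then both 'a' and 'ä' can match.
# The pattern 'ä' is given as its UTF-8 bytes (C3 A4).
grep_recoded ISO-8859-1 a "$sample"
grep_recoded ISO-8859-1 "$(printf '\303\244')" "$sample"
```

Passing the encoding as an argument only moves the problem: something still has to decide which encoding applies to each file, which is the open question above.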
Steve Berman