Re: fixing url-unhex-string for unicode/multi-byte charsets
From: Boruch Baum
Subject: Re: fixing url-unhex-string for unicode/multi-byte charsets
Date: Fri, 6 Nov 2020 07:28:46 -0500
User-agent: NeoMutt/20180716
On 2020-11-06 14:04, Eli Zaretskii wrote:
> > Date: Fri, 6 Nov 2020 05:27:56 -0500
> > From: Boruch Baum <boruch_baum@gmx.com>
> > Cc: emacs-devel@gnu.org
> I can't, not in full: I don't have a Freedesktop trash anywhere I have
> access to. I did try the 2 file names you posted, including the one
> with Hebrew characters, and it did work for me, on the assumption that
> file-name-coding-system is UTF-8.
>
> > To reproduce, touch and then trash a file named some two Hebrew
> > words delimited by a space. Navigate to the trash directory's 'info'
> > sub-directory and extract the 'path' value from the file's meta-data
> > .info file. That's the string we need to decode. Apply the string to
> > your solution and see that you do not get the space-delimited two
> > Hebrew words.
>
> A stand-alone test case, which doesn't require an actual trash, would
> be appreciated, so I could see which part doesn't work, and how to
> fix it.
That would be the two file names that I previously posted. You say that
they succeeded for you, but they didn't for me. The result I got was
good for the first case (English two words), and garbage for the second
case (Hebrew two words).
> Alternatively, maybe you could explain why you needed to insert the
> text into a temporary buffer and then extract it from there? AFAIK,
> we have the same primitives that work on decoding strings as we have
> for decoding buffer text.
I don't need to. That's just how it's implemented in emacs-w3m. I also
pointed out that eww does it differently. I think the need in emacs-w3m
is to mix the ASCII characters with the selected binary output, which
can't be done with, say, replace-regexp-in-string. So what they do is
use a temporary buffer, make it unibyte by calling
`set-buffer-multibyte' with nil, and, instead of using
replace-regexp-in-string, build the result in the temporary buffer.
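
To make the temporary-buffer approach concrete, here is a rough sketch of
the idea. It is not the emacs-w3m code: the function name `my-unhex-utf8'
is made up for illustration, it assumes the input string is ASCII-only,
and it assumes the escaped bytes are UTF-8. The sample Hebrew words in
the usage note are my own, not the file names from earlier in the thread.

```elisp
;; Hypothetical helper, for illustration only (not from emacs-w3m or url-util).
(defun my-unhex-utf8 (str)
  "Percent-decode STR into raw bytes, then decode those bytes as UTF-8.
Assumes STR itself contains only ASCII characters."
  (with-temp-buffer
    ;; Unibyte buffer: each inserted byte stays a raw byte, so a
    ;; multi-byte UTF-8 sequence is not split into separate characters.
    (set-buffer-multibyte nil)
    (let ((pos 0)
          (case-fold-search t))
      (while (string-match "%\\([0-9a-f][0-9a-f]\\)" str pos)
        (let ((start (match-beginning 0))
              (end (match-end 0))
              (byte (string-to-number (match-string 1 str) 16)))
          ;; Copy the literal ASCII text that precedes the escape.
          (insert (substring str pos start))
          ;; Insert the escaped octet as a single raw byte.
          (insert (byte-to-string byte))
          (setq pos end)))
      ;; Copy whatever follows the last escape.
      (insert (substring str pos)))
    ;; Only now, with all bytes in place, interpret them as UTF-8.
    (decode-coding-string (buffer-string) 'utf-8)))
```

With this, (my-unhex-utf8 "%D7%A9%D7%9C%D7%95%D7%9D%20%D7%A2%D7%95%D7%9C%D7%9D")
should give back two Hebrew words separated by a space. The point is
that decoding happens once, on the complete byte sequence, rather than
byte by byte.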
--
hkp://keys.gnupg.net
CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0