Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to mod
From: Eli Zaretskii
Subject: Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to modules.
Date: Sat, 22 Jun 2024 09:50:07 +0300

> From: Brennan Vincent <brennan@umanwizard.com>
> Date: Fri, 21 Jun 2024 16:14:05 -0400
> Cc: emacs-devel@gnu.org
>
> > Please describe the motivation and real-life use cases for this.
>
> As far as I know, unibyte strings are the only efficient way to represent
> arbitrary binary buffers in emacs. If that’s not true, I’d be happy to be
> corrected.
>
> I think there are many possible cases where module authors will want to
> communicate binary data, but I’ll just describe one (my own). I’m working on
> a major mode that reads ELF files (whose contents it stores in a unibyte
> buffer) and provides various features like disassembling code. To do this it
> passes chunks of code to a module which in turn passes them to the Capstone
> disassembly library. To do this without being able to pass unibyte strings, I
> have to take the string of bytes, expand it to a vector of bytes, pass that
> to the module, and have the module copy each byte back out in a loop. This is
> very inefficient.

Why can't you have the module code itself read the file, instead of
getting the bytes from Emacs? Passing large amounts of bytes from
Emacs to a module is a very inefficient way of talking to modules
anyway, because Emacs is not optimized for moving text to and fro in
the shape of Lisp strings. To say nothing of the GC pressure you will
have in your mode, due to a constant consing of strings. It is best
to avoid all that to begin with.

> > In general, we want to minimize the use of unibyte strings in Emacs.
>
> Why?

Because dealing with unibyte text in Emacs is tricky and causes many
subtle bugs.

> What else should be used instead to represent arbitrary bytes?

Emacs is not a program to deal with raw bytes, except in rare
exceptional cases. Dealing with binary data is definitely NOT one of
the exceptions I'd like to see in Emacs. Emacs is primarily a
text-processing environment, so processing binary data is way off its
main purpose.

> > I also don't understand the need for unibyte-string-p, since we
> > already have multibyte-string-p.
>
> That’s fair, I only added it so I could use it as an argument to CHECK_TYPE.

You can easily use CHECK_STRING, followed by checking that the string
is unibyte.

And here you already hit the first subtlety of using unibyte text in
Emacs:

  (multibyte-string-p (decode-coding-string "abcdefg" 'utf-8))
   => t

IOW, a plain-ASCII string can sometimes be a multibyte string, which
would fail your naïve test for no good reason.
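
The check described above (first confirm the object is a string, then
confirm it is unibyte) can be sketched at the Lisp level as follows.
This is only an illustration under stated assumptions:
`my-unibyte-string-p' is a hypothetical helper name, not something taken
from the patch, and it relies only on the built-in functions `stringp',
`multibyte-string-p' and `unibyte-string'.

```elisp
;; Hypothetical helper (not part of the patch): non-nil if OBJ is a
;; string whose internal representation is unibyte.
(defun my-unibyte-string-p (obj)
  (and (stringp obj)
       (not (multibyte-string-p obj))))

;; A string built explicitly from byte values is unibyte, so the
;; helper accepts it.
(my-unibyte-string-p (unibyte-string 127 69 76 70))

;; The same helper rejects the decoded string from the message above,
;; because `multibyte-string-p' is reported there to return t for it,
;; even though every character in it is plain ASCII.
(my-unibyte-string-p (decode-coding-string "abcdefg" 'utf-8))
```

A test of this kind reports how the string is stored, not whether its
contents happen to be pure ASCII, which is why a caller holding a
decoded ASCII string could be rejected unexpectedly.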