On Thu, Apr 11, 2024 at 8:39 PM Eli Zaretskii <eliz@gnu.org> wrote:
> From: Michael Lausch <mick.lausch@gmail.com>
> Date: Thu, 11 Apr 2024 19:38:52 +0200
>
> When loading a treesitter grammar in GNU/Linux, the dlopen() call is used with the RTLD_GLOBAL flag set. If
> you load more than one treesitter grammar, and both grammars define the same functions, most probably in
> the scanner.c file, symbol resolution may use the wrong symbol.
> For example the org and the yaml grammar both define a deserialize() function in their scanner.c file. This
> may result in a call from the org grammar to the yaml-defined deserialize() function. This fails, because the yaml
> function does different things than the org grammar expects (it's a free of a dangling pointer and therefore
> emacs crashes).
>
> A solution can be:
> 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to what the eln loader does.
> 2) fix all the grammars and make all functions 'static' so that the functions are not visible outside the
> compilation unit.
> 3) something I didn't think about
If those 'serialize' functions are not needed to be called from
outside of the shared library, the usual way is not to export them,
i.e. to give all symbols except the few that need to be exported the
so-called "hidden visibility".
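
To make the "hidden visibility" suggestion concrete, here is a minimal sketch of what a scanner source file could look like. It assumes GCC or Clang on an ELF platform. The "demo" grammar, the DemoScanner struct and the two macros are hypothetical and not taken from the org or yaml grammars; only the tree_sitter_<lang>_external_scanner_* naming follows the tree-sitter convention.

```c
/* Assumption: GCC or Clang on an ELF platform; other toolchains spell this differently. */
#define GRAMMAR_HIDDEN __attribute__((visibility("hidden")))
#define GRAMMAR_EXPORT __attribute__((visibility("default")))

/* Hypothetical scanner state, for illustration only. */
typedef struct { unsigned char depth; } DemoScanner;

/* File-local helper: 'static' gives it internal linkage, so no other
   shared library can ever bind to it by name. */
static unsigned demo_serialize(const DemoScanner *s, char *buffer)
{
  buffer[0] = (char) s->depth;
  return 1;
}

/* Helper that other source files of the same grammar may call: it has
   external linkage inside the library but is kept out of the dynamic
   symbol table, so it cannot clash with another grammar's helper. */
GRAMMAR_HIDDEN void demo_deserialize(DemoScanner *s, const char *buffer,
                                     unsigned length)
{
  s->depth = length > 0 ? (unsigned char) buffer[0] : 0;
}

/* Entry points with the grammar-specific prefix; left at default
   visibility in this sketch. */
GRAMMAR_EXPORT unsigned
tree_sitter_demo_external_scanner_serialize(void *payload, char *buffer)
{
  return demo_serialize(payload, buffer);
}

GRAMMAR_EXPORT void
tree_sitter_demo_external_scanner_deserialize(void *payload,
                                              const char *buffer,
                                              unsigned length)
{
  demo_deserialize(payload, buffer, length);
}
```

With GCC and Clang, compiling with -fvisibility=hidden makes hidden the default, so only the symbols explicitly marked for export need an annotation.
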
I agree that this would be the cleanest way to solve the problem, but it would mean patching all the existing grammars, and possibly all future ones, and pushing the changes to their maintainers.
I have started preparing patches for the yaml and org grammars (the ones that triggered the bug for me) and I intend to get them merged upstream.
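
For comparison, the loader-side alternative (option 1 above) would not require touching any grammar. The following is only a sketch of the idea, not the actual Emacs loader code: load_grammar_local and grammar_entry_point are hypothetical helper names. Whether RTLD_LOCAL alone is enough also depends on what is already in the process's global symbol scope.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open a grammar library without RTLD_GLOBAL, so
   its symbols are not offered to libraries that are loaded later. */
void *load_grammar_local(const char *path)
{
  void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
  if (handle == NULL)
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
  return handle;
}

/* Hypothetical helper: look up one symbol through this handle only.
   Returns NULL if the handle is NULL or the lookup reports an error. */
void *grammar_entry_point(void *handle, const char *symbol)
{
  if (handle == NULL)
    return NULL;
  dlerror();                      /* clear any stale error state */
  void *fn = dlsym(handle, symbol);
  return dlerror() == NULL ? fn : NULL;
}
```

A caller would open each grammar with load_grammar_local() and then fetch only the one entry point it needs, for example the tree_sitter_<lang> function, through grammar_entry_point(). On older glibc versions this needs -ldl at link time.
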