From: Philipp
Subject: bug#47799: 28.0.50; Default `project-files' implementation doesn't work with quoted filenames
Date: Mon, 5 Jul 2021 21:05:01 +0200

> On 17.05.2021 at 01:22, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> On 16.05.2021 16:37, Philipp wrote:
>
>> One thing that came to my mind is: in general, in Elisp (not just XRef), we
>> spend lots of time parsing filenames to support remote and quoted filenames.
>> Other languages probably solve this by introducing proper types for
>> filenames (e.g. the Java Path class), which can then hold preprocessed
>> information about the underlying filesystem (or special file name handler,
>> in the case of Elisp). How about doing similar for Elisp? For example,
>> introduce a `parsed-file-name' class or structure holding the remote/quoting
>> state, or attach it to string properties? I haven't tried out that idea,
>> but I think it could significantly speed up the parsing (since we'd only
>> have to do it once and don't have to search for filename handlers all the
>> time), as well as remain backward-compatible to "plain" unparsed filenames
>> by allowing both strings and this new object type. WDYT?
>
> That sounds like an interesting idea to explore.
>
> We create/concatenate those file names inside project-files, and then "parse"
> them again to convert to local names inside xref-matches-in-files. Creating
> such structures might indeed save us on some parsing and garbage generation.
>
> Experiments and patches welcome.
>
> What I was also thinking of previously, is some "fileset" data structure
> which could contain a list of local file names and their connection in a
> separate slot. Maybe even separating the parent/root directory into a
> separate slot when feasible, to minimize GC further, though that might
> complicate applications.
>
> A more structured "file" value format might make this stuff easier to use
> indeed, and perhaps the performance difference will be negligible.

I think those are very good ideas. The "fileset" structure sounds like a
pretty good abstraction.

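To make the idea a bit more concrete, here is a minimal sketch of what such a
structure might look like. All `my-fileset' names are hypothetical and nothing
like this exists in Emacs today; it only uses the existing functions
`file-remote-p', `file-local-name' and `file-relative-name', and it ignores
quoted filenames entirely.

```elisp
(require 'cl-lib)

;; Hypothetical structure: `my-fileset' and its slots are illustration only.
(cl-defstruct (my-fileset (:constructor my-fileset-create))
  remote   ; remote prefix as returned by `file-remote-p', or nil if local
  root     ; local name of the common parent directory, with trailing slash
  names)   ; file names relative to ROOT

(defun my-fileset-from-files (root files)
  "Build a fileset for FILES, which must all live below directory ROOT."
  (let ((dir (file-name-as-directory root)))
    (my-fileset-create
     ;; The connection is determined once for the whole set, not per file.
     :remote (file-remote-p dir)
     :root (file-local-name dir)
     :names (mapcar (lambda (file) (file-relative-name file dir)) files))))

(defun my-fileset-to-list (fileset)
  "Expand FILESET back into a list of full file name strings."
  (let ((prefix (concat (or (my-fileset-remote fileset) "")
                        (my-fileset-root fileset))))
    (mapcar (lambda (name) (concat prefix name))
            (my-fileset-names fileset))))
```

A consumer that only needs local names could read the `names' slot directly,
while `my-fileset-to-list' would keep existing callers that expect plain
strings working.
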
>
> The difficulty is having a method like project-files return one format for
> some users, and another for users who want to take advantage of this
> performance improvement. Or we break the compatibility and/or introduce a new
> method with this new behavior.

A general design approach in OOP is to not treat abstract virtual functions
(generic functions in Elisp terminology) as part of the public interface of a
type: they can be implemented by anyone, but should only be called from within
the module that defines them (project.el in this case). That allows for
changes like this one: implementers could freely return the new fileset
structure, because only project.el would call project-files. I'm not sure how
much Elisp code adheres to this principle, though. If too much code outside of
project.el relies on project-files returning a list, we would indeed need to
fall back on one of the other options.