bug-gnu-emacs

bug#61726: [PATCH] Eglot: Support positionEncoding capability


From: Augusto Stoffel
Subject: bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date: Thu, 23 Feb 2023 12:46:48 +0100

On Thu, 23 Feb 2023 at 12:39, Eli Zaretskii wrote:

>> I would also suggest preparing the stage to eventually make
>> `eglot-current-column-function' and `eglot-move-to-column-function'
>> obsolete.  For that, I suggest renaming
> Please tell more about this, as I don't think I have a clear enough
> idea of the issues and the implications for Emacs.

These vars were meant to make nonconformant servers (regarding the way
they count character offsets) work with Eglot.  A new addition to the
LSP spec allows the server to announce how it counts character offsets,
so there should be no reason for servers to be nonconformant, and hence
no reason for a workaround variable.

>> +(defun eglot--current-column-utf-8 ()
>> +  "Calculate current column, counting bytes."
>> +  (- (position-bytes (point)) (position-bytes (line-beginning-position))))
>
> This is subtly incorrect: position-bytes doesn't count UTF-8 bytes, it
> counts the bytes in the internal representation Emacs uses for buffer
> and string text.  The differences are minor and subtle, but not
> negligible.

Right, if the buffer contains a char outside of the Unicode range, we
lose.
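
For illustration, one way to avoid depending on the internal
representation would be to encode the text explicitly and count the
resulting bytes.  This is only a minimal sketch:
`my/current-column-utf-8' is a hypothetical name, not part of the
patch, and I have not checked what it does with raw bytes or with
characters outside the Unicode range.

```elisp
;; Hypothetical helper, not part of the patch.
(defun my/current-column-utf-8 ()
  "Return the number of UTF-8 bytes between line start and point.
The text is encoded explicitly, so the result does not rely on how
Emacs stores buffer text internally."
  (length (encode-coding-string
           (buffer-substring-no-properties (line-beginning-position)
                                           (point))
           'utf-8-unix t)))
```

The obvious cost is that this allocates a string on every call, which
the position-bytes version avoids.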

But just to confirm: position-bytes and byte-to-position are always with
respect to Emacs's internal extended UTF-8 representation and have
nothing to do with the buffer file encoding, right?

> What does this stuff do with double-width or zero-width characters?
> Emacs takes character-width into consideration when it counts columns,
> but it is unclear to me what do LSP servers do in those cases.
> Likewise with characters that are composed on display.

`eglot-move-to-column' is supposed to count Unicode codepoints, so
e.g. x, ⇒ and 😃 all contribute 1 unit.  On the other hand, the Emoji
🧛‍♀️ contributes 4 units.  This is independent of screen display.
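
To make the counting concrete, here is a small sketch that measures a
string under each of the three position encodings the LSP spec names
(utf-8, utf-16 and utf-32, the last one being codepoints).
`my/count-units' is a hypothetical helper for illustration only, not
something Eglot provides.

```elisp
;; Hypothetical helper for illustration only.
(defun my/count-units (string)
  "Return an alist of STRING's length under each LSP position encoding."
  `((utf-32 . ,(length string))         ; Unicode codepoints
    ;; utf-16le emits no byte order mark, so bytes / 2 = code units.
    (utf-16 . ,(/ (length (encode-coding-string string 'utf-16le)) 2))
    (utf-8  . ,(length (encode-coding-string string 'utf-8)))))

;; The vampire Emoji is U+1F9DB U+200D U+2640 U+FE0F.
(my/count-units "\U0001F9DB\u200D\u2640\uFE0F")
;; => ((utf-32 . 4) (utf-16 . 5) (utf-8 . 13))
```

None of these numbers says anything about how many screen columns the
sequence occupies.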

By the way, I don't understand your claim about column counting.  If I
move point over 🧛‍♀️, the mode line column count increments by 3 units,
which seems to make no sense: this Emoji is 4 codepoints long and
occupies 1 screen column.  What's the logic here?

> So I think this mess needs to be carefully and elaborately discussed
> before we decide how to implement it correctly.

Sure.




