
Re: [vile] spellflt.l: Include UTF-8 code points


From: Michael von der Heide
Subject: Re: [vile] spellflt.l: Include UTF-8 code points
Date: Sun, 23 Jun 2019 21:47:18 +0200

It works for me (with hunspell) on words like "prüfen" or "Straße". Flex generates an 8-bit scanner, so UTF-8 should work as long as the pattern matches the multi-byte sequences byte by byte. Would you mind testing it?
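
To illustrate why a byte-wise pattern can cope with these words, here is a minimal sketch in Python rather than flex. It assumes the input is UTF-8 and mirrors the proposed WORD definition as a bytes regex; the helper name find_words is made up for this example and is not part of vile or spellflt.l.

```python
import re

# Byte-level equivalent of the proposed flex definition:
#   WORD ([a-zA-Z]|\xc3[\x80-\xbf])+
# In UTF-8, the lead byte 0xC3 followed by 0x80-0xBF encodes U+00C0-U+00FF.
WORD = re.compile(rb"(?:[a-zA-Z]|\xc3[\x80-\xbf])+")


def find_words(text):
    """Return the words matched in the UTF-8 encoding of text (hypothetical helper)."""
    data = text.encode("utf-8")
    # Every match consists of ASCII letters and complete two-byte sequences,
    # so decoding the matched bytes cannot fail.
    return [m.group().decode("utf-8") for m in WORD.finditer(data)]


print(find_words("prüfen, Straße!"))  # ['prüfen', 'Straße']
print(find_words("Łódź"))             # ['ód']
```

The second call shows the limit of this approach: only U+00C0 to U+00FF is covered (which also includes the signs × and ÷), so letters encoded with any other lead byte still split a word.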

--
Michael von der Heide


Thomas Dickey <address@hidden> wrote on Sun, 23 June 2019, 21:24:
On Sun, Jun 23, 2019 at 07:42:26PM +0200, Michael von der Heide wrote:
> Would it be possible to include UTF-8 code points to check words containing
> umlauts?
>
> WORD          ([a-zA-Z]|\xc3[\x80-\xbf])+

lex/flex doesn't do that :-(

They use small (256-entry) tables for the character types.

Long ago I saw a patch to use big tables (which I've read
doesn't work well).

On my (too-long) to-do list, I have an idea that could be developed
to provide the feature using character classes.  That is, flex could
be modified (perhaps a month's work...)

--
Thomas E. Dickey <address@hidden>
https://invisible-island.net
ftp://ftp.invisible-island.net
