bug-gnu-emacs

bug#52263: Stale comment in xsd-regexp.el about Emacs not supporting Uni


From: Stefan Kangas
Subject: bug#52263: Stale comment in xsd-regexp.el about Emacs not supporting Unicode
Date: Sat, 4 Dec 2021 14:07:46 +0100

Eli Zaretskii <eliz@gnu.org> writes:

>> I believe this comment in lisp/nxml/xsd-regexp.el can be removed as
>> Emacs supports Unicode now:
>>
>>     ;; The semantics of XSD regexps are defined in terms of Unicode.
>>     ;; Non-Unicode characters are not allowed in regular expressions and
>>     ;; will not match against the generated regular expressions.  A
>>     ;; Unicode character means a character in one of the Mule charsets
>>     ;; ascii, latin-iso8859-1, mule-unicode-0100-24ff,
>>     ;; mule-unicode-2500-33ff, mule-unicode-e000-ffff, eight-bit-control
>>     ;; or a character translatable to such a character (i.e a character
>>     ;; for which `encode-char' will return non-nil).
>>     ;;
>>     ;; Unfortunately, this means that this package is currently useless
>>     ;; for CJK characters, since there's no mule-unicode charset for the
>>     ;; CJK ranges of Unicode.  We should devise a workaround for this
>>     ;; until the fabled Unicode version of Emacs makes an appearance.
>>
>> Is that correct?
>
> Probably.  The mule-Unicode-* stuff is definitely obsolete.  The only
> thing that bothers me is what happens with eight-bit characters in the
> XSD regexps -- are they allowed?  Emacs in general does allow them.
> If xsd-regexp.el doesn't, that should be stated there.

Hmm, so this probably needs more work than just removing the above
comment.  There is a lot of non-trivial Mule and charset conversion code
in that library that might need a proper look by someone who knows
this area well.
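
For whoever picks this up, here is a rough sketch of how one might probe
the eight-bit question in a current Emacs.  It only illustrates the
`encode-char' test that the old comment mentions; the helper name
`my-xsd-unicode-char-p' is made up for this example and is not part of
xsd-regexp.el, and I have not checked how the library itself treats such
characters.

```elisp
;; Sketch only: `my-xsd-unicode-char-p' is a hypothetical helper,
;; not a function defined in xsd-regexp.el.
(defun my-xsd-unicode-char-p (char)
  "Return non-nil if CHAR has a code point in the `unicode' charset."
  (and (encode-char char 'unicode) t))

;; A CJK character (U+4E2D), which the old mule-unicode-* charsets
;; did not cover.
(my-xsd-unicode-char-p #x4E2D)

;; A raw eight-bit byte, as represented in a multibyte buffer or string.
(my-xsd-unicode-char-p (unibyte-char-to-multibyte #x80))
```

I would expect the first call to return non-nil and the second to return
nil, which is the distinction Eli is asking about.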

Perhaps this bug should also be retitled accordingly.




