bug#52263: Stale comment in xsd-regexp.el about Emacs not supporting Uni
From: Stefan Kangas
Subject: bug#52263: Stale comment in xsd-regexp.el about Emacs not supporting Unicode
Date: Sat, 4 Dec 2021 14:07:46 +0100

Eli Zaretskii <eliz@gnu.org> writes:
>> I believe this comment in lisp/nxml/xsd-regexp.el can be removed as
>> Emacs supports Unicode now:
>>
>> ;; The semantics of XSD regexps are defined in terms of Unicode.
>> ;; Non-Unicode characters are not allowed in regular expressions and
>> ;; will not match against the generated regular expressions. A
>> ;; Unicode character means a character in one of the Mule charsets
>> ;; ascii, latin-iso8859-1, mule-unicode-0100-24ff,
>> ;; mule-unicode-2500-33ff, mule-unicode-e000-ffff, eight-bit-control
>> ;; or a character translatable to such a character (i.e a character
>> ;; for which `encode-char' will return non-nil).
>> ;;
>> ;; Unfortunately, this means that this package is currently useless
>> ;; for CJK characters, since there's no mule-unicode charset for the
>> ;; CJK ranges of Unicode. We should devise a workaround for this
>> ;; until the fabled Unicode version of Emacs makes an appearance.
>>
>> Is that correct?
>
> Probably. The mule-Unicode-* stuff is definitely obsolete. The only
> thing that bothers me is what happens with eight-bit characters in the
> XSD regexps -- are they allowed? Emacs in general does allow them.
> If xsd-regexp.el doesn't, that should be stated there.

Hmm, so probably more work is needed here than just removing the above
comment. There is a lot of non-trivial Mule and conversion code in
that library that might need a proper look by someone who knows this
area well.
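
As a starting point for whoever looks into the eight-bit question, here
is a rough sketch of how the `encode-char' test described in the old
comment could be probed in a current Emacs. The helper name
`my-xsd-unicode-char-p' is hypothetical and does not exist in
xsd-regexp.el. I would expect the first two calls to return t and the
raw-byte call to return nil, but this says nothing about what
xsd-regexp.el itself does with such characters.

```elisp
;; Sketch only: test whether a character has a code point in the
;; `unicode' charset, which is roughly the criterion the old comment
;; expresses in terms of `encode-char'.
(defun my-xsd-unicode-char-p (ch)   ; hypothetical helper name
  "Return non-nil if CH can be encoded in the `unicode' charset."
  (and (encode-char ch 'unicode) t))

(my-xsd-unicode-char-p ?a)       ; an ASCII character
(my-xsd-unicode-char-p #x4e2d)   ; a CJK ideograph

;; A raw eight-bit byte in Emacs's internal representation.
(my-xsd-unicode-char-p (unibyte-char-to-multibyte #x80))

;; Which charset Emacs reports for that raw byte.
(char-charset (unibyte-char-to-multibyte #x80))
```

Evaluating these forms in *scratch* would show how Emacs itself
classifies each character. The generated regexps would still need to be
checked separately.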
Perhaps this bug should also be retitled accordingly.