Subject: Re: [Gcl-devel] [Maxima-discuss] [Maxima-commits] [git] Maxima CAS branch, master, updated. branch-5_37-base-91-gd9bf6ff
Date: Sat, 10 Oct 2015 09:56:17 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)
Raymond Toy <address@hidden> writes:
> I, unfortunately, don't have great hope of seeing gcl with unicode any
> time soon because the plan for supporting unicode is really
> complicated. 
>  UTF-8 strings with 21-bit Lisp character. I don't know how that's
> going to work reliably when you can index at random points in the
> string and also insert random characters into a utf-8 code
>  I suggested a really simple utf-16 with 16-bit chars to simplify
> the implementation and still cover 99-44/100% of the use cases.
> This is way easier to do with very minimal code changes.
Perhaps I should weigh in here. I do have a branch starting utf8
unicode character support, but it will have to wait until post 2.6.13.
Emacs takes this strategy, so I know it's doable, and the performance is
probably a net win, I'm guessing, as the gc overhead of the larger strings
in a fixed-width encoding would outweigh the slower utf8 string access
times. We also had a
discussion on gcl-devel that the current approach of defining a
character to be a byte, and relying on terminals etc. to do the
translation, is legal, although not desirable as a permanent solution.
I can outline the algorithm if there is interest, but essentially a
simple one entry cache to cover the vast majority of cases of sequential
access (utf8 can do this backwards as well) together with a log(N)
special character counting from the beginning, cache, or end (making
use of parallelism in long integers) for random access, appears quite
serviceable. This is not that complicated, and can be inlined at the
source level, escaping out early in the most common case of no special
bytes, which can be indicated by a flag in the header.
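
To make the outline above concrete, here is a minimal sketch in Python. It
is not GCL code: the class name Utf8String, its fields, and the helper
count_continuation_bytes are all hypothetical, chosen only for
illustration. It shows the header flag for the all-ASCII fast path, a
one-entry cache of (character index, byte offset), and walking forward or
backward from the nearest of beginning, cache, or end. The walk here is a
plain linear scan, a simplification standing in for the log(N) counting
mentioned above, which this message does not spell out. The helper shows
one assumed way to count special bytes eight at a time inside a long
integer.

```python
LOW_BITS = 0x0101010101010101  # lowest bit of each byte in a 64-bit word


def count_continuation_bytes(buf):
    """Count bytes of the form 10xxxxxx, eight at a time (assumed technique)."""
    total = 0
    for i in range(0, len(buf), 8):
        word = int.from_bytes(buf[i:i + 8], "little")
        # Bit 7 set and bit 6 clear marks a continuation byte.
        marks = (word >> 7) & ~(word >> 6) & LOW_BITS
        total += bin(marks).count("1")
    return total


class Utf8String:
    """Hypothetical UTF-8 backed string indexed by character position."""

    def __init__(self, text):
        self.buf = text.encode("utf-8")
        # Characters = bytes that are not continuation bytes.
        self.length = len(self.buf) - count_continuation_bytes(self.buf)
        # Stand-in for the header flag: no special bytes at all.
        self.ascii_only = self.length == len(self.buf)
        # One-entry cache: (character index, byte offset of that character).
        self.cache = (0, 0)

    @staticmethod
    def _is_continuation(byte):
        return (byte & 0xC0) == 0x80

    def _byte_offset(self, index):
        # Start from whichever known position is closest to the target.
        anchors = [(0, 0), self.cache, (self.length, len(self.buf))]
        char_i, off = min(anchors, key=lambda a: abs(a[0] - index))
        while char_i < index:  # walk forward one character at a time
            off += 1
            while off < len(self.buf) and self._is_continuation(self.buf[off]):
                off += 1
            char_i += 1
        while char_i > index:  # utf8 can be walked backwards as well
            off -= 1
            while self._is_continuation(self.buf[off]):
                off -= 1
            char_i -= 1
        return off

    def char_at(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        if self.ascii_only:  # common case: byte offset equals char index
            return chr(self.buf[index])
        off = self._byte_offset(index)
        end = off + 1
        while end < len(self.buf) and self._is_continuation(self.buf[end]):
            end += 1
        self.cache = (index, off)  # the next sequential access starts here
        return self.buf[off:end].decode("utf-8")
```

With this layout, a loop over char_at(0), char_at(1), ... moves one
character past the cached position per call, in either direction, and
strings with no special bytes never leave the fast path.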
(BTW, I've also put in open-stream-p for you in 2.6.13pre.)
Camm Maguire address@hidden
"The earth is but one country, and mankind its citizens." -- Baha'u'llah