liberty-eiffel

Re: Libunicode


From: Paolo Redaelli
Subject: Re: Libunicode
Date: Fri, 7 Jan 2022 11:49:39 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0

On 07/01/22 01:10, Eric Bezault wrote:
On 07/01/2022 0:05, Paolo Redaelli wrote:
I also searched eiffel.org and gobo eiffel for some hints. Aren't they the "professionals"?

I think that neither is "professional". Only ISE and eiffel.com are.
I'm sorry, I must apologize for having been rude and judgy. 😭

Sure, eiffel.org is managed by people working at ISE, but it is supposed
to be a "community" platform.

My ignorance misled me. I searched for "UNICODE STRING" on eiffel.org and found no information for newbies or dummies. Then I realized, as you correctly pointed out, that eiffel.org is community-driven, so I told myself that eiffel.com should have extensive documentation of their standard libraries. On www.eiffel.com the links "Documentation" and "Libraries" under the main menu item "Resources" point to pages on eiffel.org.

I armed myself with patience to find any description of string-related classes, so I followed the documentation link, landing on https://www.eiffel.org/documentation where I chose "Solutions and libraries". Thinking that a string is text, I first tried "Text processing", finding EiffelLex and EiffelParse. Following EiffelParse, after a couple more links I was led to https://www.eiffel.org/files/doc/static/21.11/libraries/parse/index.html which could not be found. The same happens when you follow "Basic computing", where I thought I would find base classes like INTEGER and the like.

I knew what I was looking for and I wasn't able to find it; I guess that people interested in learning Eiffel could come along and make queries like https://duckduckgo.com/?q=unicode+string+site%3Aeiffel.com or https://duckduckgo.com/?q=unicode+string+site%3Aeiffel.org

Most probably they will be confused by the search engine's answers. Honestly, an interested person would also be confused when looking in the liberty-eiffel domain: https://duckduckgo.com/?q=unicode+string+site%3Aliberty-eiffel.org


As for Gobo Eiffel, it's no more professional than Liberty Eiffel.
...
See https://www.youtube.com/watch?v=faF8p5Qnbeo&t=210s

Well, you're far more professional than me.



That being said, here is some info about Unicode support in ISE Eiffel
and Gobo Eiffel.

Gobo Eiffel was a pioneer in supporting Unicode in
....
that were needed at the time they were developed.

This is enlightening and much appreciated! Thank you.


I hope that you find the above useful, even if it's not a
"professional" documentation :-)
They are much more than "professional" :)

Now let me try to get something good from my misstep. 😅

I guess many Eiffellers are not young anymore, having grown up when efficiency was a dire necessity and correctness was a hard target.

Therefore we ponder for countless hours on topics like string encodings and how to deal with them correctly and efficiently.

Then I think about my 12-year-old daughter, or today's "Linux/hacking group" of engineering students at the Politecnico di Milano. To them "text is text, why bother?". They take Unicode for granted. That's the default to them. I thought about them after recalling https://vino.dev/blog/node-to-rust-day-1-rustup/ and especially this passage from https://vino.dev/blog/node-to-rust-day-5-ownership/

"if you’ve spent most of your life in JavaScript or had horrible experiences with languages like C, you may be thinking: “References? Whatever. I don’t like references and I don’t need references.” I need to let you in on a secret. You use references literally all the time in JavaScript. Every object is a reference. That’s how you can pass an object to a function, edit a property, and have that change be reflected after the function finishes."

It seems that the vast majority of programmers today don't know the difference between pass by value and pass by reference (expanded/unexpanded), and I'm sure that they will see the difference between 8-bit ASCII strings and "text strings" (aka Unicode) as some relic of the past. To them 8-bit ASCII strings are nothing more than legacy.

And most of them don't even care whether the documentation takes a few or hundreds of megabytes of storage. Using 1 or 4 MB to store a book of a million characters doesn't make much difference to them.
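
To put numbers on the "1 or 4 MB" figure, here is a minimal sketch in Python (used only for illustration; it says nothing about how any Eiffel library stores strings). The helper name `encoded_sizes` and the sample "book" are made up for this example, and the book is assumed to be pure ASCII, which is the best case for UTF-8.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes needed to store `text` in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" codecs do not emit a byte order mark, so only the payload is counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical "book": one million ASCII characters.
book = "a" * 1_000_000
sizes = encoded_sizes(book)
for name, size in sizes.items():
    print(f"{name}: {size / 1_000_000:.1f} MB")
```

For text outside ASCII the gap narrows, because UTF-8 then needs two to four bytes per code point while UTF-32 always needs four.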

I'm convinced that two aspects matter most today:

1 - a correct choice of the underlying computational complexity, i.e. choosing the right algorithm, and

2 - providing an API that doesn't "scare people" as I read yesterday on https://it.slashdot.org/comments.pl?sid=20556251&cid=62148999

"Using that brainpower on high cognitive load programming languages, like the ones that pretty much require an IDE to help you navigate the Byzantine bullshit they call their standard library, seems like a waste to me, but then, my opinion never influenced the IDE market."

I think this comment refers to Java.

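If "default, modern strings" are all encoded in UTF-8, then random access can't be O(1), and that will make all algorithms relying on cheap random access really slow. Those will need UTF-32 (or UTF-16, as long as the text stays within the Basic Multilingual Plane, since surrogate pairs make UTF-16 variable-width too).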
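
The cost difference can be sketched as follows, in Python purely for illustration. Both function names are hypothetical, the input is assumed to be valid UTF-8 or UTF-32-LE without a byte order mark, and indices are 0-based. The UTF-8 version has to walk the bytes from the start to find the n-th code point, while the UTF-32 version computes the offset directly.

```python
def utf8_code_point_at(data: bytes, index: int) -> str:
    """Return the code point at `index` by scanning from the start: O(n)."""
    if index < 0:
        raise IndexError(index)
    seen = 0      # code points whose first byte has been passed so far
    start = None  # byte offset where the requested code point begins
    for offset, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes;
        # any other byte starts a new code point.
        if byte & 0xC0 != 0x80:
            if start is not None:
                return data[start:offset].decode("utf-8")
            if seen == index:
                start = offset
            seen += 1
    if start is not None:
        return data[start:].decode("utf-8")
    raise IndexError(index)


def utf32_code_point_at(data: bytes, index: int) -> str:
    """Return the code point at `index` with one fixed-offset slice: O(1)."""
    if not 0 <= index < len(data) // 4:
        raise IndexError(index)
    return data[4 * index : 4 * index + 4].decode("utf-32-le")


text = "naïve 😅 text"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")
# Both accessors agree with Python's own code point indexing.
agree = all(
    utf8_code_point_at(utf8, i) == utf32_code_point_at(utf32, i) == text[i]
    for i in range(len(text))
)
```

An algorithm that calls the UTF-8 accessor inside a loop over all positions becomes quadratic, which is the slowdown described above; sequential iteration over UTF-8 remains linear.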

Perhaps the time is ripe to provide "modern" base classes: rename the old STRING as ASCII_STRING and make STRING deferred, handling Unicode, with UTF-8, UTF-16 and UTF-32 implementations, even if this means using a C library underneath :)
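
A rough sketch of that shape, written in Python only to show the idea: an abstract base class stands in for the deferred STRING and two subclasses stand in for the UTF-8 and UTF-32 implementations. All class and feature names (`UnicodeString`, `Utf8String`, `Utf32String`, `count`, `item`, `to_text`) are invented for this example, indices are 0-based here, and none of this reflects an existing Liberty, Gobo or ISE class.

```python
from abc import ABC, abstractmethod


class UnicodeString(ABC):
    """Plays the role of a deferred STRING: clients see code points, never bytes."""

    @abstractmethod
    def count(self) -> int:
        """Number of code points."""

    @abstractmethod
    def item(self, index: int) -> str:
        """Code point at `index` (0-based in this sketch)."""

    def to_text(self) -> str:
        # Written once in the deferred class, on top of the two deferred features.
        return "".join(self.item(i) for i in range(self.count()))


class Utf8String(UnicodeString):
    """Compact storage; `item` has to decode, so it is not O(1)."""

    def __init__(self, text: str) -> None:
        self._bytes = text.encode("utf-8")

    def count(self) -> int:
        return len(self._bytes.decode("utf-8"))

    def item(self, index: int) -> str:
        return self._bytes.decode("utf-8")[index]


class Utf32String(UnicodeString):
    """Four bytes per code point; `item` is a fixed-offset slice."""

    def __init__(self, text: str) -> None:
        self._bytes = text.encode("utf-32-le")

    def count(self) -> int:
        return len(self._bytes) // 4

    def item(self, index: int) -> str:
        if not 0 <= index < self.count():
            raise IndexError(index)
        return self._bytes[4 * index : 4 * index + 4].decode("utf-32-le")
```

Client code written against `UnicodeString` does not change when the storage does, so the choice between compact storage and cheap random access can be made per use case.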

Kind regards,

