coreutils

Re: multibyte support (round 4) - tr


From: Assaf Gordon
Subject: Re: multibyte support (round 4) - tr
Date: Sat, 23 Dec 2017 18:50:41 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

Hello,

More progress on tr with multibyte support, available here:
https://files.housegordon.org/src/coreutils-multibyte-2017-12-23.patch.xz

Translation is (mostly) working:

   $ echo abcdefg | ./src/tr 'abcd' 'αβγδ'
   αβγδefg

   $ echo '1234 ABCD ΨΔΩΣ *$%()' \
              | ./src/tr -c '[:alpha:][:cntrl:]' 'Ψ'
   ΨΨΨΨΨABCDΨΨΔΩΣΨΨΨΨΨΨ

   $ echo 'αααββββ' | ./src/tr -s 'β' 'χ'
   αααχ

   $ echo 'aAbBcC ✀  χΧλΛσΣ' | ./src/tr '[:lower:]' '[:upper:]'
   AABBCC ✀  ΧΧΛΛΣΣ
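
For anyone comparing against an unpatched tr, a byte-level view helps explain why the unibyte code path cannot handle the cases above: each Greek letter is two bytes in UTF-8, so a byte-oriented tr treats 'αβγδ' as eight separate set members. The sketch below is only an illustration and is not part of the patch; `utf8_bytes` is a hypothetical helper name, and it assumes the script itself is saved as UTF-8 and that a POSIX `od` is available.

```sh
# Hypothetical helper (not part of the patch): print the bytes of a
# string as one run of hex digits, using od from POSIX.
utf8_bytes() {
    # od -An -tx1 prints one hex pair per byte; tr strips the spacing.
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Four characters, eight bytes:
utf8_bytes 'αβγδ'; echo    # prints ceb1ceb2ceb3ceb4

# An ASCII letter is a single byte, which is all a unibyte tr expects:
utf8_bytes 'a'; echo       # prints 61
```

The `tr -d ' \n'` inside the helper only deletes ASCII characters, so it behaves the same with a patched or unpatched tr.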


The current implementation could be a starting point for
testing and discussing specific edge-cases (some tests are already included).

It is not tuned for efficiency (neither in the implementation nor in run-time performance).

There's a lot of code duplication due to keeping the entire current unibyte code-path intact.


Comments welcome.
 - assaf



