Re: cut with multibyte support for delimiter


From: Assaf Gordon
Subject: Re: cut with multibyte support for delimiter
Date: Mon, 18 Sep 2017 20:32:06 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

Hello Sebastián,

On 2017-09-18 08:25 AM, Sebastian Kisela wrote:
> I implemented cut functionality with multibyte delimiter (cut -d'\unicode'
> -f )

Thank you for following up with a patch.
This is a great start.

I have only briefly looked at your patch, and will look more closely in
a few days. A few cursory invocations worked well for me on Ubuntu 16.04.

One thing I noticed is that the tests fail on my computer.
I see things like:
====
  cut-multibyte.pl: test mbd-newline-24: stdout mismatch, comparing
        mbd-newline-24.2 (expected) and mbd-newline-24.O (actual)
  *** mbd-newline-24.2    Mon Sep 18 20:02:46 2017
  --- mbd-newline-24.O    Mon Sep 18 20:02:46 2017
  ***************
  *** 1 ****
  ! aꝤb
  --- 1 ----
  ! a$Ꝥb
====
The extra dollar sign before the multibyte character suggests the
failure is related to the interaction between Perl
(which converts \xNN sequences) and the shell command line
(where you've used the $'\xNN' syntax).

The test was:
   ['mbd-newline-24', "-d'\n'", '-f1,2', "--ou=\$'\xEA\x9D\xA4'",
        {IN=>"a\nb\n"}, {OUT=>"a\xEA\x9D\xA4b\n"}],


Also,
I'm not sure if coreutils currently allows the newer $'\xNN' construct
in tests - this might be too new to be supported everywhere (comments,
anyone? I'll also try to look for them in other tests).

In any case, Perl itself can easily generate UTF-8 characters and send
them as-is to the program being tested; I think that will suffice.
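
For the portability concern, here is a minimal sketch, assuming only a
POSIX sh with printf and od. A shell without $'...' support keeps the
'$' as a literal character, which would be consistent with the stray
dollar sign in the output above. If the shell ever has to produce these
bytes itself, printf with octal escapes avoids $'\xNN' entirely. The
variable names (delim, hex) are made up for this illustration and are
not part of the patch or the test suite.

```shell
#!/bin/sh
# U+A764 is EA 9D A4 in UTF-8, i.e. octal 352 235 244.
# POSIX printf interprets \ddd octal escapes in its format string.
delim=$(printf '\352\235\244')

# Hex-dump the variable to confirm it holds exactly those three bytes.
hex=$(printf '%s' "$delim" | od -An -tx1 | tr -d ' \n')
echo "$hex"

# Ordinary double quotes pass the raw bytes through unchanged, so an
# argument like this needs no $'...' quoting. It is only printed here;
# no cut invocation is made.
printf '%s\n' "--output-delimiter=$delim"
```

The same idea applies on the Perl side: once the bytes are already in
the string, plain quoting is enough and the shell does no escape
processing of its own.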

----

Planning ahead, since this is going to be a large addition,
we'll need to ask you for copyright assignment for your code contributions.

You can read more about it here:
  https://www.gnu.org/licenses/why-assign.en.html
  https://www.fsf.org/licensing/assigning.html

To begin the process, please fill in the form here:
https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future

and send it to address@hidden.
Within a few days you will receive a PDF document (with the information
you submitted), which you can then sign and return to the FSF.


regards,
 - assaf
