Re: [groff] address@hidden: mom: PDF Author, pdfmom: needs C locale?]


From: Ralph Corderoy
Subject: Re: [groff] address@hidden: mom: PDF Author, pdfmom: needs C locale?]
Date: Fri, 09 Mar 2018 16:09:35 +0000

Hi Deri,

> I've got an example which is meant to show the problem (camus.mom),
> but unfortunately I can't make it generate the error which others are
> seeing.  Camus.mom is a utf-8 file and I have used -k in a utf-8 user
> account (LC_CTYPE=en_GB.UTF-8) and with -Kutf8 in an old style account
> (LC_CTYPE=en_GB), neither produced an error from grep.
>
> This leads me to suspect there is something in my version of grep
> which "understands" that UTF-8 files are not binary data.

That seems unlikely.  grep treats a file as binary if it contains an
ASCII NUL or a byte sequence that's invalid in the current locale, and
it only emits `Binary file ... matches' if a line containing such bytes
matches the regexp.

Does your grep behave like this?  I used a UTF-8 terminal.

    $ od -tx1z <<<$'x\xa0\xa0y'
    0000000 78 a0 a0 79 0a                                   >x..y.<
    0000005
    $ LC_ALL=en_GB.utf8 grep z <<<$'x\xa0\xa0y'
    $ LC_ALL=en_GB.utf8 grep z <<<$'x\xa0\xa0y\nz'
    z
    $ LC_ALL=en_GB.utf8 grep y <<<$'x\xa0\xa0y'
    Binary file (standard input) matches
    $ LC_ALL=en_GB.iso88591 grep y <<<$'x\xa0\xa0y'
    x��y
    $ LC_ALL=C grep y <<<$'x\xa0\xa0y'
    x��y
    $
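
To check a file such as camus.mom by hand, here is a rough sketch of
that rule.  It assumes UTF-8 is the encoding of interest and that
iconv(1) is installed; `looks_binary' is a made-up name, and this only
approximates the test described above, it is not grep's own code.

```shell
#!/bin/sh
# looks_binary FILE (hypothetical helper): exit 0 if FILE contains an
# ASCII NUL or a byte sequence that is not valid UTF-8, else exit 1.
looks_binary() {
    # Deleting NULs only changes the byte count if some were present.
    total=$(($(wc -c <"$1")))
    without_nul=$(($(tr -d '\000' <"$1" | wc -c)))
    [ "$total" -ne "$without_nul" ] && return 0

    # iconv exits non-zero when it meets an invalid input sequence.
    ! iconv -f UTF-8 -t UTF-8 <"$1" >/dev/null 2>&1
}

# Demo: the same bytes as in the transcript above (\240 is octal for 0xa0).
demo=$(mktemp)
printf 'x\240\240y\n' >"$demo"
if looks_binary "$demo"; then
    echo "would be treated as binary in a UTF-8 locale"
else
    echo "looks like text"
fi
rm -f "$demo"
```

If that reports binary for the real input, running grep under LC_ALL=C,
as in the last command of the transcript, should print the matching
line rather than the `Binary file' message.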

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy


