bug-gnu-utils
Re: [PATCH 0/4] Cross compiling sharutils


From: Filipe Brandenburger
Subject: Re: [PATCH 0/4] Cross compiling sharutils
Date: Thu, 21 May 2015 21:50:55 -0700

Hi Eli,

On Thu, May 21, 2015 at 9:12 AM, Eli Zaretskii <address@hidden> wrote:
> It was I who added the binary mode to some popen calls in Sharutils.
> I did that when I worked on the MinGW port of Sharutils.  The reason
> is simple: you _must_ read output from a compressor in binary mode,
> because what the compressor outputs is binary data, not text.  If you
> read it in text mode, the read will stop on the first ^Z character,
> and it will strip CR characters from the data.  (I'm guessing that the
> former doesn't happen in Cygwin, so you don't see problems because
> your examples, by sheer luck, didn't have CR characters in the
> compressed data, or maybe didn't have them before an LF character.)

Ok, so I spent some time doing some tests to understand how this works...

TL;DR: cygwin doesn't seem to need popen("rb") though mingw definitely does.

Even though cygwin does differentiate "r" and "rb", in my tests it
seemed to be using binary by default. Not sure if that's a cygwin
environment setting or if it was detecting that gzip output "looked"
binary and was using binary that way...

I set up an experiment generating random strings, writing them to
files and calling popen of "gzip -c -9" on those files.

For example, this particular random string:

  "3pgcYgX yir4aRb mlWU2Zu lt1LO80 URPM6TO UZzj5Pb 6wdgIZ0 yMiH59i I57sMM2\r\n"

Compressing it with "gzip -c -9" will produce contents that will
contain a ^Z, a CRLF and a bare CR.

In cygwin, both popen(cmd, "r") and popen(cmd, "rb") returned the
exact same output, including the unmodified ^Z, CRLF and bare CR
characters.

Using popen(cmd, "rt") in Cygwin would indeed convert the CRLF into an
LF (though it would not choke on the ^Z or touch the bare CR), so I
could get some data corruption, but I had to ask for it with the
explicit "t" modifier.

I tried the same tests with random strings on binaries built on mingw.
Indeed, in that environment, I saw exactly what you described. With
popen(cmd, "r"), a ^Z did truncate the input and a CRLF would be
converted into a bare LF. With popen(cmd, "rb"), none of that
happened.
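
For reference, a comparison along these lines can be sketched in C as
below. This is not the exact test program, only a minimal illustration:
the names byte_stats, scan_bytes and probe_popen are made up for this
example, and the 64 KiB buffer is an arbitrary assumption that is
plenty for one short compressed string.

```c
#define _POSIX_C_SOURCE 200809L  /* expose popen/pclose under strict -std modes */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of the bytes that text-mode translation can disturb. */
struct byte_stats {
    size_t total;    /* bytes actually read */
    size_t ctrl_z;   /* 0x1A (^Z) bytes */
    size_t crlf;     /* CR immediately followed by LF */
    size_t bare_cr;  /* CR not followed by LF */
};

/* Count ^Z, CRLF and bare CR occurrences in a buffer. */
static void scan_bytes(const unsigned char *buf, size_t n,
                       struct byte_stats *st)
{
    st->total = n;
    st->ctrl_z = st->crlf = st->bare_cr = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 0x1A) {
            st->ctrl_z++;
        } else if (buf[i] == '\r') {
            if (i + 1 < n && buf[i + 1] == '\n')
                st->crlf++;
            else
                st->bare_cr++;
        }
    }
}

/* Run cmd through popen with the given mode and scan what comes back.
   Returns 0 on success, -1 if popen fails (for example because the C
   library rejects the mode string). Only the first 64 KiB are examined. */
static int probe_popen(const char *cmd, const char *mode,
                       struct byte_stats *st)
{
    static unsigned char buf[65536];
    FILE *p = popen(cmd, mode);
    if (p == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, p);
    pclose(p);
    scan_bytes(buf, n, st);
    return 0;
}
```

Calling probe_popen once with "r" and once with "rb" on the same
"gzip -c -9" command and comparing the two byte_stats results shows
whether the stream was translated: a text-mode read as described above
would report a smaller total, fewer CRLF pairs, or both.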

I tried to go deeper in the mingw experiment by actually building
sharutils with a modified popen without the "b" modifier, but I didn't
manage to build it in mingw/msys. I can run ./configure fine, but
make chokes when compiling lib/idcache.c, which wants to #include
<pwd.h>, which is not there (and that is not the only problem).
I'm not sure whether a recent change broke the mingw build of
sharutils or whether I'm making a mistake while trying to build it
myself. At this point it looks like diminishing returns, so I'll
consider myself satisfied with the experiment on mingw calling popen
on random strings.

In any case, this all seems to corroborate that mingw does indeed need
the "b" modifier and that using the check for O_BINARY being defined
to a non-zero value is a good way to determine whether to pass the "b"
to popen and fopen calls or not.
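
To make the idea concrete, here is a minimal sketch of what such a
check could look like. It is not the code from the patch series: the
macro PIPE_READ_MODE and the helper open_binary_pipe are names invented
for this illustration. It relies on the fact that POSIX only specifies
"r" and "w" as popen modes, so the "b" is added only where O_BINARY is
defined to a non-zero value.

```c
#define _POSIX_C_SOURCE 200809L  /* expose popen/pclose under strict -std modes */
#include <fcntl.h>   /* defines O_BINARY on platforms with a text/binary split */
#include <stdio.h>

/* Pick the popen read mode at compile time. An undefined or zero
   O_BINARY means the platform does not need (and may not accept) "b". */
#if defined(O_BINARY) && O_BINARY
# define PIPE_READ_MODE "rb"
#else
# define PIPE_READ_MODE "r"
#endif

/* Hypothetical helper: open a pipe for reading raw, untranslated bytes. */
static FILE *open_binary_pipe(const char *cmd)
{
    return popen(cmd, PIPE_READ_MODE);
}
```

A caller would then use open_binary_pipe("gzip -c -9 somefile") in
place of a hard-coded popen(cmd, "rb"), where "somefile" stands for
whatever input is being compressed. The same compile-time choice can be
applied to the mode strings passed to fopen.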

Do you still have any objections to this O_BINARY approach?

Cheers,
Filipe


