bug-sed

bug#35993: Windows port redirection bug


From: Assaf Gordon
Subject: bug#35993: Windows port redirection bug
Date: Sat, 1 Jun 2019 12:54:52 -0600
User-agent: Mutt/1.11.4 (2019-03-13)

tags 35993 notabug
close 35993
stop

Hello,

On Wed, May 29, 2019 at 03:53:06PM +0100, address@hidden wrote:
> The bug is reproducible with this command: SED 'S/"//G' TESTE.TXT > OUT.TXT
>
> It should remove double quotes and save the result in the out.txt file. With
> a Bash shell it works as expected, but under Windows 10's command line it
> prints the resulting output and issues this error:
>  
> SED: CAN'T READ >: NO SUCH FILE OR DIRECTORY
> SED: CAN'T READ OUT.TXT: NO SUCH FILE OR DIRECTORY
>
> Escaping the double quote doesn't change the result, but if I use another
> character instead, like SED 'S/X//G' TESTE.TXT > OUT.TXT it works.
>

This is not a bug in sed - just incorrect usage of quotes in the Windows
command prompt (CMD.EXE).

Before going into (long) details, here's the solution:

    c:\Users\gordon\Desktop> type teste.txt
    hello"world

    c:\Users\gordon\Desktop> sed-4.7-64bit.exe "s/\"/XXX/g" teste.txt
    helloXXXworld

or even:

    c:\Users\gordon\Desktop> sed-4.7-64bit.exe s/\"/XXX/g teste.txt
    helloXXXworld
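
For anyone who wants to double-check the backslash escaping without a
Windows machine, here is a small Python sketch. It relies on
subprocess.list2cmdline, a helper inside CPython's subprocess module
that is not part of its documented public API, and which builds a
command line following the Microsoft C runtime quoting rules. That the
sed binary splits its command line by those same rules is an
assumption here, and the sketch says nothing about how CMD.EXE itself
treats characters such as > or |.

```python
import subprocess

# The arguments we want sed to receive, exactly as sed should see them.
args = ['sed-4.7-64bit.exe', 's/"/XXX/g', 'teste.txt']

# list2cmdline is a semi-internal CPython helper (an assumption that it
# stays available); it escapes a literal double-quote as backslash +
# double-quote, following the MS C runtime rules.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

The printed line has the same shape as the second command above: the
double-quote inside the sed script is preceded by a backslash.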


Now some details:

1.
Single-quotes have NO special meaning AT ALL in cmd.exe.
There's no point using them. In fact, they will just cause more
problems, as they are passed as-is to the sed program, and sed will
complain that a single-quote is not a recognizable sed command:

   c:\Users\gordon\Desktop> sed-4.7-64bit.exe '
   sed-4.7-64bit.exe: -e expression #1, char 1: unknown command: `''


2.
Double-quotes DO NOT behave like you expect if you are
familiar with unix-style shell quoting.

In unix-world, quotes (both single and double) act in pairs (opening
and closing), and we can speak about "text inside single/double quotes" and
"text outside quotes".

A side-effect of this approach is that strings with unbalanced quotes
result in a parsing error. E.g. the following is not a complete/valid
command in unix:

     echo hello"world
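
The same pairing rule can be observed with Python's shlex module, which
splits strings using POSIX-shell-like quoting. This is only an
illustration of the rule, not of any particular shell; the helper name
try_split is made up for this sketch.

```python
import shlex

def try_split(command):
    """Split a command line using POSIX-style quoting rules."""
    try:
        return shlex.split(command)
    except ValueError as err:
        # shlex raises ValueError when a quote is opened but never closed.
        return f"parse error: {err}"

print(try_split('echo "hello world"'))  # balanced quotes: two words
print(try_split('echo hello"world'))    # unbalanced quote: parse error
```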

3.
In Windows' CMD.EXE, a double-quote character (ONE character) changes an
internal parsing state which controls whether special characters keep
their special meaning or are treated as ordinary text
(surprising/unintuitive if you're coming from the unix world).
A SPACE character is the most common example of a special character.

For example, in CMD.EXE the following is a valid command:

    echo foo > hello" world.txt

And it will create a file named HELLO<SPACE>WORLD<DOT>TXT .

The above string is parsed like so:
    1. 'hello' - as is
    2. double-quote - turns on "special character handling" state.
    3. space character - kept (not ignored) because of the new state.
    4. 'world.txt' - as is.
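
The four steps above can be sketched as a tiny state machine. This is a
deliberately simplified, hypothetical model (the function name
read_token is made up, and it only knows about spaces and
double-quotes); it is not how CMD.EXE is actually implemented.

```python
def read_token(text):
    """Read one token, treating '"' as a toggle, not as half of a pair.

    Simplified model of the behaviour described above; not cmd.exe's
    real parser.
    """
    in_quotes = False
    out = []
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes  # one quote character flips the state
            continue                   # the quote itself is dropped
        if ch == ' ' and not in_quotes:
            break                      # an unprotected space ends the token
        out.append(ch)
    return ''.join(out)

print(read_token('hello" world.txt'))   # the space is kept
print(read_token('hello world.txt'))    # the space ends the token
print(read_token("'hello world.txt'"))  # single quotes protect nothing
```

Note that the model never complains about the missing closing quote:
the state simply stays switched on until the end of the string.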

Another example: the following two commands are valid in CMD.EXE.
In the second command, once a double-quote character is encountered,
the PIPE character loses its special meaning and is just consumed as
part of the string:

    c:\Users\gordon\Desktop\a>echo "hello world" | more
    "hello world"


    c:\Users\gordon\Desktop\a>echo "hello | more
    "hello | more

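The pipe behaviour can be modelled the same way. The sketch below is a
simplified assumption about the parsing, not CMD.EXE's real algorithm,
and split_pipeline is a made-up name: a pipe character only splits the
command while the quote state is off.

```python
def split_pipeline(line):
    """Split a command line at '|' characters seen outside the quote state.

    Simplified model only: a double-quote toggles the state and is kept
    in the text, since echo prints it as-is in the transcript above.
    """
    in_quotes = False
    stages, current = [], []
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == '|' and not in_quotes:
            # An unprotected pipe ends the current stage.
            stages.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    stages.append(''.join(current).strip())
    return stages

print(split_pipeline('echo "hello world" | more'))  # two stages
print(split_pipeline('echo "hello | more'))         # one stage
```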

4.
For more strange cases, try the following:

    c:\Users\gordon\Desktop\c> echo foo > hello.txt
    c:\Users\gordon\Desktop\c> echo foo > "hello world.txt"
    c:\Users\gordon\Desktop\c> echo foo > hello" w o r l d.txt
    c:\Users\gordon\Desktop\c> echo foo > hello"   world.txt
    c:\Users\gordon\Desktop\c> echo foo > 'hello world.txt'

    c:\Users\gordon\Desktop\c> dir
     Volume in drive C is OS
     Volume Serial Number is 4CA3-CC48

     Directory of c:\Users\gordon\Desktop\c

    06/01/2019  12:50 PM    <DIR>          .
    06/01/2019  12:50 PM    <DIR>          ..
    06/01/2019  12:50 PM                17 'hello
    06/01/2019  12:50 PM                 6 hello   world.txt
    06/01/2019  12:50 PM                 6 hello w o r l d.txt
    06/01/2019  12:49 PM                 6 hello world.txt
    06/01/2019  12:49 PM                 6 hello.txt
                   5 File(s)             41 bytes
                   2 Dir(s)  384,140,804,096 bytes free

5.
To go even deeper into the nitty-gritty of CMD.EXE parsing and quoting,
see this interesting blog post:
http://www.windowsinspired.com/understanding-the-command-line-string-and-arguments-received-by-a-windows-program/


As such, I'm closing this as "not a bug", but discussion can continue
by replying to this thread.

regards,
 - assaf


P.S.
A newer version of SED (version 4.7) was released in December 2018,
and it contains a few minor fixes/changes to behaviour on Windows.
See here on how to build and/or download the binaries:

  https://lists.gnu.org/archive/html/sed-devel/2018-12/msg00031.html





