bug#35993: Windows port redirection bug


From: nlweb
Subject: bug#35993: Windows port redirection bug
Date: Wed, 05 Jun 2019 16:27:46 +0100
User-agent: IMP PTMail 6.1.13

Hi Assaf,

These are my findings on Windows 10:
 
S:\temp>type teste.txt
hello"world

S:\temp>SED 's/"//g' teste.txt > out.txt
SED: can't read >: No such file or directory
SED: can't read out.txt: No such file or directory
helloworld

S:\temp>SED s/\"//g teste.txt > out.txt
SED: -e expression #1, char 26: unterminated `s' command

S:\temp>SED "s/\"//g" teste.txt > out.txt
SED: can't read >: No such file or directory
SED: can't read out.txt: No such file or directory
helloworld

S:\temp>sed-4.7-64bit s/\"//g teste.txt > out.txt
helloworldsed-4.7-64bit: can't read >: Invalid argument
sed-4.7-64bit: can't read out.txt: No such file or directory

S:\temp>sed-4.7-64bit "s/\"//g" teste.txt > out.txt
helloworldsed-4.7-64bit: can't read >: Invalid argument
sed-4.7-64bit: can't read out.txt: No such file or directory

As you can see, the problem persists. My goal is to remove double quotes, so I replace them with nothing. This command line works inside a bash shell, but gives the above errors under Windows.

Thanks!

Nuno
Quoting Assaf Gordon <address@hidden>:

tags 35993 notabug
close 35993
stop

Hello,

On Wed, May 29, 2019 at 03:53:06PM +0100, address@hidden wrote:
The bug is reproducible with this command: sed 's/"//g' teste.txt > out.txt

It should remove double quotes and save the result in the out.txt file. With
a Bash shell it works as expected, but under Windows 10's command line it
prints the resulting output and issues this error:
 
sed: can't read >: No such file or directory
sed: can't read out.txt: No such file or directory

Escaping the double quote doesn't change the result, but if I use another
character instead, like sed 's/x//g' teste.txt > out.txt it works.

This is not a bug in sed - just incorrect usage of quotes in the Windows
command prompt (CMD.EXE).

Before going into (long) details, here's the solution:

   c:\Users\gordon\Desktop> type teste.txt
   hello"world

   c:\Users\gordon\Desktop> sed-4.7-64bit.exe "s/\"/XXX/g" teste.txt
   helloXXXworld

or even:

   c:\Users\gordon\Desktop> sed-4.7-64bit.exe s/\"/XXX/g teste.txt
   helloXXXworld
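
Note that neither command above redirects the output to a file. One way
to get the output into a file without depending on CMD.EXE's parsing at
all is to start the program without a shell, for example from Python,
passing the arguments as a list and the output file as a handle. This is
only a minimal sketch: the helper name run_to_file is made up for
illustration, and the usage comment assumes that a sed executable is on
PATH and that teste.txt exists.

```python
import subprocess


def run_to_file(argv, out_path):
    """Run argv without a shell and write its standard output to out_path.

    No shell is involved, so characters such as '"' and '>' inside an
    argument are never interpreted as quoting or redirection.
    """
    with open(out_path, "wb") as out_file:
        completed = subprocess.run(argv, stdout=out_file, check=False)
    return completed.returncode


# Hypothetical usage (assumes sed is on PATH and teste.txt exists):
#   run_to_file(["sed", 's/"//g', "teste.txt"], "out.txt")
```

Each list element reaches the program as one argument, and the
redirection is done by handing over an already opened file rather than
by writing ">" on a command line.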

Now some details:

1.
Single-quotes have NO special meaning AT ALL in cmd.exe.
There's no point using them. In fact, they will just cause more
problems, as they are passed as-is to the sed program, and sed will
complain that a single-quote is not a recognizable sed command:

  c:\Users\gordon\Desktop> sed-4.7-64bit.exe '
  sed-4.7-64bit.exe: -e expression #1, char 1: unknown command: `''

2.
Double-quotes DO NOT behave like you expect if you are
familiar with unix-style shell quoting.

In unix-world, quotes (both single and double) act in pairs (opening
and closing), and we can speak about "text inside single/double quotes" and
"text outside quotes".

A side-effect of this approach is that strings with unbalanced quotes
result in a parsing error. E.g. the following is not a complete/valid
command in unix:

    echo hello"world
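
Python's standard shlex module follows unix-style quoting rules closely
enough to illustrate this. It is not a real shell, so take the following
as a rough sketch only; the helper name try_split is made up for
illustration.

```python
import shlex


def try_split(command_line):
    """Split command_line using POSIX-style quoting rules.

    Returns the list of words, or the error message as a string if the
    quotes are unbalanced.
    """
    try:
        return shlex.split(command_line)
    except ValueError as error:
        return str(error)


print(try_split('echo "hello world"'))  # balanced quotes: splits into two words
print(try_split('echo hello"world'))    # unbalanced quote: an error message
```

The first call yields the words echo and hello world; the second cannot
be split because the quote that was opened is never closed.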

3.
In Windows' CMD.EXE, a double-quote character (ONE character) toggles an
internal parsing state which controls whether special characters keep
their special meaning or are taken literally (surprising/unintuitive if
you're coming from the unix world). A SPACE character is the most common
example of a special character.

For example, in CMD.EXE the following is a valid command:

   echo foo > hello" world.txt

And it will create a file named HELLO<SPACE>WORLD<DOT>TXT .

The above string is parsed like so:
   1. 'hello' - as is
   2. double-quote - turns on "special character handling" state.
   3. space character - kept (not ignored) because of the new state.
   4. 'world.txt' - as is.

Another example: the following two commands are both valid in CMD.EXE.
In the second command, once a double-quote character is encountered,
the PIPE character loses its special meaning (and is just consumed as
part of the string):

   c:\Users\gordon\Desktop\a>echo "hello world" | more
   "hello world"

   c:\Users\gordon\Desktop\a>echo "hello | more
   "hello | more
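
The toggling described above can be approximated in a few lines of
Python. This is a deliberately simplified model and NOT how CMD.EXE is
actually implemented: it ignores the caret (^) escape, percent-variable
expansion and everything else CMD.EXE does. The function name
active_specials is made up for illustration. It only reports which
special characters are seen while the quote state is off.

```python
def active_specials(command_line, specials="<>|&"):
    """Return the special characters found while the quote state is off.

    Simplified model: every double-quote character flips the state, and
    a backslash has no effect on the quote that follows it.
    """
    in_quotes = False
    active = []
    for char in command_line:
        if char == '"':
            in_quotes = not in_quotes  # one quote character toggles the state
        elif char in specials and not in_quotes:
            active.append(char)
    return active


print(active_specials('echo "hello world" | more'))           # ['|']
print(active_specials('echo "hello | more'))                  # []
print(active_specials('sed "s/\\"//g" teste.txt > out.txt'))  # []
print(active_specials("sed 's/x//g' teste.txt > out.txt"))    # ['>']
```

Under this model, the third command contains three double-quote
characters, so the state is still "on" when the > is reached and it is
not treated as a redirection. That is consistent with sed receiving >
and out.txt as file names in the original report.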

4.
For more strange cases, try the following:

   c:\Users\gordon\Desktop\c> echo foo > hello.txt
   c:\Users\gordon\Desktop\c> echo foo > "hello world.txt"
   c:\Users\gordon\Desktop\c> echo foo > hello" w o r l d.txt
   c:\Users\gordon\Desktop\c> echo foo > hello"   world.txt
   c:\Users\gordon\Desktop\c> echo foo > 'hello world.txt'

   c:\Users\gordon\Desktop\c> dir
    Volume in drive C is OS
    Volume Serial Number is 4CA3-CC48

    Directory of c:\Users\gordon\Desktop\c

   06/01/2019  12:50 PM    <DIR>          .
   06/01/2019  12:50 PM    <DIR>          ..
   06/01/2019  12:50 PM                17 'hello
   06/01/2019  12:50 PM                 6 hello   world.txt
   06/01/2019  12:50 PM                 6 hello w o r l d.txt
   06/01/2019  12:49 PM                 6 hello world.txt
   06/01/2019  12:49 PM                 6 hello.txt
                  5 File(s)             41 bytes
                  2 Dir(s)  384,140,804,096 bytes free

5.
To go even deeper into the nitty-gritty of CMD.EXE parsing and quoting,
see this interesting blog post:
http://www.windowsinspired.com/understanding-the-command-line-string-and-arguments-received-by-a-windows-program/

As such, I'm closing this as "not a bug", but discussion can continue
by replying to this thread.

regards,
- assaf

P.S.
A newer version of SED (version 4.7) was released in December 2018,
and it contains a few minor fixes/changes to behaviour on windows.
See here on how to build and/or download the binaries:
https://lists.gnu.org/archive/html/sed-devel/2018-12/msg00031.html

 

