Re: tab as sort's field-separator
From: Bob Proulx
Subject: Re: tab as sort's field-separator
Date: Mon, 17 Jun 2002 17:33:20 -0600
User-agent: Mutt/1.3.28i
> > Meanwhile, try this:
> >
> > sort -t"\t" -u -k2,2 -k1,1 /tmp/x
> BTW, your suggestion of using "\t" didn't work.
That worked for me with some versions of bash but not with others.
It looks like the behavior was changed at some point to be standards
conforming.
I decided to RTFM a little. Please read along with me and see what we
can learn. The bash manual says this:
There are three quoting mechanisms: the escape character,
single quotes, and double quotes.
A non-quoted backslash (\) is the escape character. It
preserves the literal value of the next character that
follows, with the exception of <newline>. If a \<newline>
pair appears, and the backslash is not itself quoted, the
\<newline> is treated as a line continuation (that is, it
is removed from the input stream and effectively ignored).
Enclosing characters in single quotes preserves the literal
value of each character within the quotes. A single quote
may not occur between single quotes, even when preceded by
a backslash.
Enclosing characters in double quotes preserves the literal
value of all characters within the quotes, with the
exception of $, `, and \. The characters $ and ` retain
their special meaning within double quotes. The backslash
retains its special meaning only when followed by one of
the following characters: $, `, ", \, or <newline>. A
double quote may be quoted within double quotes by preceding
it with a backslash.
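So the bash manual specifically says that "\t" is nothing special,
since t is not one of the characters listed after the backslash. Inside
double quotes the backslash is kept, and the word is the two characters
\ and t rather than a tab. The SUSv2 specification backs this up as well.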
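Here is a quick way to see that for yourself. It is only a sketch; the
variable name dq is mine and not anything from the manual.

```shell
# "\t" inside double quotes: t is not one of the characters that keep
# the backslash special, so both characters survive unchanged.
dq="\t"
echo "${#dq}"                 # number of characters stored in dq
printf '%s\n' "$dq" | od -c   # dump them one character at a time
```

A conforming shell should report a length of two, not one.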
The special parameters * and @ have special meaning when
in double quotes (see PARAMETERS below).
Aha! This is what we need.
Words of the form $'string' are treated specially. The
word expands to string, with backslash-escaped characters
replaced as specified by the ANSI C standard. Backslash
escape sequences, if present, are decoded as follows:
    \a     alert (bell)
    \b     backspace
    \e     an escape character
    \f     form feed
    \n     new line
    \r     carriage return
    \t     horizontal tab
    \v     vertical tab
    \\     backslash
    \'     single quote
    \nnn   the eight-bit character whose value is the
           octal value nnn (one to three digits)
    \xHH   the eight-bit character whose value is the
           hexadecimal value HH (one or two hex digits)
The expanded result is single-quoted, as if the dollar
sign had not been present.
A double-quoted string preceded by a dollar sign ($) will
cause the string to be translated according to the current
locale. If the current locale is C or POSIX, the dollar
sign is ignored. If the string is translated and
replaced, the replacement is double-quoted.
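To check the $'string' form in isolation, something like the following
should do. It is a sketch that assumes bash, or another shell that
implements this quoting; the variable name tab is just for illustration.

```shell
# Assumes a shell that implements $'...' (bash does).
tab=$'\t'
echo "${#tab}"                 # length of the expanded word
printf '%s\n' "$tab" | od -c   # shows \t followed by the newline from printf
```

If the shell decodes the escape, the word is a single tab character.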
Let's try what the manual suggests:
sort -t$'\t' -u -k2,2 -k1,1 /dev/null
That looks good so far. It works with bash, the HP-UX /bin/sh, and the
AIX /bin/sh, which are the shells I tested. If three different
implementations do something the same way then there must be a reason.
I perused the standards documentation for the shell here.
http://www.opengroup.org/onlinepubs/007908799/xcu/chap2.html
But unfortunately I did not find anything there that required the
$'string' form. Perhaps I missed it and someone can point me to the
relevant passages.
Bob