Re: Question on "sort" command
From: Bob Proulx
Subject: Re: Question on "sort" command
Date: Tue, 25 Feb 2003 21:07:17 -0700
User-agent: Mutt/1.3.28i
address@hidden wrote:
>
> I have some problems with the command sort :
> How can I specify the field seperator "\t" on sort command?
You are actually experiencing problems with the shell. This has been
discussed before. But gnu.org seems to have lost all of the mailing
list archives from before this year. A terrible loss if the data is
not recovered.
But if you google search for the following string and hit google's
cached copies you can read the previous thread of discussion about
this.
site:gnu.org tab as sort field-separator
But since the archive is down I will recreate the discussion here.
> I try to sort a file which have a field seperator : a tabulation.
> I wrote : cat toto.file | sort -t \t -k 2n > result.txt
> but the second field (numeric field) in the result.txt file is not sorted.
>
> I tried some others command line like these:
> cat toto.file | sort -t "\t" -k 2n > result.txt
> cat toto.file | sort -t"\t" -k 2n > result.txt
> cat toto.file | sort -t\t -k 2n > result.txt
> cat toto.file | sort -t=\t -k 2n > result.txt
> cat toto.file | sort -t="\t" -k 2n > result.txt
Use 'echo' to see what you are telling sort to do.
echo sort -t"\t"
sort -t\t
sort is not seeing a tab, it is seeing a backslash which is not the
same thing at all. Getting the tab into the string with current
shells is tricky.
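To make that concrete, here is a small sketch that captures what the shell actually passes along for a few of the forms tried above. It uses printf '%s' rather than echo, since some echo implementations expand \t themselves and would hide the problem. The variable names are only for illustration.

```shell
# Capture the argument exactly as the shell hands it to a command.
# printf '%s' prints its operand literally, with no escape processing.
unquoted=$(printf '%s' -t\t)       # unquoted backslash just escapes the t
doubled=$(printf '%s' -t"\t")      # inside double quotes \t stays backslash + t
equals=$(printf '%s' -t="\t")      # same again, with a literal = in front

printf '%s\n' "$unquoted"   # prints: -tt
printf '%s\n' "$doubled"    # prints: -t\t
printf '%s\n' "$equals"     # prints: -t=\t
```

In none of these cases does sort receive a tab character.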
Here is what the bash manual says:
There are three quoting mechanisms: the escape character,
single quotes, and double quotes.
A non-quoted backslash (\) is the escape character. It
preserves the literal value of the next character that
follows, with the exception of <newline>. If a \<newline>
pair appears, and the backslash is not itself quoted, the
\<newline> is treated as a line continuation (that is, it
is removed from the input stream and effectively ignored).
Enclosing characters in single quotes preserves the literal
value of each character within the quotes. A single quote
may not occur between single quotes, even when preceded by
a backslash.
Enclosing characters in double quotes preserves the literal
value of all characters within the quotes, with the
exception of $, `, and \. The characters $ and ` retain
their special meaning within double quotes. The backslash
retains its special meaning only when followed by one of
the following characters: $, `, ", \, or <newline>. A
double quote may be quoted within double quotes by
preceding it with a backslash.
The bash manual specifically says that "\t" is nothing special since
it is not one of the listed sequences. This was backed up by the
SUSv2 specification as well.
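The double-quote rule can be checked directly by counting characters. This is only a sketch, and the variable names are made up for the illustration: a backslash in front of a listed character is removed, while a backslash in front of anything else, such as t, is kept.

```shell
# Inside double quotes, backslash is special only before $ ` " \ and newline.
kept=$(printf '%s' "\t")      # t is not in the list: backslash and t both remain
dollar=$(printf '%s' "\$")    # $ is in the list: backslash removed, $ remains
slash=$(printf '%s' "\\")     # \ is in the list: a single backslash remains

printf '%s\n' "${#kept}"      # prints: 2
printf '%s\n' "${#dollar}"    # prints: 1
printf '%s\n' "${#slash}"     # prints: 1
```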
Previous suggestions go like this. If you want a maximally portable
solution, use awk. Paul Eggert suggested this:
tab=`awk 'BEGIN {print "\t"; exit}'`
sort -t"$tab"
But I am okay with cutting loose machines with operating systems prior
to 1992 when printf first appeared. Therefore I use printf for a
slightly simpler solution.
tab=$(printf "\t")
sort -t"$tab"
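Putting it together with the original command, a worked example might look like the following. The three data lines are invented sample input standing in for toto.file, so treat this as a sketch rather than a transcript.

```shell
# Put a real tab character into a variable.
tab=$(printf '\t')

# Hypothetical sample data: a name, a tab, then a number.
data=$(printf 'c\t10\na\t9\nb\t100\n')

# Sort numerically on the second tab-separated field.
sorted=$(printf '%s\n' "$data" | sort -t"$tab" -k 2n)

printf '%s\n' "$sorted"
# Lines come out in the order 9, 10, 100:
#   a<TAB>9
#   c<TAB>10
#   b<TAB>100
```

With a real file, the same thing is just sort -t"$tab" -k 2n toto.file > result.txt, with no need for cat.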
Or I suppose you could combine them into a one-liner. But people
reading your script later will hurt you for it.
sort -t"$(printf "\t")"
Paul Eggert wrote on this subject:
> It is POSIX standard and it is fairly safe nowadays, but it won't work
> on older hosts. I believe the "printf" command was first standardized
> by XPG4 (dated 1992), and many older hosts do not have it. In
> contrast, the solution with Awk should work all the way back to Unix
> Version 7 (dated 1978). But if you're using $(...) instead of `...`
> then I guess you're not worried about older hosts anyway....
>
> On Solaris 9 "printf" is part of the SUNWloc package, and this package
> is occasionally not installed on some bare-bones Solaris hosts; e.g. see
> <http://groups.google.com/groups?selm=38F5B805.E1026EBC%40ks.sel.alcatel.de>.
> In contrast, "awk" is in SUNWesu, which is almost always installed.
Hope that helps...
Bob