bug-textutils

join (2.0.1+) doesn't work properly


From: Christian Ohr
Subject: join (2.0.1+) doesn't work properly
Date: Wed, 12 Dec 2001 17:01:49 +0100

Hi,

recently I needed to merge some really big text files (several million
lines altogether, the result file was >1GB), joining on the first column
(which is an MD5 base64 hash key) and printing both unique lines and
concatenated joined lines. The command looked like: 

join -a 1 -a 2 -t \t file1 file2 > file3
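
A minimal sketch of how such an invocation might be set up in a POSIX
shell. The sample data, the temporary directory and the LC_ALL=C setting
are assumptions for illustration, not part of the original report. join
expects both inputs to be sorted on the join field, and an unquoted \t
typed literally is reduced to a plain t by the shell, so the sketch passes
a real tab character instead.

```shell
#!/bin/sh
# Sketch only: sample data, paths and LC_ALL=C are assumptions.
export LC_ALL=C              # same byte-wise collation for sort and join
tab=$(printf '\t')           # a literal tab character

dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical tab-separated inputs, keyed on the first column.
printf 'k2\tb\nk1\ta\n' > file1
printf 'k3\td\nk2\tc\n' > file2

# Sort each input in place on the first field only.
sort -t "$tab" -k 1,1 -o file1 file1
sort -t "$tab" -k 1,1 -o file2 file2

# Print joined lines plus the unpairable lines from both files.
join -a 1 -a 2 -t "$tab" file1 file2 > file3
```

With these inputs, file3 holds one line per key: k1 and k3 unpaired, k2
joined.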

Since I had about 20 or more files to merge, I reduced their number by
successive pairwise joins until there was only one result file left
(would be nice to have something like multi-way merging available
here...).
However, joining only worked correctly with textutils 2.0a; later
versions (e.g. 2.0.13, 2.0.16) left some duplicate keys. Running
'diff' or 'wc -l' on the sorted and then de-duplicated keys showed a
difference of about 100 lines...
I can reproduce this; however, I haven't identified what goes wrong, nor
have I looked into the sources ... maybe I can have a closer look at this
someday...
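
The pairwise reduction and the duplicate-key check described above can be
sketched as follows. This is an illustration under stated assumptions, not
the commands actually used: the in1..in3 sample files, the temporary
directory and LC_ALL=C are all hypothetical. Each input is folded into an
accumulator with join, and the keys of the result are then checked for
repeats with uniq -d.

```shell
#!/bin/sh
# Sketch only: sample data, file names and LC_ALL=C are assumptions.
export LC_ALL=C
tab=$(printf '\t')
work=$(mktemp -d)

# Hypothetical inputs, keyed on the first column.
printf 'k1\ta\nk2\tb\n' > "$work/in1"
printf 'k3\td\nk2\tc\n' > "$work/in2"
printf 'k4\tf\nk1\te\n' > "$work/in3"

# Fold every input into one accumulator file, one pairwise join at a time.
: > "$work/acc"
for f in "$work"/in*; do
    sort -t "$tab" -k 1,1 "$f" > "$work/sorted"
    join -a 1 -a 2 -t "$tab" "$work/acc" "$work/sorted" > "$work/next"
    mv "$work/next" "$work/acc"
done

# A correct full outer join has exactly one line per distinct key,
# so any key reported by uniq -d points at a problem.
dups=$(cut -f 1 "$work/acc" | sort | uniq -d | wc -l)
keys=$(cat "$work"/in* | cut -f 1 | sort -u | wc -l)
lines=$(wc -l < "$work/acc")
echo "result lines: $((lines)), distinct keys: $((keys)), duplicate keys: $((dups))"
```

If the result line count and the distinct key count disagree, or the
duplicate count is non-zero, the join has left keys unmerged.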

regards
Christian


