bug-coreutils
bug#20287: comm does not imply uniq


From: anti plex
Subject: bug#20287: comm does not imply uniq
Date: Thu, 9 Apr 2015 19:21:54 +0200

Hi there,

This is not meant as a bug report in the classic sense but rather as a minor suggestion for improving the manpage.

Although I have been a Linux user for some years, I only just rediscovered comm while comparing two files of md5 hashes. Some entries (lines) occurred twice in one file while being present zero times or once in the other, which led to confusing results in combination with e.g. '-23', as I expected the output to be exclusive to file1.
The repeated use of the word 'unique' in the manpage ('Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.') further led me to think in this direction.

Thinking further about the behaviour of comm in light of the manpage's "Compare sorted files FILE1 and FILE2 line by line.", one could of course deduce that surplus duplicate lines will be treated as differences. Still, an additional hint, similar to the note that comm expects each file to be sorted, would have saved me some headaches ;)

Therefore I'd like to suggest adding a short hint such as 'Note that comm does not check for repeated lines in either file, so consider some form of uniq-ification if you expect the entries in each output column to be exclusive'.
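
A minimal sketch of the behaviour described above, and of the suggested workaround. The file names and contents are made up for illustration, and LC_ALL=C is set only so that the sort order is predictable:

```shell
# Illustration only: file1 contains 'aaa' twice, file2 contains it once.
export LC_ALL=C
dir=$(mktemp -d)
printf 'aaa\naaa\nbbb\n' > "$dir/file1"
printf 'aaa\nccc\n' > "$dir/file2"

# Plain comm pairs lines one to one, so the surplus 'aaa' in file1
# is reported in column one although 'aaa' also occurs in file2.
plain=$(comm -23 "$dir/file1" "$dir/file2")
printf '%s\n' "$plain"
# aaa
# bbb

# Collapsing repeated lines first makes column one exclusive to file1.
uniq "$dir/file1" > "$dir/file1.uniq"
uniq "$dir/file2" > "$dir/file2.uniq"
deduped=$(comm -23 "$dir/file1.uniq" "$dir/file2.uniq")
printf '%s\n' "$deduped"
# bbb

rm -rf "$dir"
```

Since comm already requires sorted input, plain uniq is enough here; for unsorted data, 'sort -u' would sort and de-duplicate in one step.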

Thanks for providing such a wonderful toolset,
regards, antiplex

