bug-gzip

bug#48680: "zgrep -q" failing with some large files


From: David Yoder
Subject: bug#48680: "zgrep -q" failing with some large files
Date: Wed, 26 May 2021 18:02:25 +0000

I've run into a problem with zgrep -q. On some large bzip2-compressed files it 
returns a failure status for a search that should have succeeded. bzgrep and 
"bzip2 -cd | grep -q" both work as expected.

Either "-q" or "-l" is required to reproduce the problem. I suspect that grep 
exits at the first match and the decompressor then gets SIGPIPE when it writes 
to the closed pipe. But I don't know why the behavior differs between zgrep and 
"bzip2 -cd <file> | grep -q".
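
The suspected mechanism can be observed without any compressed file. The 
following is a minimal sketch, assuming "yes fox" as a stand-in for the 
decompressor (it is not part of the original report). It records the writer's 
exit status separately, because the status of a plain pipeline is only that of 
its last command.

```shell
#!/bin/sh
# Sketch: what happens to a writer when the reader exits at the first match.
# "yes fox" is an assumed stand-in for "bzip2 -cd <file>".
status_file=$(mktemp)

# Save the writer's exit status in a file; the pipeline's own status
# only reflects the last command (grep).
( yes fox; echo "$?" > "$status_file" ) | grep -q fox
pipeline_status=$?
writer_status=$(cat "$status_file")
rm -f "$status_file"

echo "pipeline status: $pipeline_status"
# Typically 141 (128 + 13) when SIGPIPE has its default disposition.
echo "writer status:   $writer_status"
```

If the writer's status is non-zero while the pipeline's status is 0, that would 
be consistent with "bzip2 -cd <file> | grep -q" succeeding even though the 
decompressor itself was interrupted.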

The compressed file must exceed some minimum size to show the problem. The 
attached file is the shortest one with which I could reproduce it.
It is 1024 lines of the following text, compressed with bzip2:
The quick brown fox jumped over the lazy dog. The quick brown fox jumped over 
the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox 
jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The 
quick brown fox jumped over the lazy dog. The quick brown fox jumped over the 
lazy dog. The quick brown fox jumped over the lazy dog.
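
For anyone without the attachment, a file of the same shape can probably be 
rebuilt along these lines. This is a sketch under assumptions: that each line 
is the sentence repeated 8 times with single spaces (the mail wrapped the line, 
so the exact layout is a guess), and that bzip2 is installed. Whether the 
rebuilt file triggers the failure has not been verified.

```shell
#!/bin/sh
# Sketch: rebuild a file shaped like the attachment (layout is assumed).
sentence='The quick brown fox jumped over the lazy dog.'

# Build one line: the sentence repeated 8 times, separated by spaces.
line=$sentence
count=1
while [ "$count" -lt 8 ]; do
  line="$line $sentence"
  count=$((count + 1))
done

# Write 1024 identical lines.
n=0
while [ "$n" -lt 1024 ]; do
  printf '%s\n' "$line"
  n=$((n + 1))
done > synthetic.log

# Compress, keeping the plain copy (-k) and overwriting old output (-f).
if command -v bzip2 >/dev/null 2>&1; then
  bzip2 -k -f synthetic.log      # writes synthetic.log.bz2
fi
```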

Here is a failing zgrep invocation that should have returned true (0) but 
instead returns false (141):
>zgrep -q fox synthetic.log.bz2 ; echo $?
141
>zgrep --version
zgrep (gzip) 1.10
Copyright (C) 2010-2018 Free Software Foundation, Inc.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License https://www.gnu.org/licenses/gpl.html.
There is NO WARRANTY, to the extent permitted by law.

Written by Jean-loup Gailly.


The failure also occurs when zgrep is run from find, which reports the signal 
(13, SIGPIPE) explicitly:
>find . -maxdepth 1 -name synthetic.log.bz2 -exec zgrep -q fox {} \; -print
find: 'zgrep' terminated by signal 13
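
The exit status 141 and find's "signal 13" describe the same event: shells 
report a command killed by signal N as status 128 + N. A small sketch of that 
decoding, with the value 141 taken from the transcript above:

```shell
#!/bin/sh
# Decode a shell exit status above 128 into the signal behind it.
status=141                         # value reported by the failing zgrep run
if [ "$status" -gt 128 ]; then
  signal_number=$((status - 128))
  # "kill -l N" prints the name of signal N without the SIG prefix.
  signal_name=$(kill -l "$signal_number")
  echo "status $status = 128 + $signal_number (SIG$signal_name)"
fi
```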


Here are two examples showing expected behavior. Both bzgrep and "bzip2 -cd | 
grep -q" return true (0).
>bzgrep -q fox synthetic.log.bz2 ; echo $?
0
>bzip2 -cd synthetic.log.bz2 | grep -q fox; echo $?
0

Also, the identical file either uncompressed or compressed with gzip works as 
expected with zgrep:
>zgrep -q fox synthetic.log; echo $?
0
>zgrep -q fox synthetic.log.gz; echo $?
0
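
As a stopgap, the manual pipeline that works can be wrapped in a small shell 
function. This is only a sketch: the name quiet_grep is made up for this 
example, and choosing the decompressor by file suffix is an assumption rather 
than how zgrep detects the format.

```shell
#!/bin/sh
# Hypothetical helper (not part of gzip): quiet search in a plain,
# gzip-compressed, or bzip2-compressed file, chosen by file suffix.
quiet_grep() {
  pattern=$1
  file=$2
  case $file in
    *.bz2) bzip2 -cd -- "$file" ;;
    *.gz)  gzip -cd -- "$file" ;;
    *)     cat -- "$file" ;;
  esac | grep -q -e "$pattern"
  # The function returns the pipeline's status, which is grep's status,
  # so an early exit of grep does not surface as a failure.
}
```

For example, "quiet_grep fox synthetic.log.bz2; echo $?" would be expected to 
print grep's own result.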

zgrep seems to use a more complicated version of the "bzip2 -cd <file> | grep" 
pipeline, which works as expected on its own. So perhaps zgrep's more elaborate 
pipe handling is involved. If so, perhaps the shell I'm using matters:
>bash --version
GNU bash, version 4.3.48(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
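
To illustrate how more elaborate pipe handling could turn a harmless SIGPIPE 
into a failure, here is a hypothetical wrapper that also inspects the 
producer's exit status. It is not zgrep's actual code; the name strict_search 
and its logic are assumptions made up for this illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, NOT taken from zgrep: run a producer command,
# search its output quietly, and fail if either side fails.
strict_search() {
  pattern=$1
  shift
  producer_status=$(
    exec 3>&1                      # fd 3 carries the producer's status out
    { "$@" 3>&-; echo "$?" >&3; } | grep -q -e "$pattern" 3>&-
  )
  grep_status=$?                   # status of the pipeline, i.e. of grep
  # A producer interrupted by SIGPIPE now overrides grep's success.
  if [ "$grep_status" -eq 0 ] && [ "$producer_status" -ne 0 ]; then
    return "$producer_status"
  fi
  return "$grep_status"
}

# A short producer finishes before grep exits, so the search succeeds.
strict_search fox printf 'fox\n' && echo "short input: ok"
# An endless producer is cut off at the first match and is reported as failed.
strict_search fox yes fox || echo "endless input: failed with status $?"
```

A wrapper of this kind would succeed on small inputs and fail only when the 
producer is still writing after the first match, which resembles the size 
dependence described above.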

Attachment: synthetic.log.bz2
Description: synthetic.log.bz2

