coreutils

Re: Feature request: testline(tl) (RFC)


From: Pádraig Brady
Subject: Re: Feature request: testline(tl) (RFC)
Date: Tue, 09 Dec 2014 22:38:51 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 09/12/14 22:20, V.Krishn wrote:
> 
> Hi,
> 
> I was reading about Bloom filters
> and came upon this example:
> 
> http://troydhanson.github.io/misc/bloom.html
> ------
> The bf test program
> 
> The program bf.c implements a Bloom filter. It can be used like,
> 
> ./bf -n 16 members.txt test.txt
> 
> Where the lines of members.txt are the true set members and the lines of 
> test.txt will be tested for membership. Varying n shows how the error rate 
> increases with smaller values of n.
> ------
> 
> Source: https://github.com/troydhanson/misc
> code: 
> https://raw.githubusercontent.com/troydhanson/misc/master/compression/bloom/bf.c
> 
> REQUEST:
> I am wondering whether a simple tool for testing lines could be added to coreutils.
> Features:
> 1. report whether any lines are missing (with an option to print them)
> 2. option to print found lines
> 3. option to print missing lines
> 4. ...more logic possible...
> 
> -------------
> At present I can achieve the same with a simple shell script that calls grep on
> each line, or with `comm`.
> But I believe that a method using a Bloom filter should be faster and result in a
> unique and useful tool.
> 
> Please ignore this, or point me to it, if a similar utility already exists.
> 

Maybe we should keep the existing interfaces of grep, uniq, comm etc.
and use a Bloom filter _internally_ if appropriate.

cheers,
Pádraig.
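
For readers unfamiliar with the idea, here is a minimal Python sketch of a Bloom filter used only as an internal pre-check in front of an exact lookup, so results stay exact while the interface stays unchanged. It is not code from coreutils or from bf.c. The names `BloomFilter`, `split_lines` and `n_bits_log2` are hypothetical, and treating n as the base-2 logarithm of the bit-array size is an assumption about what `-n 16` means.

```python
import hashlib


class BloomFilter:
    """Bit array of 2**n_bits_log2 bits with k probes per item (illustrative)."""

    def __init__(self, n_bits_log2: int, k: int = 3):
        self.size = 1 << n_bits_log2          # assumption: n is log2 of the size
        self.k = k
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.k):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(item))


def split_lines(members, tests, n_bits_log2=16):
    """Return (found, missing) test lines, in input order."""
    exact = set(members)
    bf = BloomFilter(n_bits_log2)
    for m in exact:
        bf.add(m)
    found, missing = [], []
    for line in tests:
        # A filter "no" is final; a "maybe" is confirmed against the exact
        # set, so false positives never reach the output.
        if bf.might_contain(line) and line in exact:
            found.append(line)
        else:
            missing.append(line)
    return found, missing
```

Smaller values of `n_bits_log2` make the filter answer "maybe" more often, which only costs extra exact lookups here. In Python the set lookup is already hash-based, so the pre-check illustrates the structure rather than a speed-up; any real benefit would depend on the exact lookup being expensive.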


