bug-textutils

Diff to fix wc's definition "word" to match other GNU tools.


From: John W. Millaway
Subject: Diff to fix wc's definition "word" to match other GNU tools.
Date: Tue, 17 Apr 2001 14:16:35 -0700 (PDT)

Hello,

Below is a diff that changes wc's concept of a "word" to match the definition
used by egrep(1), regex(7), regcomp(3), perl, and other tools. Specifically,
a "word" is a sequence of alphanumeric or underscore characters.

The wc tool, however, treats every character other than \n, \r, \t, \f, \v,
and <space> as a word character. For example, the text
"The-rain+in(Spain)falls*mainly{on}the......plain" is only one word according
to `wc'. Under the definition used by the tools mentioned above, it contains
nine words.
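To make the difference concrete, here is a minimal sketch that counts words
under both definitions. The function names (is_wc_word_char,
is_regex_word_char, count_words) are made up for illustration and do not
appear in wc.c; the whitespace set is assumed to be the one listed above.

```c
#include <ctype.h>

/* Hypothetical predicate: wc's current rule, where anything that is
   not one of the listed whitespace characters belongs to a word. */
static int is_wc_word_char(int c)
{
    return !(c == '\n' || c == '\r' || c == '\f'
             || c == '\t' || c == '\v' || c == ' ');
}

/* Hypothetical predicate: the regex-style rule, where only
   alphanumerics and underscore belong to a word. */
static int is_regex_word_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Count maximal runs of characters accepted by is_word_char. */
static long count_words(const char *s, int (*is_word_char)(int))
{
    long words = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        /* Convert through unsigned char so isalnum never sees a
           negative value. */
        int c = (unsigned char) *s;

        if (is_word_char(c)) {
            if (!in_word) {
                in_word = 1;
                words++;
            }
        } else {
            in_word = 0;
        }
    }
    return words;
}
```

Calling count_words on the example text with is_wc_word_char gives 1, and
with is_regex_word_char gives 9.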

Good news! The fix is ridiculously simple, only a few lines. Here is the
output of cvs diff on the file "textutils-XXX/src/wc.c":

diff -r1.1.1.1 -r1.2
223c223,224
<             switch (*p++)
---
>             int c = *p++;
>             switch (c)
250c251,254
<                 in_word = 1;
---
>                 if (isalnum ((unsigned char) c) || c == '_')
>                   in_word = 1;
>                 else
>                   goto word_separator;
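
For readers without the wc.c source at hand, the following is a rough,
self-contained sketch of how the scanning loop might look with the change
applied. It is not the actual wc.c code: the struct counts type, the
scan_buffer function, and the single-buffer handling are assumptions made
for illustration. Only the switch on c, the in_word flag, and the
word_separator label come from the diff above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result holder; not taken from wc.c. */
struct counts {
    unsigned long lines;
    unsigned long words;
};

/* Sketch of the patched loop over a single buffer. A word is a
   maximal run of alphanumeric or underscore characters. */
static struct counts scan_buffer(const char *buf, size_t len)
{
    struct counts n = { 0, 0 };
    const char *p = buf;
    const char *end = buf + len;
    int in_word = 0;

    while (p < end) {
        /* Go through unsigned char so isalnum gets a valid argument. */
        int c = (unsigned char) *p++;

        switch (c) {
        case '\n':
            n.lines++;
            /* Fall through. */
        case '\r':
        case '\f':
        case '\t':
        case '\v':
        case ' ':
        word_separator:
            /* Leaving a word: count it once. */
            if (in_word) {
                in_word = 0;
                n.words++;
            }
            break;
        default:
            if (isalnum(c) || c == '_')
                in_word = 1;
            else
                goto word_separator;  /* punctuation now ends a word */
            break;
        }
    }

    /* A word that runs to the end of the buffer still counts. */
    if (in_word)
        n.words++;
    return n;
}
```

The goto keeps the change small: punctuation simply reuses the existing
separator branch instead of duplicating its bookkeeping.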

-John Millaway

