Diff to fix wc's definition "word" to match other GNU tools.
From: John W. Millaway
Subject: Diff to fix wc's definition "word" to match other GNU tools.
Date: Tue, 17 Apr 2001 14:16:35 -0700 (PDT)
Hello,
Below is a diff that changes wc's concept of a "word" to match the definition
used by egrep(1), regex(7), regcomp(3), perl, and other tools. Specifically,
a "word" is a sequence of alphanumeric or underscore characters.
The wc tool, however, considers anything other than \n, \r, \t, \f, \v, and
<space> to be a valid word character. For example, the text
"The-rain+in(Spain)falls*mainly{on}the......plain" is only one word according
to `wc'. Under the definition used by the tools mentioned above, it contains
nine words.
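To make the difference concrete, here is a minimal sketch of the two counting
rules as standalone C functions. It is not the wc source; the names
count_words_ws and count_words_alnum are hypothetical, and isspace() in the
"C" locale is assumed to stand in for wc's set of separator characters.

```c
#include <ctype.h>
#include <stddef.h>

/* Whitespace-delimited rule: every run of non-whitespace bytes is one word.
   Assumption: isspace() in the "C" locale approximates wc's separators. */
size_t count_words_ws(const char *s)
{
    size_t words = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char) *s;  /* avoid negative values in ctype calls */
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}

/* Regex-style rule: a word is a run of alphanumeric or underscore characters;
   every other byte acts as a separator. */
size_t count_words_alnum(const char *s)
{
    size_t words = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char) *s;
        if (isalnum(c) || c == '_') {
            if (!in_word) {
                in_word = 1;
                words++;
            }
        } else {
            in_word = 0;
        }
    }
    return words;
}
```

Called on the example string above, count_words_ws returns 1 and
count_words_alnum returns 9.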
Good news! The fix is ridiculously simple, only a few lines! Here is the
output of cvs diff on the file "textutils-XXX/src/wc.c":
diff -r1.1.1.1 -r1.2
223c223,224
< switch (*p++)
---
> int c = *p++;
> switch (c)
250c251,254
< in_word = 1;
---
> if( isalnum((c)) || c == '_' )
> in_word = 1;
> else
> goto word_separator;
-John Millaway