Re: Codespell report for "gawk" (on fossies.org)
From: Fossies Administrator
Subject: Re: Codespell report for "gawk" (on fossies.org)
Date: Wed, 15 Jan 2020 13:39:06 +0100 (CET)
User-agent: Alpine 2.21 (LSU 202 2017-01-01)
Hi Wolfgang,
This is a remarkable achievement, an important step towards product quality.
My observations from a quick look:
One might not care much about the spelling in ChangeLog files but the .texi
and the .hlp files are part of the UI and deserve primary attention.
eles else gawk-5.0.1/awk is (perhaps) a typo, but not as indicated in the
report. "eles" is a somewhat unlucky abbreviation for "elements" ("elems"
would be better).
Yes, I also stumbled over the word "eles". I found the suggested "else"
erroneous, but since I had no real idea what the correct word would be,
I left it unchanged. Thanks for your interesting explanation.
The finding regarding "ther" is also interesting: "ther is a 3
combinations" indicates a major error, not just a typo. It's risky to
use the findings for an automated correction, especially where the
length of the bad word is less than six characters or where there is
more than one potential replacement.
Exactly. Variable and class names, URLs, mail addresses and words that
directly "touch" some kinds of delimiters or escape characters may be
especially problematic. Some spelling corrections may be ambiguous, some
words may be deliberately misspelled, and some corrections may have
unwanted side effects. A vague idea was to make the spelling errors
selectable and then to create multiple (or even one big) unidiff file(s),
but that isn't trivial to implement and is probably not usable by, e.g.,
projects using GitHub.
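
To illustrate the caution discussed above, here is a minimal Python sketch of how a single selected finding could be turned into a unidiff. It is not what Fossies or codespell actually does: the function names, the `min_len` parameter and the two-line `demo.c` content are assumptions made up for this example. It only auto-fixes words of at least six characters with exactly one suggested replacement, and it replaces the word only where it is not glued to identifier characters or hyphens.

```python
import difflib
import re


def is_safe_to_autofix(word, suggestions, min_len=6):
    """Heuristic from the discussion: skip short words and ambiguous fixes."""
    return len(word) >= min_len and len(suggestions) == 1


def fix_line(line, word, replacement):
    """Replace `word` only when it stands alone, i.e. is not directly
    attached to letters, digits, underscores or hyphens."""
    pattern = re.compile(r"(?<![\w-])" + re.escape(word) + r"(?![\w-])")
    return pattern.sub(replacement, line)


def make_unidiff(path, old_lines, lineno, word, replacement):
    """Return a unified diff that corrects `word` on the 1-based line `lineno`."""
    new_lines = list(old_lines)
    new_lines[lineno - 1] = fix_line(old_lines[lineno - 1], word, replacement)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="a/" + path, tofile="b/" + path
    )
    return "".join(diff)


# Hypothetical file content, invented for this example (not real gawk source).
source = [
    "/* returns 1 if succesful */\n",
    "int succesful_count;\n",
]

if is_safe_to_autofix("succesful", ["successful"]):
    print(make_unidiff("demo.c", source, 1, "succesful", "successful"), end="")
```

With these rules, "eles" (four characters) and any word with several candidate replacements would be left for manual review, and an identifier such as `succesful_count` is not touched by `fix_line`.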
On the other hand, it's a trivial exercise to convert the relevant
section of fossies.org/linux/misc/gawk-5.0.1.tar.xz/codespell.html into
an awk script that applies all the corrections to the source code base,
and it would be well worth the effort.
The word "list" in the phrase "Below the sortable list of all 249 found
occurrences of the 147 spelling error types ..." is a link to a corresponding
list of the errors in the original test format; it looks, e.g., like
gawk-5.0.1/awk.h:260: eles ==> else
gawk-5.0.1/field.c:1552: succesful ==> successful
gawk-5.0.1/io.c:907: non-existant ==> non-existent
...
That file may be helpful for corresponding correction scripts.
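
As a rough illustration of how a correction script might read that list, the following Python sketch parses lines of the form shown above (`path:line: word ==> replacement`). The `Finding` type and the `parse_finding` function are names invented for this example, and the handling of several comma-separated replacements on one line is an assumption about the format, not something taken from the report itself.

```python
import re
from typing import List, NamedTuple, Optional


class Finding(NamedTuple):
    path: str
    line: int
    word: str
    suggestions: List[str]


# path:line: misspelled ==> replacement[, replacement ...]
_PATTERN = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):\s+(?P<word>\S+)\s+==>\s+(?P<fix>.+)$"
)


def parse_finding(text: str) -> Optional[Finding]:
    """Parse one report line; return None for lines that do not match."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    # Assumption: multiple candidates are separated by commas.
    suggestions = [s.strip() for s in match.group("fix").split(",") if s.strip()]
    return Finding(
        match.group("path"), int(match.group("line")), match.group("word"), suggestions
    )


report = """\
gawk-5.0.1/awk.h:260: eles ==> else
gawk-5.0.1/field.c:1552: succesful ==> successful
gawk-5.0.1/io.c:907: non-existant ==> non-existent
"""

findings = [f for f in map(parse_finding, report.splitlines()) if f is not None]
for f in findings:
    print(f.path, f.line, f.word, "->", ", ".join(f.suggestions))
```

Keeping the file name and line number per finding makes it possible to review or select entries individually before anything is changed in the sources.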
I don't see the reason for exempting "leapyear".
Hmm, I found that word often used as a variable name, and I have the
impression that in missing_d/mktime.c (lines 177 and 387) it is also
not a typo?
Good work!
Wolfgang
PS: It's interesting that the authors of Fossies overlook (not "oversee")
errors in their language: "Here the 36 top most of a total..." should read
"Here are the 36 topmost of a total..." ;-)
Ah, that's because the author isn't a native but a pretty bad English
speaker (https://dict.leo.org/ and Google, among others, are his helpers).
I'm therefore very grateful for such hints, thanks! It's now corrected.
Regards
Jens
On Wed, 15 Jan 2020 at 03:03, Fossies Administrator <address@hidden> wrote:
Hi "gawk"-team,
the FOSS server fossies.org - also supporting "gawk" - offers a new
feature "Source code misspelling reports":
https://fossies.org/features.html#codespell
Such reports are normally only generated on request, but as Fossies
administrator I have just created (for testing purposes) an analysis for
the current "gawk" release 5.0.1:
https://fossies.org/linux/misc/gawk/codespell.html
That version-independent (not linked) URL should always redirect to the
latest report (if available), so currently to
https://fossies.org/linux/misc/gawk-5.0.1.tar.xz/codespell.html
Although some obviously wrong matches ("false positives") have already
been filtered out (ignored) after a first review, please inform me if you
find more of them so that I can force a new improved check if applicable.
Just for information there are also two supplemental pages
https://fossies.org/linux/misc/gawk/codespell_conf.html
showing some used "codespell" configurations and
https://fossies.org/linux/misc/gawk/codespell_fps.html
showing all resulting obvious "false positives".
Although many of the found errors are in ChangeLog files or a "test"
directory, I hope that the report can nevertheless be a little bit helpful.
If appropriate, additional reports for the current or future development
versions could be created (but in a special "test" folder that isn't
integrated in the Fossies standard services and should, at least in
principle, also not be accessible by search engines).
Regards
Jens
--
FOSSIES - The Fresh Open Source Software archive
mainly for Internet, Engineering and Science
https://fossies.org/