bug-gawk
Re: Codespell report for "gawk" (on fossies.org)


From: Fossies Administrator
Subject: Re: Codespell report for "gawk" (on fossies.org)
Date: Wed, 15 Jan 2020 13:39:06 +0100 (CET)
User-agent: Alpine 2.21 (LSU 202 2017-01-01)

Hi Wolfgang,

This is a remarkable achievement, an important step towards product quality. My observations from a quick look:

One might not care much about the spelling in ChangeLog files, but the .texi and the .hlp files are part of the UI and deserve primary attention.

eles ==> else (gawk-5.0.1/awk) is (perhaps) a typo, but not as indicated in the report. "eles" is a somewhat unlucky abbreviation for "elements" ("elems" would be better).

Yes, I also stumbled over the word "eles" and found the suggested "else" erroneous, but since I had no real idea what the correct word would be, I left it unchanged. Thanks for your interesting explanation.

The finding regarding "ther" is also interesting: "ther is a 3 combinations" indicates a major error, not just a typo. It's risky to use the findings for an automated correction, especially where the length of the bad word is less than six characters or where there is more than one potential replacement.

Exactly. Variable and class names, URLs, mail addresses and words that directly "touch" certain delimiters or escape characters may be especially problematic. In addition, some spelling corrections may be ambiguous, some words may be deliberately misspelled and some corrections may have unwanted side effects. A vague idea was to make the spelling errors selectable and then to create multiple unidiff files (or even one big one), but that isn't trivial to implement and is probably not usable by, for example, projects using GitHub.
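A minimal sketch of the rule of thumb discussed above: treat a finding as a candidate for automated correction only if the misspelled word has at least six characters and there is exactly one suggested replacement. The function names, the comma-separated form of multiple suggestions and the candidate list used for "ther" are assumptions made for illustration, not part of the Fossies report.

```python
def split_candidates(suggestion):
    """Split a suggestion string into individual replacement candidates.

    Assumption: multiple candidates are separated by commas.
    """
    return [c.strip() for c in suggestion.split(",") if c.strip()]


def is_safe_to_autofix(bad_word, candidates, min_length=6):
    """Return True only for long words with a single unambiguous replacement."""
    return len(bad_word) >= min_length and len(candidates) == 1


if __name__ == "__main__":
    # The candidate lists below are illustrative only.
    examples = [
        ("succesful", "successful"),
        ("eles", "else"),
        ("ther", "there, their, other"),
    ]
    for bad, suggestion in examples:
        verdict = is_safe_to_autofix(bad, split_candidates(suggestion))
        print(f"{bad}: {'auto-fix candidate' if verdict else 'needs manual review'}")
```

Anything rejected by such a filter would still need a manual look; the filter only narrows down what a script might touch.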

On the other hand, it's a trivial exercise to convert the relevant section of fossies.org/linux/misc/gawk-5.0.1.tar.xz/codespell.html into an awk script that applies the corrections to all occurrences in the source code base, and it would be well worth the effort.

The word "list" in the phrase "Below the sortable list of all 249 found occurrences of the 147 spelling error types ..." is a link to a corresponding list of the errors in the original test format; it looks, for example, like this:

 gawk-5.0.1/awk.h:260: eles  ==> else
 gawk-5.0.1/field.c:1552: succesful  ==> successful
 gawk-5.0.1/io.c:907: non-existant  ==> non-existent
 ...

That file may be helpful for corresponding correction scripts.
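As a rough sketch of how such a correction script might read that list: the code below parses lines of the shape shown above (path, line number, misspelled word, "==>", suggestion) and replaces the misspelled word in a given source line only where it stands as a whole word. It is written in Python rather than awk purely for illustration; the function names and the whole-word rule are assumptions, and nothing here is taken from the Fossies tooling.

```python
import re

# Assumed line shape, derived from the sample lines above:
#   <path>:<line number>: <misspelled word>  ==> <suggestion>
FINDING = re.compile(r"^([^:]+):(\d+):\s+(\S+)\s+==>\s+(.+)$")


def parse_findings(text):
    """Return (path, line_number, bad_word, suggestion) tuples; skip other lines."""
    findings = []
    for raw in text.splitlines():
        match = FINDING.match(raw.strip())
        if match:
            path, number, bad, good = match.groups()
            findings.append((path, int(number), bad, good))
    return findings


def fix_line(source_line, bad, good):
    """Replace `bad` with `good` only where it is not part of a longer word."""
    pattern = r"(?<![\w-])" + re.escape(bad) + r"(?![\w-])"
    # A function is used as replacement so `good` is inserted literally.
    return re.sub(pattern, lambda _match: good, source_line)


if __name__ == "__main__":
    sample = (
        " gawk-5.0.1/field.c:1552: succesful  ==> successful\n"
        " gawk-5.0.1/io.c:907: non-existant  ==> non-existent\n"
    )
    for path, number, bad, good in parse_findings(sample):
        print(path, number, bad, "->", good)
```

A real script would additionally have to open each file, check that the reported line still contains the word, and leave ambiguous findings such as "eles" to a human.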

I don't see the reason for exempting "leapyear".

Hmm, I found that word often used as a variable name, and I have the impression that in missing_d/mktime.c (lines 177 and 387) it is also not a typo?

Good work!
Wolfgang

PS: It's interesting that the authors of Fossies overlook (not "oversee") errors in their language: "Here the 36 top most of a total..." should read "Here are the 36 topmost of a total..." ;-)

Ah, that is because the author isn't a native but a pretty bad English speaker (https://dict.leo.org/ and Google, among others, are his helpers).

Therefore I'm very grateful for such hints, thanks! It's now corrected.

Regards

Jens

On Wed, 15 Jan 2020 at 03:03, Fossies Administrator <address@hidden> wrote:
      Hi "gawk"-team,

      the FOSS server fossies.org - also supporting "gawk" - offers a new
      feature "Source code misspelling reports":

        https://fossies.org/features.html#codespell

      Such reports are normally only generated on request, but as Fossies
      administrator I have just created (for testing purposes) an analysis for
      the current "gawk" release 5.0.1:

        https://fossies.org/linux/misc/gawk/codespell.html

      That version-independent (not linked) URL should redirect always to the
      last report (if available), so currently to

        https://fossies.org/linux/misc/gawk-5.0.1.tar.xz/codespell.html

      Although after a first review some obviously wrong matches ("false
      positives") are already filtered out (ignored) please inform me if you
      find more of them so that I can force a new improved check if applicable.

      Just for information there are also two supplemental pages

        https://fossies.org/linux/misc/gawk/codespell_conf.html

      showing some used "codespell" configurations and

        https://fossies.org/linux/misc/gawk/codespell_fps.html

      showing all resulting obvious "false positives".

      Although many of the found errors are in ChangeLog files or a "test"
      directory I hope that the report can be nevertheless a little bit helpful.

      If appropriate, additional reports for the current or future development
      versions could be created (but in a special "test" folder that isn't
      integrated in the Fossies standard services and should - at least
      principally - also not be accessible by search engines).

      Regards

      Jens

      --
      FOSSIES - The Fresh Open Source Software archive
      mainly for Internet, Engineering and Science
      https://fossies.org/

