bug-standards

Re: Circumstances in which ChangeLog format is no longer useful


From: Joseph Myers
Subject: Re: Circumstances in which ChangeLog format is no longer useful
Date: Mon, 31 Jul 2017 12:55:25 +0000
User-agent: Alpine 2.20 (DEB 67 2015-01-07)

On Fri, 28 Jul 2017, Alfred M. Szmidt wrote:

>    >    1. The package has a public version control system.
>    > 
>    >    (Rationale: this ensures people can see what changed, just as with
>    >    ChangeLogs, but can see *exactly* what changed rather than just the
>    >    brief descriptions.)
>    > 
>    > I think that rationale is incorrect, just because you have a public
>    > version control system does not mean that you can see what actually
>    > changed.  Going through multiple megabytes of diffs is not feasible,
>    > and searching for when something was renamed, added, removed, etc is
>    > something no tool is capable of providing.
> 
>    That's a function of a busy project and is the same whether you look at 
>    commit logs, diffs or ChangeLog messages.
> 
> That doesn't address how you go through a diff to see when a function
> was added/removed/renamed/moved/..., neither diff nor annotate do
> those things -- nor can they in the general case.  Think non-C

Check for appropriate regular expressions in the diffs.  In the simple 
sorts of cases that the ChangeLog format can readily describe, such 
regular expressions will also work reasonably reliably.  In more 
complicated cases, they may not, but the more complicated cases also 
aren't well-described in terms of individual named entities as required in 
ChangeLog format.
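
As a rough illustration of the regular-expression approach, here is a minimal Python sketch that scans a unified diff for lines adding or removing what looks like a definition of a given function.  It assumes GNU-style C formatting, where the function name in a definition starts in column one; the helper name find_definition_changes and the sample diff are hypothetical, and a real search would need to cope with other languages and layouts.

```python
import re

def find_definition_changes(diff_text, name):
    """Return (added, removed): diff lines that look like a definition of
    `name`, assuming the name starts in column one (GNU C style)."""
    # Heuristic only: the identifier at the start of the source line,
    # followed by optional whitespace and an opening parenthesis.
    pattern = re.compile(r"^" + re.escape(name) + r"\s*\(")
    added, removed = [], []
    for line in diff_text.splitlines():
        # Skip the file header lines of the unified diff.
        if line.startswith("+++") or line.startswith("---"):
            continue
        if not line or line[0] not in "+-":
            continue  # context lines and hunk headers
        content = line[1:].rstrip()
        if pattern.match(content):
            (added if line[0] == "+" else removed).append(content)
    return added, removed

# Hypothetical diff in which a function is renamed.
SAMPLE_DIFF = """\
--- a/foo.c
+++ b/foo.c
@@ -1,5 +1,5 @@
 int
-old_name (int x)
+new_name (int x)
 {
   return x + 1;
 }
"""

print(find_definition_changes(SAMPLE_DIFF, "old_name"))
print(find_definition_changes(SAMPLE_DIFF, "new_name"))
```

The same kind of pattern can also be handed to the VCS directly; for example, "git log -G" takes a regular expression and lists commits whose diffs contain matching added or removed lines.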

> language, weird configuration files, etc.  And they are still not

Non-C languages are also a case that often doesn't work well with 
ChangeLog format - consider e.g. a C++ member function where identifying 
it for the ChangeLog requires a long fully-qualified name, complete with 
argument types to identify which overload is being modified.

> available in binary packages or tarball releases.

Binary releases likely don't include the ChangeLogs anyway.  E.g. Ubuntu 
ships only the NEWS file with binary releases, not the ChangeLogs; 
likewise a CentOS system I have to hand.

I think of tarball releases as just being one output of the development 
process.  They are an essential output - to define immutably the contents 
of a particular version number, in the same way for everyone, so everyone 
can get that version and reproducibly get the same sources, do a 
reproducible build and get the same binaries - but I don't think we should 
expect much visibility into the development process from them.  
Understanding the development process requires many other sources of 
information, such as the version control history, the mailing list 
archives, the issue tracker, ....

>    I wouldn't object to shipping the version control history in tarballs, if 
>    necessary to stop having to write in the ChangeLog format (or having 
>    tarballs with and without the version control history).  
> 
> That would balloon the tarball so much that it would be unacceptable,
> emacs has a .git directory that is around 1.8G on my machine, glibc
> would be 400M with .git and all source files.

A freshly packed glibc checkout (git remote prune origin; git reflog 
expire --expire=now --all; git gc --prune=all --aggressive) takes about 
130 MB for the .git directory, and I end up with a .tar.xz of about 135 
MB.  Of course that includes all branches, and with just the history in 
the ancestry of the release in question it would be a bit smaller.  And 
the -with-history.tar.xz could be separate from the normal version, since 
it's just a backup of the VCS data in case the VCS data is otherwise 
somehow lost (which is a lot less likely with a distributed VCS than a 
centralized one).

>    But I believe that people wanting to look at the history are going
>    to check out the repository rather than attempting to get it from
>    tarballs.
> 
> I can only speak from my own experience, but I always peruse the
> ChangeLog file first.  Only when a project is badly maintained do I go
> for the VCS.

I look at the VCS first, at least when a distributed VCS is in use so I 
don't need to wait for the VCS server for each log inspection.  I used to 
look at ChangeLogs, but that's now a pretty old-fashioned, GNU-specific 
approach.

> I am having a hard time taking the "completely useless", "waste of
> time" argument seriously when the ChangeLog files (be it in VCS, or a
> file) are in fact used to do exactly that: to understand how code moves
> in a project.  If they were so useless and wasteful then most GNU
> projects would have abandoned them many many years ago, and yet we
> still use them very actively, even in gcc and glibc being just two
> examples.

They're created in GCC and glibc because the GNU Coding Standards require 
them, not because (in the context of having VCS history, mailing list 
archives, issue trackers, etc. to provide a much richer understanding of 
the development history relevant to any particular issue) they are useful 
for something not covered by those other sources of information.

> This is what I would have done, you have only 8 specific changes
> touching multiple files, there is no need to repeat them several times
> and one could even reduce this a bit further by merging the file lines
> into one.  You can even skip the "Likewise." part completely.

I don't think this is any better.  If anything it obscures the essential 
nature of the change, which is very much "take each relevant file, apply 
this class of fixes to it", where links between identifiers of the same 
name in different files are entirely incidental.

> Writing accurate, and descriptive ChangeLogs is just like writing
> accurate documentation, both can be wrong, but so can code.  This
> falls onto the maintainer to see that all things are good.  Just using
> VCS won't solve that.

The VCS history is automatically a completely accurate record of the 
history of the code in a way that the ChangeLogs aren't.

>    The point of the VCS is to be able to undo changes.  ChangeLog files, and 
>    the form of change description therein, are in no way a substitute for the 
>    VCS, and are essentially obsoleted by it.
> 
> The ChangeLog is for human consumption, to understand how the code was
> changed, VCS does not solve this, not everything is prettily managed

That understanding effectively requires the tools such as VCS, mailing 
lists, issue trackers etc. that give a rich structure to the history 
information.

For users, we have the NEWS file.  For people looking at how things 
changed at the development level, a natural process is: look at the VCS 
logs (describing changes logically rather than physically), then 
potentially delve into diffs, list archives, issue trackers, etc. for 
individual changes that seem of interest.

> Why do you think that the ChangeLog can't mention the above?  Or maybe

The problem isn't that it can't mention the logical nature of the change.  
The problem is that given the logical description, in the VCS log, and 
given the diffs themselves, in the VCS history, and given the mailing list 
archives, bug tracker, etc. that also form part of the development 
process, writing a second description of the change, decomposed into 
descriptions at the level of individual named entities in individual 
files, has net negative utility; any benefit where someone is interested 
in that very specific level of information (for a change that doesn't map 
well onto that level of information) is outweighed by the extra work 
involved in writing that description of extremely niche use, by the time 
it takes away from substantive development and writing descriptions at the 
logical level, and by putting off free software developers because of the 
need to jump through this hoop not needed for non-GNU projects.

> better yet, since this is a bug fix in a BUGS file or similar.

glibc has automatically-generated lists of fixed bugs in each release 
created from Bugzilla data just before the release and inserted in the 
NEWS file.

> Neither of those are available in tarballs, nor might they be

As I said, I think tarballs are just one output of the development 
process.  It's not appropriate to expect that the rich data of the 
interactions involved on mailing lists, issue trackers, patch review 
systems, version control, etc. can be adequately serialized into or 
understood through a ChangeLog; a proper understanding of the development 
process requires using those other tools.

> available in distribution packages where putting up a copy of
> ChangeLog can be very useful as well even if you do not have access to

In practice distribution packages likely do not include ChangeLogs.

> source code.  The NEWS file does not describe code changes, only
> user-visible changes, so it is not very useful if you are in fact
> looking for a problem in a new release.

The expectation is to look at VCS logs and use tools such as "git bisect".  
(Which could be used to look for e.g. when a function moved, if desired, 
not just to test properties of the compiled code.)
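
To make the bisection idea concrete, here is a minimal Python sketch of the binary search that underlies it, applied to the "when did this function move" question.  This is not git bisect itself: the revision names, the location table and the helper first_revision_where are hypothetical, and the sketch assumes the property being tested is false for an initial run of revisions and true from some revision onwards.

```python
def first_revision_where(revisions, predicate):
    """Return the first revision for which predicate is true, or None.

    Assumes revisions are in chronological order and that predicate is
    false for every revision before some point and true from then on.
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(revisions[mid]):
            hi = mid        # the change happened at mid or earlier
        else:
            lo = mid + 1    # the change happened after mid
    return revisions[lo] if lo < len(revisions) else None

# Hypothetical history: the file defining a function at each revision.
revisions = ["r1", "r2", "r3", "r4", "r5", "r6"]
location = {"r1": "a.c", "r2": "a.c", "r3": "a.c",
            "r4": "b.c", "r5": "b.c", "r6": "b.c"}

moved = first_revision_where(revisions, lambda r: location[r] != "a.c")
print(moved)
```

With a real repository, the predicate would be a script that checks out each candidate revision and greps for the definition; "git bisect run" accepts such a script and uses its exit status to drive the search.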

-- 
Joseph S. Myers


