Re: Circumstances in which ChangeLog format is no longer useful
From: Joseph Myers
Subject: Re: Circumstances in which ChangeLog format is no longer useful
Date: Fri, 28 Jul 2017 23:47:19 +0000
User-agent: Alpine 2.20 (DEB 67 2015-01-07)
On Fri, 28 Jul 2017, Alfred M. Szmidt wrote:
> 1. The package has a public version control system.
>
> (Rationale: this ensures people can see what changed, just as with
> ChangeLogs, but can see *exactly* what changed rather than just the
> brief descriptions.)
>
> I think that rationale is incorrect, just because you have a public
> version control system does not mean that you can see what actually
> changed. Going through multiple megabytes of diffs is not feasible,
> and searching for when something was renamed, added, removed, etc is
> something no tool is capable of providing.
That's a function of a busy project and is the same whether you look at
commit logs, diffs or ChangeLog messages.
> 2. The version control uses a distributed version control system.
>
> (Rationale: this ensures people can get a complete copy of the
> history of what changed, as they can with ChangeLog files in
> releases.)
>
> How would the information that is normally available in a ChangeLog
> file be populated if all that information is in the VCS? That would
> still be needed for normal tarballs and the like when VCS is out the
> window.
I wouldn't object to shipping the version control history in tarballs, if
necessary to stop having to write in the ChangeLog format (or having
tarballs with and without the version control history). Or straight
copies of the version control logs, as long as no-one actually has to
manually write the list of files, named entities within those files and
what is changed in each named entity (instead just having human-written
logs describing what changed at the logical level). But I believe that
people wanting to look at the history are going to check out the
repository rather than attempting to get it from tarballs.
> 3. Commits are made for each logical change, not batched into a
> commit per release or per day or other such batching.
>
> (Rationale: this ensures as much separation of logically separate
> changes as there would be in a ChangeLog file.)
>
> This is I think a good idea, the bunching of ChangeLog entries always
> felt a bit weird.
I'd actually like points 1 and 3 (public VCS with logical commits) to be
required for all GNU packages, but that's independent of my present point.
> 5. Commit messages describe the logical "what" changed (but don't
> necessarily describe the physical "what" at the level of changes to
> individual files and functions).
>
> (Rationale: the logical "what" is useful information at the human
> level for understanding the change. Listing individual changed
> files and functions both duplicates the information available from
> the version control system, and is at the wrong level for
> understanding the change for most purposes.
>
> I am not sure I understand why it is wrong, to be able to understand
> how something came to be one needs to look at how things changed --
> and only way to do that is with a ChangeLog entry.
The only way to do that reliably is with the version control history.
Which is what people expect to use to look at how something came to be as
it is - they expect to check out a repository, not to look at a tarball
for that information.
> It's normal in glibc, for example, for a change to affect many
> separate files and named entities in those files, in ways that are
> repetitive but not repetitive enough to use e.g. "All callers
> changed", and which the ChangeLog format does not provide a good
> fit to or result in useful information about the changes not
> available from version control.)
>
> Knowing that all callers have been changed is I think useful
> information, why do you think the opposite?
The point is that the changes are mechanical, but not in a way that
corresponds to "all callers changed", and that listing all the named
entities changed and how they changed is error-prone, time-consuming
(possibly taking longer than writing the patch itself) and results in a
ChangeLog entry that is completely useless for people wanting to
understand the change (who will want the description at the logical level,
and if they want the exact details for each named entity, will find the
version control history more useful).
Here's a representative example of a ChangeLog entry I wrote recently. I
think that given the logical description (summary line plus two
paragraphs) in the commit message, and given the commit history for anyone
interested in the exact details of how particular files or entities were
changed, writing this ChangeLog entry was a complete waste of time and it
provides nothing useful for anyone using or developing glibc. And this
sort of mostly-mechanical change, with many files and entities therein
changed in similar but not identical ways, is very common when working on
glibc; I've written a great many such ChangeLog entries, some much longer
than this one with hundreds of named entities enumerated as changed. And,
similarly, for GCC changes. Spending the time to write several paragraphs
at the human level about the content and purpose of a change is
worthwhile. Spending the time to duplicate, badly, the information in the
diff itself about changed files and entities therein is just an extra
unnecessary hoop to jump through when making a change.
2017-06-01 Joseph Myers <address@hidden>
[BZ #21457]
* sysdeps/arm/sys/ucontext.h (NGREG): Rename to __NGREG and define
NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(mcontext_t): Use __ctx in defining fields.
* sysdeps/i386/sys/ucontext.h (NGREG): Rename to __NGREG and
define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(__ctxt): Likewise.
(fpregset_t): Use __ctx and __ctxt in defining fields.
(mcontext_t): Likewise.
* sysdeps/m68k/sys/ucontext.h (NGREG): Rename to __NGREG and
define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(mcontext_t): Use __ctx in defining fields.
* sysdeps/mips/sys/ucontext.h (NGREG): Rename to __NGREG and
define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(fpregset_t): Use __ctx in defining fields.
(mcontext_t): Likewise.
* sysdeps/unix/sysv/linux/alpha/sys/ucontext.h (NGREG): Rename to
__NGREG and define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
[__USE_MISC].
(fpregset_t): Define using __NFPREG.
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (NGREG): Rename to
__NGREG and define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(fpregset_t): Use __ctx in defining fields.
(mcontext_t): Likewise.
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (NGREG): Rename to
__NGREG and define NGREG to __NGREG if [__USE_MISC].
(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
[__USE_MISC].
(gregset_t): Define using __NGREG.
(__ctx): New macro.
(fpregset_t): Use __ctx in defining fields.
(mcontext_t): Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (__ctx): New macro.
(mcontext_t): Use __ctx in defining fields.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (__ctx): New
macro.
[__WORDSIZE == 32] (NGREG): Rename to __NGREG and define NGREG to
__NGREG if [__USE_MISC].
[__WORDSIZE == 32] (gregset_t): Define using __NGREG.
[__WORDSIZE == 32] (fpregset_t): Use __ctx in defining fields.
(mcontext_t): Likewise.
[__WORDSIZE != 32] (NGREG): Rename to __NGREG and define NGREG to
__NGREG if [__USE_MISC].
[__WORDSIZE != 32] (NFPREG): Rename to __NFPREG and define NFPREG
to __NFPREG if [__USE_MISC].
[__WORDSIZE != 32] (NVRREG): Rename to __NVRREG and define NVRREG
to __NVRREG if [__USE_MISC].
[__WORDSIZE != 32] (gregset_t): Define using __NGREG.
[__WORDSIZE != 32] (fpregset_t): Define using __NFPREG.
[__WORDSIZE != 32] (vscr_t): Use __ctx in defining fields.
[__WORDSIZE != 32] (vrregset_t): Likewise.
[__WORDSIZE != 32] (mcontext_t): Likewise.
* sysdeps/unix/sysv/linux/s390/sys/ucontext.h (__ctx): New macro.
(__psw_t): Use __ctx in defining fields.
(NGREG): Rename to __NGREG and define NGREG to __NGREG if
[__USE_MISC].
(gregset_t): Define using __NGREG.
(fpreg_t): Use __ctx in defining fields.
(fpregset_t): Likewise.
(mcontext_t): Likewise.
* sysdeps/unix/sysv/linux/sh/sys/ucontext.h (NGREG): Rename to
__NGREG and define NGREG to __NGREG if [__USE_MISC].
(gregset_t): Define using __NGREG.
(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
[__USE_MISC].
(fpregset_t): Define using __NFPREG.
(__ctx): New macro.
(mcontext_t): Use __ctx in defining fields.
* sysdeps/unix/sysv/linux/x86/sys/ucontext.h (__ctx): New macro.
[__x86_64__] (NGREG): Rename to __NGREG and define NGREG to
__NGREG if [__USE_MISC].
[__x86_64__] (gregset_t): Define using __NGREG.
[__x86_64__] (struct _libc_fpxreg): Use __ctx in defining fields.
[__x86_64__] (struct _libc_fpstate): Likewise.
[__x86_64__] (mcontext_t): Likewise.
[!__x86_64__] (NGREG): Rename to __NGREG and define NGREG to
__NGREG if [__USE_MISC].
[!__x86_64__] (gregset_t): Define using __NGREG.
[!__x86_64__] (struct _libc_fpreg): Use __ctx in defining fields.
[!__x86_64__] (struct _libc_fpstate): Likewise.
[!__x86_64__] (mcontext_t): Likewise.
> Being able to generate the ChangeLog file is I think important for
> posterity, tarball releases lack any kind of history. History has a
> really bad memory, just because one uses a VCS today doesn't mean that
> this will be available in 10, 20, 30 years in any usable format, or it
> might vanish completely.
Well, you could add a requirement not to switch away from a distributed
VCS or to switch to a different VCS without converting history. And
indeed one to have the repository present or mirrored on GNU servers, if
desired (or to have release tarball versions that include the VCS history,
etc.).
> Keep a change log to describe all the changes made to program
> source files. The purpose of this is so that people
> investigating bugs in the future will know about the changes that
> might have introduced the bug. Often a new bug can be found by
> looking at what was recently changed. More importantly, change
> logs can help you eliminate conceptual inconsistencies between
> different parts of a program, by giving you a history of how the
> conflicting concepts arose and who they came from.
>
> All this information is available in version control.
>
> If you put ChangeLog entries in the commit message, then yes this
> information will be available. But if you discard ChangeLog entries
> completely, I do not see how it can be available. "annotate", "diff"
> don't provide a human readable and searchable means to go through
> history. The information also becomes totally lost as soon as you
> discard the VCS (i.e. when doing releases).
You know about the changes much more reliably from the VCS than from
ChangeLog entries, given that people may forget to write the ChangeLog
entry, or may miss out a file or function's changes from it, or may commit
with a ChangeLog from a previous version of the patch that doesn't
correspond accurately to the committed patch version (given the make-work
nature of writing most ChangeLog entries, and given they are something not
generally used outside the GNU project, updating them is often something
people don't think of doing - again, watching for badly updated ChangeLog
entries in patch review is both necessary at present, and essentially a
waste of time). You can use e.g. "git log -p --stat" and search for file
or function names (function names mentioned automatically on the @@ line
of diff context are going to be at least as accurate as those in ChangeLog
entries, given they are probably what people use when writing their
ChangeLog entries to identify the functions changed). The precise details
may differ, but you have much more flexibility when looking at the actual
history than something written at a very specific level (too high to
actually undo the changes, too low to readily get an overall understanding
of a complicated change) for ChangeLogs.
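
As a rough illustration of the point about the @@ line, here is a minimal sketch that collects, per file, the context text git prints after the second @@ of each hunk header in "git log -p" or "git diff" output. The helper name hunk_contexts and the SAMPLE diff are made up for this example and are not part of git or glibc; the context text is whatever git's function-name heuristic found, which is usually, but not always, the enclosing function.

```python
import re
from collections import defaultdict

# Unified-diff hunk header: "@@ -old[,count] +new[,count] @@ optional context"
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@ ?(.*)$")


def hunk_contexts(diff_text):
    """Map each changed file to the distinct hunk-header context strings.

    Hypothetical helper for illustration. It trusts "+++ " lines to name
    the new file, so an added line whose content starts with "++ " would
    confuse it; a real tool should parse the diff more carefully.
    """
    contexts = defaultdict(list)
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:]
            # git prefixes the new path with "b/" by default.
            current = path[2:] if path.startswith("b/") else path
        elif current is not None:
            match = HUNK_RE.match(line)
            if match and match.group(1) and match.group(1) not in contexts[current]:
                contexts[current].append(match.group(1))
    return dict(contexts)


# Invented sample input, shaped like "git diff" output.
SAMPLE = """\
diff --git a/counter.c b/counter.c
--- a/counter.c
+++ b/counter.c
@@ -10,5 +10,5 @@ int bump(int i)
 {
-  i += 1;
+  i++;
   return i;
 }
@@ -30,3 +30,4 @@ static void reset(void)
   total = 0;
+  calls = 0;
 }
"""

if __name__ == "__main__":
    for path, names in hunk_contexts(SAMPLE).items():
        print(path, names)
```

Feeding it the text of "git log -p" for a range of commits gives roughly the per-file list of touched functions that a ChangeLog entry would otherwise enumerate by hand.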
> Because the problem with ChangeLogs, as seen in glibc and
> elsewhere, is with needing to write descriptions in a particular
> format, at a level that is not useful for human understanding of
> the changes while not being as detailed as the exact changes
> themselves in version control, being able to generate ChangeLogs
> from version control using suitably-formatted log messages does not
> address the issue.
>
> Are we talking about the entries, or the actual ChangeLog file? Many
> projects have abandoned keeping actual ChangeLog files, and extracting
> this information when making a tarball release since they cause the
> typical merge conflicts and whatnot. If you are referring to the
> ChangeLog entries, I am not sure what problems you are referring to.
The problem that enumerating individual named entities changed consumes
the time of contributors, confuses and puts off people used to non-GNU
free software which invariably does not use this particular pre-VCS form
of describing changes, and results in long unhelpful descriptions which
don't allow you to see the wood for the trees because of the focus on a
particular low-level repetition of what the change itself is for each
individual entity, as can be seen in the VCS, rather than what the change
is as a logical whole.
> This form of description is exactly what's the problem. In the
> presence of ubiquitous distributed version control, writing this
> style of description is the equivalent of:
>
> /* Add 1 to i. */
> i++;
>
> (that is, just repeating the immediately obvious meaning of the
> history that everyone can see, and so effectively serving to hide
> what's actually interesting about the history at a human level and
> *should* be described).
>
> I don't think the comparison is fair, the point of the ChangeLog files
> is to be able to undo changes. The comment above doesn't actually
The point of the VCS is to be able to undo changes. ChangeLog files, and
the form of change description therein, are in no way a substitute for the
VCS, and are essentially obsoleted by it.
> provide anything, a more apt comparison would have been:
>
> /* Change #1 was: Add pi to i. */
> /* Change #2 was: Add 1 to i. */
> i += 2;
No, my assertion is that "Add 1 to i." is to "i++;" as the above long
ChangeLog entry is to the actual commit involved - a repetition of what
everyone can plainly see from looking at the thing described (a C
statement in the first case, a commit in the git history of glibc in the
second case), and so completely useless.
Instead of "Add 1 to i." you should describe logical blocks of code at the
logical level with things that aren't immediately repeating the obvious
semantics of the code. And, likewise, the actual commit message
Fix more namespace issues in sys/ucontext.h (bug 21457).
Continuing the fixes for namespace issues in sys/ucontext.h, this
patch moves various symbols into the implementation namespace in the
absence of __USE_MISC. As with previous changes, it is nonexhaustive,
just covering more straightforward cases.
Structure fields are generally changed to have a prefix __ in the
absence of __USE_MISC, via a macro __ctx (used without a space before
the open parenthesis, since the result is a single identifier).
Various macros such as NGREG also have leading __ added. No changes
are made to structure tags (and thus to C++ name mangling), except
that in the (unused) file sysdeps/i386/sys/ucontext.h, structures
defined inside other structures as the type for a field have their
tags removed in the non-__USE_MISC case (those structure tags would
not in any case have been visible in C++, because in C++ the scope of
such a tag is limited to the containing structure). No changes are
made to the contents of bits/sigcontext.h, or to whether it is
included. Because of remaining namespace issues, this patch does not
yet fix the bug or allow any XFAILs to be removed.
describes the logical nature of the change (including what is *not*
changed, where relevant, which ChangeLog files would never mention), at
the appropriate level for people to understand it. I think people should
be writing commit logs at that level rather than spending time duplicating
the VCS information on exactly which symbols were changed in which files.
> That information is very useful when digging for bugs, and
> understanding how a code base was changed. Just because one uses VCS
> doesn't mean that history is automatically available to everyone,
> someone still needs to write a commit message of some sort (i.e. the
> ChangeLog entry)
I.e. the sort of message above that you use to justify and explain the
change at the logical level rather than enumerating files and symbols
therein.
I'm all for proper detailed commit messages explaining both the content
and the purpose of the change at the logical level, as used by the Linux
kernel and by git itself. It's the descriptions at the per-file,
per-function level in the ChangeLog format that I consider unhelpful when
they duplicate VCS information, badly. I think GNU should be encouraging
the sort of commit messages used by the Linux kernel and git, i.e. the
sort of patch description you'd put in a mailing list message proposing
and explaining the patch, while leaving the VCS to show what files and
bits of files were changed, how, for those interested in that information.
> Sifting through multi-megabyte diffs isn't very fun when trying to get
> a bird's-eye view of what actually happened in a code base, and this is
> where ChangeLog entries are super useful and I'd argue totally
> necessary for any code base.
I don't think so. If someone wants to understand what changed between
glibc 2.25 and 2.26 in more detail than the NEWS file gives, they might
look at the above sort of description in the commit log; it will be much
more helpful to them, and give much more insight into glibc development,
than over 10000 lines of ChangeLog entries enumerating files and symbols.
If they want to see the files changed, git log --stat. If they want to
see deeper into particular changes, git log -p --stat and look at
whichever changes are of interest.
--
Joseph S. Myers
address@hidden