info-cvs

Re: Developer branches


From: Paul Sander
Subject: Re: Developer branches
Date: Sat, 2 Feb 2002 11:52:39 -0800

>--- Forwarded mail from address@hidden

>Thanks for the reply.  I performed the search you mentioned and found the
>following message
>(http://ccvs.cvshome.org/servlets/ReadMsg?msgId=5846&listName=info).
>However, that message also suggests searching on "submit/assemble", which
>seems to imply I did not find the one you intended.

>[...]

>What was the gist of the "hand-off" process you were mentioning in the above
>archived message?

>--- End of forwarded message from address@hidden


There have actually been quite a number of discussions about it over the
years.  Here are a couple of messages that convey most of the substance.

+++++++++++++++

Date: Fri, 1 Nov 96 21:07:39 PST
From: address@hidden (Paul Sander)
Message-Id: <address@hidden>

>How do most CVS users control what goes into each official build?  The
>usual approach is that the developers somehow indicate which files they
>want to go into the build, and then the build engineer uses that info to
>check out the hierarchy, compile, and tag the revisions of all files
>used in that build.

The method I implemented at a previous employer used a baseline+delta
mechanism to compute a bill of materials that drove the checkout.  The
predecessor build's bill of materials was used as a starting point.
Developers provided a list of deltas (filenames plus revision numbers,
or filenames plus deletion flags, which were collected automatically by
a tool that I provided).  The order of submissions of deltas was retained.
The deltas then amended the bill of materials and the result was checked out.
If the build broke, a delta was removed from the list, the bill of materials
was recomputed, and the build was repeated.  (There are several algorithms
for selecting which delta(s) to remove in the event of a failure, so we were
able to automate the process.)

>It seems like this could be done with RCS states or symbolic names.  Is
>there a standard procedure or does everyone do it differently?  Are
>there some good models I could follow?

>Also, if you use symbolic names and do weekly builds, do you end up with
>extremely long log info?  Do you occasionally clean these up?

Tagging all of the sources can lead to long info eventually.  In the
method above, I put the bills of materials and delta lists under version
control and tagged them at the end of the build.  This simplified and sped
up a number of things:

- Compute the differences between builds and produce changelogs from the
  developers' commit comments.
- Get documentation for new features that were stored as comments with the
  deltas.
- Subsequent checkouts of previous releases sped up because the bill of
  materials could drive RCS directly, saving CVS' overhead.  (This is safe
  under Unix due to the way that RCS manipulates its files and how the Unix
  filesystem works.  It had no impact on the integrity of the repository
  even during concurrent commits.)
- It was easy to identify all of the tags for any module without scanning the
  entire repository.
- It provided an effective workaround for the module database non-versioning
  problem.

+++++++++++++++

Date: Tue, 5 Nov 96 16:01:48 PST
From: address@hidden (Paul Sander)
Message-Id: <address@hidden>

> Is the work you described below in the public domain?  If so, could you
>point me to it?

I'm afraid it's not.  The tools contained a lot of proprietary stuff that
had to do with internal security, integration with the local defect tracking
system, etc.  I doubt it would be that useful even if I could give it to
you.

However, I can discuss the details of the relevant algorithms.  Below is a
script that creates a bill of materials for a checked out work area.  It
scans the CVS (1.3) state and writes a list of entries for each file under
CVS control.  Each entry is a single line with three fields, separated by
tabs.  The entries include a path (relative to $CVSROOT) to the RCS file,
a path (relative to the root of the work area) to the working file, and
the version number.  The whole process revolves around lists like this.
By the way, the RCS file paths should be unique; this is important if you
wish to track changes made in the modules database.
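To make the three-field format concrete, here is a minimal sketch of such a scan in Python.  It is not the original script.  It assumes the work-area layout of later CVS releases (a CVS/Repository file holding the repository directory relative to $CVSROOT, and CVS/Entries lines of the form /name/revision/timestamp/options/tag), which may differ from CVS 1.3.  It also ignores the Attic.  The names scan_work_area and format_entry are made up for illustration.

```python
import os


def scan_work_area(root):
    """Yield (rcs_path, working_path, revision) for each file under CVS control.

    Assumption: CVS/Repository holds a path relative to $CVSROOT and
    CVS/Entries uses the "/name/revision/timestamp/options/tag" layout.
    """
    for dirpath, dirnames, _ in os.walk(root):
        if "CVS" not in dirnames:
            continue
        dirnames.remove("CVS")  # do not descend into the admin directory
        dirnames.sort()         # deterministic traversal order
        admin = os.path.join(dirpath, "CVS")
        with open(os.path.join(admin, "Repository")) as f:
            repo_dir = f.read().strip()
        with open(os.path.join(admin, "Entries")) as f:
            for line in f:
                fields = line.rstrip("\n").split("/")
                # File entries start with "/"; directory entries start with "D".
                if len(fields) < 3 or fields[0] != "":
                    continue
                name, revision = fields[1], fields[2]
                # "0" marks an added but uncommitted file, "-rev" a removed one.
                if revision == "0" or revision.startswith("-"):
                    continue
                working = os.path.relpath(os.path.join(dirpath, name), root)
                yield (repo_dir + "/" + name + ",v", working, revision)


def format_entry(entry):
    """Render one entry as a single tab-separated line."""
    return "\t".join(entry)
```

Writing format_entry(e) for each e in scan_work_area(".") gives one line per file, which is the kind of list the rest of the process consumes.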

The handoff process is a two-step procedure.  In the first step (called
"submit"), the developers generate a delta, which is little more than a
bill of materials for a few files.  In the second step (called "gather"),
the integrators apply the deltas to the predecessor's bill of materials
in the order of their submission.  The result is a comprehensive list of
the desired sources for the new build.

The basic algorithm for the submit step is to run a script like the one
below in a work area.  The output of the script is given a serial number,
and stored in a secure area.  Embellishments include letting the
developer list specific files to include in the delta, making
recursive descents optional, verifying that the source files are committed,
supporting commentary, recording the submission in the defect tracking
system, and so on.  If the deltas are collected in a bill of materials that is
stored in the secure area, then you can also verify that old revisions
of files never replace newer ones and support deletions.  My implementation
supported all of these.  Some other desirable features would be to allow
developers to query their submissions and to back them out before they are
gathered.  One other possible feature is to minimize the deltas, but this
is of questionable value in combination with the optional dependency
analysis in the gather step (described further down).
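As an illustration of the "old revisions never replace newer ones" check, here is a small Python sketch.  The function names are hypothetical, and the comparison is only meaningful for two revisions on the same line of development (for example, both on the trunk).  Comparing across branches needs more care than this.

```python
def revision_key(rev):
    """Turn an RCS revision such as "1.10" into a tuple of integers."""
    return tuple(int(part) for part in rev.split("."))


def is_regression(current_rev, submitted_rev):
    """True if the submitted revision is older than the one already recorded.

    Assumption: both revisions lie on the same branch.
    """
    return revision_key(submitted_rev) < revision_key(current_rev)
```

The numeric comparison matters: as plain strings, "1.9" sorts after "1.10", which would wrongly reject a legitimate submission of 1.10 over 1.9.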

The gather step reads the predecessor's bill of materials into an associative
array, keyed by RCS file paths.  It then reads each delta, replacing files
in that associative array with entries from each delta.  A file containing
flags for each delta should be maintained.  Each delta is marked as either
"integrated" or "removed".  While gathering the deltas, consult the file.
If a delta is marked as "integrated" or is unmarked, mark it "integrated".
Skip any delta marked as "removed".  The contents of the associative array
after this procedure are the new bill of materials.  If you support file
deletion, then the entry for each deleted file is removed from the associative
array at the time its delta is incorporated.  To delete or re-add a delta, edit the
delta flag file accordingly and re-run the gather step.
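The gather step described above might be sketched in Python as follows.  This is an illustration under assumptions, not the original tool: entries are (rcs_path, working_path, revision) tuples, a revision of None stands in for the deletion flag, and the delta flag file is modelled as a dictionary from delta serial number to flag.

```python
def gather(baseline, deltas, flags):
    """Compute a new bill of materials.

    baseline: list of (rcs_path, working_path, revision) entries.
    deltas:   list of (serial, entries) pairs in submission order; an
              entry whose revision is None marks a deleted file.
    flags:    dict mapping serial -> "integrated" or "removed";
              updated in place as deltas are incorporated.
    """
    # The associative array, keyed by RCS file path.
    bom = {rcs: (work, rev) for rcs, work, rev in baseline}
    for serial, entries in deltas:
        if flags.get(serial) == "removed":
            continue  # skip deltas that were backed out
        flags[serial] = "integrated"
        for rcs, work, rev in entries:
            if rev is None:
                bom.pop(rcs, None)      # file deletion
            else:
                bom[rcs] = (work, rev)  # add or replace
    return sorted((rcs, work, rev) for rcs, (work, rev) in bom.items())
```

Backing out a delta is then a matter of setting its flag to "removed" and calling gather again with the same baseline and delta list.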

After the gather step completes, scan the bill of materials and check out
each file.  The checkout can be optimized in a couple of ways.  First, if
the gather step is run more than once, compare the last two outputs and
check out only the differences.  Second, you can bypass CVS by using
RCS directly and writing your own CVS state.  (This last optimization
speeds the checkout by 50% or more if CVS is run in local mode.)

Another optional feature of the gather step is to perform a crude
dependency analysis of the deltas, and remove any delta that depends on
an earlier removed delta.  You do this by remembering the files contained
in the removed deltas.  If any subsequent delta contains any of those files,
mark it as "removed" and add all of the new delta's files to the list.  You
can also support forced inclusion by adding a new "forced" flag to the delta
flag file to indicate that a delta was included despite the dependency
analysis.  Files contained in the forced delta are removed from the list and
the gather continues.  This feature in combination with minimal deltas
may cause problems because dependencies that the developer wished to communicate
are lost, leading to more failed builds after deltas are removed.
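The crude dependency analysis can be sketched in a few lines of Python.  The function name and data shapes are assumptions for illustration: each delta is reduced to its serial number and the set of RCS paths it touches, and the flag file is again a dictionary.

```python
def propagate_removals(deltas, flags):
    """Mark deltas that touch files from an earlier removed delta.

    deltas: list of (serial, rcs_paths) pairs in submission order.
    flags:  dict mapping serial -> "integrated", "removed" or "forced";
            updated in place and returned.
    """
    tainted = set()  # files contained in removed deltas seen so far
    for serial, paths in deltas:
        paths = set(paths)
        if flags.get(serial) == "forced":
            # Forced inclusion: its files no longer taint later deltas.
            tainted -= paths
        elif flags.get(serial) == "removed" or paths & tainted:
            flags[serial] = "removed"
            tainted |= paths
    return flags
```

Running this before the gather proper means one manual removal can cascade to every later delta that shares a file with it, unless an integrator forces one of them back in.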

> If the build broke, a delta was removed from the list, the bill of materials
> was recomputed, and the build was repeated.  (There are several algorithms
> for selecting which delta(s) to remove in the event of a failure, so we were
> able to automate the process.)

Here are a couple of algorithms:

- Remove all deltas.  Loop:  Add the "next" delta, ending when there are
  no more to consider.  Rebuild.  If the build fails, remove the latest
  delta (i.e. the one just added).
- Scan the output of the build to determine the specific compilation that
  failed.  Scan the Makefile for all of its source files.  Remove the latest
  delta that contains any of them.  Repeat until the build succeeds.

Note that in the worst case, you end up with the predecessor build.
Assuming it succeeded, you always get a successful build.
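The first of these algorithms could look roughly like the following Python sketch.  The build_ok callback is hypothetical: it stands for "gather with exactly these deltas, check out, build, and report success", none of which is shown here.

```python
def select_deltas(serials, build_ok):
    """Add deltas one at a time, dropping any that break the build.

    serials:  delta serial numbers in submission order.
    build_ok: hypothetical callback taking a list of serials and
              returning True if a build from those deltas succeeds.
    """
    kept = []
    for serial in serials:
        candidate = kept + [serial]
        if build_ok(candidate):
            kept = candidate  # the build still works; keep the delta
        # otherwise the delta just added is left out
    return kept
```

This costs one build per delta, which is why the second algorithm, which reads the failure out of the build log, is attractive when builds are slow.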

> Tagging all of the sources can lead to long info eventually.  In the
> method above, I put the bills of materials and delta lists under version
> control and tagged them at the end of the build.  This simplified and sped
> up a number of things:
> 
> - Compute the differences between builds and produce changelogs from the
>   developers' commit comments.

Differences are discovered by running diff against two bills of materials.
Some of the building blocks for such a tool are on my Web site at
<URL: http://www.sander.cupertino.ca.us/source.html>.  I can give you
one of my tools that does this, but it requires a special patch to the
GNU sort program.  (This patch sorts Unix paths in such a way that all
of the files contained in any directory are listed together.  Sort's
existing implementation offers no option to do that.)
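For readers without the patched sort, the comparison itself can be approximated in Python.  This is a sketch, not the tool mentioned above: entries are assumed to be (rcs_path, working_path, revision) tuples, and sorting on the (directory, filename) pair is only a stand-in for the ordering the patch provides.

```python
import posixpath


def diff_boms(old, new):
    """Return (rcs_path, old_revision, new_revision) for every change.

    A revision of None means the file is absent from that bill of
    materials.  Unchanged files are omitted.
    """
    old_revs = {rcs: rev for rcs, _, rev in old}
    new_revs = {rcs: rev for rcs, _, rev in new}
    changes = []
    # Sort by (directory, filename) so each directory's files stay together.
    for rcs in sorted(old_revs.keys() | new_revs.keys(), key=posixpath.split):
        before, after = old_revs.get(rcs), new_revs.get(rcs)
        if before != after:
            changes.append((rcs, before, after))
    return changes
```

Each changed pair of revisions can then be handed to "cvs log" or "rlog" to collect the commit comments for a changelog.  The same list also tells a repeated gather which files actually need to be checked out again.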

> - Get documentation for new features that were stored as comments with the
>   deltas.

This is a simple matter of scanning the deltas.  I kept them under version
control along with the bill of materials.

> - It was easy to identify all of the tags for any module without scanning the
>   entire repository.

Tag the bill of materials with the same tag given to the module.  Then do a
"cvs log" of the bill of materials.

The basic algorithms are pretty simple.  Securing the delta lists can
be tricky, and some of the optimizations take two or three iterations
to get right.  But if you're serious about using tools like these, it
should be possible to produce the basic capabilities in a week or so.



