
Re: [lmi] Generating pages with tables in the new PDF generation code


From: Greg Chicares
Subject: Re: [lmi] Generating pages with tables in the new PDF generation code
Date: Sun, 27 Aug 2017 23:37:25 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 2017-08-27 17:43, Vadim Zeitlin wrote:
> On Sun, 27 Aug 2017 16:13:53 +0000 Greg Chicares <address@hidden> wrote:
[...]
> GC> > GC> I did this:
> GC> > GC> 
> GC> > GC> mkdir --parents /opt/lmi/pdf/
> GC> > GC> cd /opt/lmi/pdf/
> GC> > GC> git clone https://github.com/vadz/lmi.git --branch direct-pdf-gen --single-branch
> GC> > GC> cd lmi
> GC> > 
> GC> >  I know you're allergic to branches, but this is really, really not the
> GC> > optimal way to do this.
> ..
> GC> Whether it's optimal depends on the goal. If the goal is simply
> GC> to get this merged, then an automated merge command is optimal.
> 
>  I think it's optimal in any case. Reviewing changes after the merge is
> better than reviewing them before and automation takes care of trivial
> things (like the already mentioned example of modifying a file in master)
> allowing you to concentrate on the real changes.

I fear we're talking past each other. Perhaps we first need to
establish a definition of 'merge'.

To me, 'merge' means integration into the production system, which
is HEAD: the tip of master, which is the one and only branch.
Almost no proposed change goes directly into production, no matter
who the author may be. If I'm the author, then I make changes in a
local working copy, where they are reworked again and again until
they are ready to commit; at that point, I mv them into a separate
../stash directory, then do
  git ls-files --deleted | xargs git checkout --
to start with an unmodified copy of HEAD, and then create commits
that put the changes into production. That final "create commits"
step may sound like repetitive busywork, but it's actually a review
stage where many minor defects are removed. To me, 'merging' is the
final 'git push' that culminates this review-revise-rework process.
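In shell terms, one round of that cycle looks roughly like this. This is a sketch in a throwaway repository; the paths and file names are illustrative, not lmi's actual layout:

```shell
#!/bin/sh
# Sketch of the stash-and-recommit cycle described above, played out
# in a throwaway repository. All paths and file names are made up.
set -e
tmp=$(mktemp -d)
mkdir "$tmp/stash"
cd "$tmp"
git init -q repo
cd repo
git config user.email greg@example.com
git config user.name greg
echo 'original contents' > file.cpp
git add file.cpp
git commit -qm 'initial commit'
# Rework the file again and again in the working copy...
echo 'reworked contents' > file.cpp
# ...then set the result aside and restore a pristine HEAD:
mv file.cpp ../stash/
git ls-files --deleted | xargs git checkout --
cat file.cpp          # the committed version is back
# Finally, reapply the stashed work as reviewed, deliberate commits:
cp ../stash/file.cpp .
git commit -qam 'the reviewed change'
```

The point of the detour through ../stash is that the final commits are created against a clean HEAD, which is what makes the last step a review stage rather than a rubber stamp.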

That definition of 'merge' precludes reviewing changes after the
merge. It sounds like putting untested changes into production.
That's where I'm stuck. Our definitions of 'merge' must differ.

> GC> My goal is to conserve stability while evolving, by reviewing
> GC> and understanding each change before committing it. The actual
> GC> act of committing is trivially mechanical, and might take one
> GC> percent of the total time, so it doesn't need to be optimized.
> GC> What I do want to optimize is the thinking process, and in my
> GC> experience the best way is to break a large whole into more
> GC> tractable pieces that I can integrate one at a time with
> GC> comprehensive testing.
> 
>  For me the tractable pieces are the individual commits, but it's hardly
> feasible to replay all of them: there are 88 of them right now and there
> will be many more before the merge. Are you really going down to this level
> of granularity? In addition to being more work for you, this will mean
> reviewing the initial version of the code, not using external templates,
> which is just not there any more, and I don't see how it can be useful for you

I certainly don't want to examine a series of individual commits.
I do want to scrutinize every change that goes into production.
Often I'll want to make modifications of my own....

> (but keeping it in history can still be useful for me, and is definitely
> simpler than whitewashing history to get rid of it).

...for example, consider these two consecutive commits:
  commit 377933616a2fca2844bffdfa7a1ff0b5ada5e974
  commit e46036bb63911e7e85d63e58ef372200b8f68e2f
When I first glanced at 3779336, I quickly put it aside, hoping
that I could find a way to exclude bool based on its type rather
than rewriting the test so that type bool would seem to pass:

-    if(T(1) < maxT)
+    if(!std::is_same<bool,typename std::remove_cv<T>::type>::value)

Once I had gcc-6.x installed myself, I revisited this and
struggled with that issue. In the end, I applied your change
with 'git am' to preserve the (single-commit) history in your
working copy, and then immediately applied my revision.
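Mechanically, that looked something like the following sketch, with made-up repositories and commit contents, since only the shape of the workflow matters here:

```shell
#!/bin/sh
# Sketch: apply a contributor's commit with 'git am' (preserving its
# authorship and message), then immediately commit a local revision on
# top. Repository names and file contents are made up.
set -e
tmp=$(mktemp -d)
cd "$tmp"
# The contributor's repository, with one commit to share:
git init -q theirs
cd theirs
git config user.email vadim@example.com
git config user.name vadim
echo 'if(T(1) < maxT)' > sandbox_test.cpp
git add sandbox_test.cpp
git commit -qm 'shared base'
echo 'rewrite the comparison so bool passes' > sandbox_test.cpp
git commit -qam 'make the test pass for bool'
git format-patch -1 -o ../patches >/dev/null
cd ..
# The maintainer's repository, starting from the shared base:
git clone -q theirs mine
cd mine
git config user.email greg@example.com
git config user.name greg
git reset -q --hard HEAD~1              # back to 'shared base'
git am -q ../patches/0001-*.patch       # their commit, history intact
echo 'exclude bool based on its type instead' > sandbox_test.cpp
git commit -qam 'revise: exclude bool based on its type'
git log --format='%an %s'               # both authorships preserved
```

The result is two commits on top of the shared base: the contributor's, with authorship intact, followed by the local revision.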

We're looking for a way to scale that process up to a much
larger and more complex changeset. I don't want to erase your
history, and I don't want to give up the ability to make some
modifications to the code before it goes into production.

The greatest difficulty I foresee is a revision to existing
code that I might want to rewrite radically before accepting
it into production. Therefore, it seems best to identify such
obstacles early. That's the reason why I grabbed a branch
from your repository and checked out a throwaway working copy,
and that's why I mentioned this [quoted out of order]:

> GC> And it would take considerable
> GC> analysis for me to get comfortable with the change to 'ledger.hpp':
> 
>  The change to the .hpp file is quite simple, but you probably mean the
> replacement of ledger_xml_io.cpp with ledger_evaluator.cpp.

Sure, the number of characters changed in 'ledger.hpp' itself is tiny,
but the consequences may be profound. And class Ledger is so centrally
important that any change to it calls for intense scrutiny. The six
removed member functions look like they ought to be virtual, or at
least the first four of them. If they're actually virtual, did I just
fail to mark them as such? and is there a base class that needs to be
removed? If not, then why didn't I document the reason why they're not
virtual? I can probably answer those immediate questions easily enough,
but other questions are sure to arise. I'll want to do this:
  grep xml_root_name *.hpp
and perhaps document why that member is virtual in ten classes but
nonvirtual in five. And this amounts only to asking myself whether I've
given you a solid foundation to build on, before even looking at what
you actually propose to change.

> If so, this
> change is simpler than it looks and, again, "git diff" should help with
> reviewing it as it should realize that the file was just renamed.

Okay, I just hadn't gotten that far. It looks like 'ledger_evaluator'
serves to encyst the putrid mess that was 'ledger_xml_io.cpp', and
once I understand it better I might really like that change,
especially since make_evaluator() uses move semantics instead of
returning a pointer that could be null.

>  Perhaps it would be better to break just the final version of the changes
> into pieces? The good news is that absolutely nothing prevents you from
> doing this after letting "git merge" take care of all the trivial stuff.

At this point I think I need to ask for your definition of 'merge',
which seems to be a mere matter of git mechanics having nothing to
do with code review. Do you mean:
  (a) run some git command to add your stuff as a branch, thereby
      importing all its history; then
  (b) actually review and potentially revise files piecemeal,
      incorporating the final result into HEAD of 'master'?
I'm not exactly excited about creating a branch, because the concept
is foreign to me, but I'm not really opposed to it either, as long as
it's not going to require me to spend lots of time studying git or
punish me for my willful ignorance of git branching.
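If (a) and (b) are what you mean, the git mechanics might be as little as this. The sketch below uses local throwaway repositories standing in for your https://github.com/vadz/lmi.git, and the file names are made up:

```shell
#!/bin/sh
# Sketch of (a) fetching the contributor's branch, history and all,
# without merging it, then (b) adopting files from it piecemeal.
# Local throwaway repositories stand in for the real remote
# (https://github.com/vadz/lmi.git); file names are illustrative.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q theirs
cd theirs
git config user.email vadim@example.com
git config user.name vadim
echo 'shared code' > common.cpp
git add common.cpp
git commit -qm 'base'
git checkout -qb direct-pdf-gen
echo 'new pdf code' > pdf_page.cpp
git add pdf_page.cpp
git commit -qm 'add pdf_page.cpp'
git checkout -q -                       # leave HEAD on the base branch
cd ..
# (a) a clone imports the branch and its full history automatically:
git clone -q theirs mine
cd mine
# (b) adopt one file from the branch, without merging anything:
git checkout origin/direct-pdf-gen -- pdf_page.cpp
git status --short                      # pdf_page.cpp staged as new
```

Note that step (b) never checks out the branch itself; 'git checkout <ref> -- <file>' just copies one file into the index and working tree, where it can be revised before committing to master.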

> GC> Eradicating history is not a goal: it's an anti-goal, which we should
> GC> avoid if we can do so without undue inconvenience. For example, if you
> GC> can write a simple command that would import all of the new files at
> GC> once into lmi HEAD, preserving their history,
> 
>  This is possible, but not very simple because it would mean splitting each
> commit in my branch into two parts: one that affects the existing files and
> another that affects only the new ones. Worse, it would be quite
> counter-productive because it would break the commits' atomicity. I really
> don't think we should do such a thing.

Okay, understood.

> GC> OTOH, 64 hunks changed in 'group_quote_pdf_gen_wx.cpp', and we can't
> GC> integrate that into HEAD without careful review and testing, lest we
> GC> destabilize the production system.
> 
>  BTW, here is how I would review these changes:
> 
>       $ git log -p ..direct-pdf-gen -- group_quote_pdf_gen_wx.cpp
> 
> This command would show all the commits in the branch affecting this file
> with the changes done by each commit to this file (only), due to -p
> (--patch) option. Notice that this is not something you can do without git,
> if you just diff the 2 versions of this file, you get a diff with
> 185 insertions and 371 deletions, i.e. a huge number of changes which are
> not easy to understand. Individual commits shown by the command above are
> much simpler.

That doesn't quite work here, probably because of '--single-branch' below,
so I'll copy and paste everything I've done because it's so short:

/home/greg[0]$schroot --chroot=cross-lmi
/home/greg[0]$ls -di /
19005443 /
/home/greg[0]$cd /opt/lmi/
/opt/lmi[0]$mkdir pdf
/opt/lmi[0]$cd pdf
/opt/lmi/pdf[0]$git clone https://github.com/vadz/lmi.git --branch direct-pdf-gen --single-branch
/opt/lmi/pdf[0]$cd lmi
/opt/lmi/pdf/lmi[0]$rm *vcx* *.sln *.bkl .gitignore
/opt/lmi/pdf/lmi[0]$git log -p ..direct-pdf-gen -- group_quote_pdf_gen_wx.cpp |wc
      0       0       0

Is it obvious to you what I should change to make this work? I'd
guess that
- git clone https://github.com/vadz/lmi.git --branch direct-pdf-gen --single-branch
+ git clone https://github.com/vadz/lmi.git
should suffice, but that's just a guess.
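If my reading of git's range notation is right (an assumption worth verifying), the underlying cause is that '..direct-pdf-gen' abbreviates 'HEAD..direct-pdf-gen', and in a --single-branch clone HEAD *is* direct-pdf-gen, so the range is necessarily empty. Fetching master explicitly should work without re-cloning. Here is the behavior reproduced with throwaway local repositories:

```shell
#!/bin/sh
# Reproduce the empty '..direct-pdf-gen' range in a --single-branch
# clone, then fix it by fetching master explicitly and naming both
# endpoints of the range. Local throwaway repositories stand in for
# https://github.com/vadz/lmi.git.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master # ensure the branch is 'master'
git config user.email vadim@example.com
git config user.name vadim
echo 'base' > f.cpp
git add f.cpp
git commit -qm 'base'
git checkout -qb direct-pdf-gen
echo 'branch work' > f.cpp
git commit -qam 'branch work'
cd ..
git clone -q --branch direct-pdf-gen --single-branch upstream clone
cd clone
# HEAD is direct-pdf-gen, so 'HEAD..direct-pdf-gen' is empty:
git log --oneline ..direct-pdf-gen | wc -l                 # 0
# Fetch master into a remote-tracking ref and name both endpoints:
git fetch -q origin master:refs/remotes/origin/master
git log --oneline origin/master..direct-pdf-gen | wc -l    # 1
```

So the existing single-branch clone can be salvaged with one fetch; no second clone should be necessary.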

> GC> At any rate, the two strategies I see are:
> GC> 
> GC> (1) Let all PDF development continue to its conclusion in your
> GC> personal repository, then test that thoroughly, and finally merge
> GC> it all at once; or
> GC> 
> GC> (2) Incrementally adopt changes from your personal repository into
> GC> lmi HEAD, proceeding one step at a time until no step remains.
> GC> 
> GC> Of those two, isn't the second clearly preferable?
> 
>  Yes, the second one is preferable, but this would require more of your
> time because you might have to review changes that I later replace or undo.
> If you're ready to take the risk of this happening, let's do it like this,
> but please let me create a new branch, based on master, instead of the
> current one, to separate the unrelated changes.

Okay. Again, there are (a) new files, which cannot affect production, and
(b) changes to old files, which demand the most intense scrutiny because
the Prime Directive is to avoid destabilizing the production system.
Review of (a) is the likeliest logjam, so that's where I think I should
start, and soon. It would be most helpful if you could indicate which
files are least likely to require any later changes. For example, if the
'ledger_evaluator' changes above are unlikely to require any further
work, then I might start with them. Alternatively...

>  But I'm still not sure how exactly you would like to proceed. For example,
> you can't merge the ledger.hpp change right now because it would, of course,
> break the existing FOP-using code. You could integrate the changes to
> group_quote_pdf_gen_wx.cpp and I can make a separate branch with just them.
> But what about all the rest?

I can't propose a concrete plan without surveying the changes first.
Maybe the 'group_quote_pdf_gen_wx.cpp' changes are independent of
the rest--I just don't know, but perhaps you can provide guidance.
I'm not sure a separate branch for a subset of changes is needed:
a list of files might do as well, once we arrive at a common
definition of 'merge'.


