From: Vadim Zeitlin
Subject: Re: [lmi] an xml schema for (single|multiple)_cell_document file XML format
Date: Mon, 27 Feb 2012 16:32:22 +0100
On Mon, 27 Feb 2012 12:44:46 +0000 Greg Chicares <address@hidden> wrote:
GC> Done 20120220T0158Z, revision 5402:
GC> http://svn.savannah.nongnu.org/viewvc?view=rev&root=lmi&revision=5402
GC>
GC> That exercise was unexpectedly interesting. I started with a simple
GC> "use enclosing elements" change, essentially as described here:
GC> http://lists.nongnu.org/archive/html/lmi/2010-08/msg00015.html
GC> That made loading a file too slow: I could feel it plainly even before
GC> I measured it. The counter displayed on the statusbar paused noticeably
GC> after loading about 32 cells, then about 64, then about 128--whereas it
GC> incremented smoothly for the old file format. It turns out that knowing
GC> the size in advance lets us call std::vector::reserve() so that the
GC> initial capacity is sufficient and expensive reallocations are avoided.
It is, of course, always better to preallocate memory, but I had no idea
that reallocations could be expensive enough to be visibly noticeable.
It looks like something else might be wrong here; e.g., maybe the copy
ctor of the elements of this vector is particularly inefficient?
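To make the question concrete, here is a minimal sketch that counts how many element copies a growing vector performs with and without reserve(). The Cell type, the g_copies counter and copies_made() are hypothetical stand-ins, not lmi code; Cell declares a copy ctor and therefore has no move ctor, which I assume is close to the situation being discussed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical global counter of Cell copies (illustration only).
std::size_t g_copies = 0;

// Hypothetical stand-in for a cell. Declaring the copy ctor suppresses the
// implicit move ctor, so every reallocation copies the existing elements.
struct Cell
{
    Cell() {}
    Cell(Cell const&) { ++g_copies; }
    Cell& operator=(Cell const&) { ++g_copies; return *this; }
};

// Append n cells to an empty vector and return the number of copies made.
std::size_t copies_made(std::size_t n, bool reserve_first)
{
    g_copies = 0;
    std::vector<Cell> v;
    if(reserve_first)
        v.reserve(n); // one allocation up front, no reallocation afterwards
    Cell const prototype;
    for(std::size_t i = 0; i < n; ++i)
        v.push_back(prototype); // may reallocate when capacity is exhausted
    return g_copies;
}
```

With reserve() the count is exactly n; without it, each reallocation adds one copy per element already stored. Because capacity grows geometrically, the extra copies stay proportional to n overall, so a pause that is visible at 32, 64 and 128 cells would suggest that each individual copy is costly.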
GC> The final code in the repository is as fast and smooth as the original
GC> because it writes the enclosing elements with a size attribute, e.g.:
GC> <particular_cells size_hint="180">
GC> and reserves the hinted number of elements before reading them. That
GC> attribute is optional; omitting it affects speed, but not correctness.
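For reference, the optional-hint idea can be sketched roughly as follows. This is not the code from revision 5402: Node, Element, parse_size_hint() and load_cells() are hypothetical names, the in-memory Element is a simplification rather than any real XML library's API, and the six-digit cap on the hint is an arbitrary assumption.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified representation of a parsed element.
struct Node
{
    std::string name;
    std::string text;
};

struct Element
{
    std::map<std::string, std::string> attributes;
    std::vector<Node> children;
};

// Parse a hint; return 0 (meaning "no hint") for anything that is not a
// short run of decimal digits. The six-digit cap is an arbitrary assumption.
std::size_t parse_size_hint(std::string const& s)
{
    if(s.empty() || 6 < s.size())
        return 0;
    std::size_t n = 0;
    for(char c : s)
    {
        if(c < '0' || '9' < c)
            return 0;
        n = 10 * n + static_cast<std::size_t>(c - '0');
    }
    return n;
}

// Collect the text of every "cell" child, reserving space if a hint exists.
std::vector<std::string> load_cells(Element const& parent)
{
    std::vector<std::string> cells;
    auto const i = parent.attributes.find("size_hint");
    if(i != parent.attributes.end())
        cells.reserve(parse_size_hint(i->second)); // affects speed only
    for(Node const& child : parent.children)
        if(child.name == "cell")
            cells.push_back(child.text);
    return cells;
}
```

A missing, malformed or wrong hint changes only the reserved capacity; the returned cells are the same either way.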
FWIW I don't like this approach very much. The information about the
number of cells is already in the file, so why do we need to keep a separate
hint about it? Couldn't we just count the cells first, before processing
them?
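Roughly what I have in mind is a two-pass load like the sketch below. It assumes the child nodes are already available in memory when loading starts, so counting them is cheap; Node and load_cells_two_pass() are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified representation of a parsed child node.
struct Node
{
    std::string name;
    std::string text;
};

// Count the "cell" children first, reserve exactly that many, then load.
std::vector<std::string> load_cells_two_pass(std::vector<Node> const& children)
{
    auto const is_cell = [](Node const& n) { return n.name == "cell"; };

    // First pass: count only, nothing is constructed.
    auto const count = std::count_if(children.begin(), children.end(), is_cell);

    std::vector<std::string> cells;
    cells.reserve(static_cast<std::size_t>(count));

    // Second pass: no reallocation can happen here.
    for(Node const& child : children)
        if(is_cell(child))
            cells.push_back(child.text);
    return cells;
}
```

The count comes from the data itself, so it cannot disagree with the file contents the way a stored hint could. The price is one extra traversal of the children.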
Regards,
VZ