Re: [lmi] Limits on pastable census size
From: Vadim Zeitlin
Subject: Re: [lmi] Limits on pastable census size
Date: Sat, 10 Feb 2018 19:53:04 +0100
On Fri, 9 Feb 2018 19:27:33 +0000 Greg Chicares <address@hidden> wrote:
GC> On 2018-02-09 15:19, Vadim Zeitlin wrote:
GC> > On Fri, 9 Feb 2018 14:06:14 +0000 Greg Chicares <address@hidden> wrote:
GC> >
GC> > GC> Yesterday we had to work with a census with over ten thousand cells.
GC> > GC> (We've seen one that large only once before.) It seems to work now,
GC> > GC> but it was difficult to create with "Census | Paste census", which
GC> > GC> ran out of memory (it threw bad_alloc).
GC> >
GC> > This is surprising. In the current 32 bit build, sizeof(Input) is "just"
GC> > 8920 (BTW, with MSVC it's 8248, so it looks like some space could be saved
GC> > just by using tighter packing), so allocating enough space for 10000
GC> > objects of this type requires about 85MiB of RAM, which shouldn't create
GC> > any trouble for a system capable of running any contemporary software
GC> > (looking at Firefox and 2.5GiB in its working set column on my system
GC> > right now...). So either the machine this was tested on was really
GC> > underpowered or its RAM was heavily fragmented for some reason. BTW,
GC> > what version of MSW and of which bitness did this happen under?
GC>
GC> It failed on these three machines:
GC>
GC> Kim's machine: 64-bit msw-seven, eight GB RAM (7.88 GB "usable")
GC> user's machine: 64-bit msw-ten (that's all we know)
GC> my machine: 64-bit debian chroot, 64GB RAM, 64- and 32-bit wine:
I still have no explanation for the last one, as it worked fine on my
machine, which has only 32GiB of RAM, and it consumed at most ~1.5GiB of
it. That could still be enough to fail on a machine with 8GiB of RAM in
total, though.
GC> I can't share that exact file because it contains personal client data,
GC> but it's easy to fabricate an equivalent testcase. First of all, does
GC> the example here:
GC> http://www.nongnu.org/lmi/pasting_to_a_census.html
GC> work for you? (It is my impression that the blank line below the headers
GC> must be removed.)
Yes, it does work and removing the blank line was not necessary.
GC> Using your tool of choice, copy the last line of that
GC> until you have ten thousand nonblank lines, then copy them all to the
GC> clipboard and paste into lmi, thus:
GC> File | New | Census
GC> Census | Paste
GC> I see:
GC> [statusbar] Added cell number 9999.
GC> [messagebox] Error std::bad_alloc
I don't see this, but I do see much more memory being allocated than I
would have naïvely expected. This is at least partly because sizeof(Input)
greatly underestimates the memory consumed by each cell: the Input class
also allocates a lot of memory dynamically. I don't know whether this is
easy to change. It looks like it ought to be, because I just don't see
enough unique data in each cell to take so much space, so I'm pretty sure
that usage could be reduced by sharing data between cells, if necessary
with COW, but whether that is really easy is another question. I do think,
however, that the copy at the end should be replaced by a move, because
there doesn't seem to be any need whatsoever to pessimize this by
allocating so much memory at the end, after already doing all the work. In
fact, I wonder why the temporary "cells" vector is needed at all: why
couldn't we just append cells to cell_parms() directly?
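To illustrate both points, here is a minimal standalone sketch. It is not
lmi's actual code: "Cell" is a hypothetical stand-in for Input, and all the
function names are made up for the example. It shows why sizeof() misses
the memory owned by members such as std::vector, and how appending with
move iterators hands over each cell's heap buffers instead of duplicating
them.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for lmi's Input class: a small fixed part plus
// members that own heap memory, which sizeof() does not count.
struct Cell
{
    int issue_age = 0;
    std::string comments;
    std::vector<double> payments;
};

// Rough lower bound on the memory held by one cell: the object itself
// plus the heap buffer owned by its vector member. String storage is
// ignored because short strings may live inside the object.
std::size_t approximate_footprint(Cell const& c)
{
    return sizeof(Cell) + c.payments.capacity() * sizeof(double);
}

// Build n identical cells, as when the same census line is pasted n times.
std::vector<Cell> make_cells(std::size_t n)
{
    Cell c;
    c.issue_age = 45;
    c.payments.assign(100, 1000.0);
    return std::vector<Cell>(n, c);
}

// Append by copying: every cell's heap data is duplicated, so both
// vectors hold a full set of cells until 'cells' is destroyed.
void append_by_copy(std::vector<Cell>& dst, std::vector<Cell> const& cells)
{
    dst.insert(dst.end(), cells.begin(), cells.end());
}

// Append by moving: each cell's heap buffers are handed over to dst, so
// only the fixed-size part of each object is written again.
void append_by_move(std::vector<Cell>& dst, std::vector<Cell>&& cells)
{
    assert(&dst != &cells); // moving a vector into itself makes no sense
    dst.insert
        (dst.end()
        ,std::make_move_iterator(cells.begin())
        ,std::make_move_iterator(cells.end())
        );
    cells.clear(); // drop the moved-from shells
}
```

Calling append_by_move(dst, std::move(cells)) leaves the payments buffers
where they were and merely transfers their ownership, whereas
append_by_copy() allocates a second buffer for every cell. Appending each
cell to the destination as it is parsed would avoid the temporary vector
altogether, assuming nothing needs the all-or-nothing behaviour that the
temporary provides if parsing fails halfway through.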
Please let me know if you'd like me to optimize this and, if so, how far
I should go on a scale from "1 - make the absolute minimal local changes
in CensusView::UponPasteCensus() to help with memory usage" to
"10 - rewrite the Input class entirely to be more memory-efficient".
Thanks,
VZ