From: Greg Chicares
Subject: Re: [lmi] Correctness and performance of varying column width mode in the census view
Date: Thu, 28 May 2020 22:02:04 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0

On 2020-05-28 17:55, Vadim Zeitlin wrote:
> On Thu, 28 May 2020 13:50:21 +0000 Greg Chicares <gchicares@sbcglobal.net> 
> wrote:
> 
> GC> On 2020-05-27 18:32, Vadim Zeitlin wrote:
> GC> > On Wed, 27 May 2020 16:56:46 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:
> [...]
> GC> >  Currently it is indeed far too costly, but there is no intrinsic reason
> GC> > for it to be so, at least not when changing a single cell, because all
> GC> > we need to do is to measure the length of a single string and increase
> GC> > the column width if the new string doesn't fit into it.
> GC> 
> GC> My impression is that spreadsheet programs might do that sometimes,
> GC> but certainly not always--e.g., they might adjust the width of only
> GC> the current column when a new cell value is typed in, but not when
> GC> a cell or a block of cells is pasted. In general, though, they seem
> GC> not to auto-re-size.
> 
>  Yes, indeed. FWIW Microsoft Excel (which is the de facto standard for the
> spreadsheet, like it or not, and the one all the others emulate)

Okay, let's examine its behavior in detail. I'm using the very latest
version of 'excel'; it's so postmodern that I can't figure out how to
display what version it is, as there is no "Help | About".

Enter "123,456" in a blank column (it fits), then move down one row
and enter "123,456,789": the column widens to fit.

Move down a row, then enter "1111111111111111": you get "1.11111E+15",
and the column widens slightly again so that fits. AFAICS, it wants to
show six significant digits for scientific notation, and it'll widen the
column to fit that much precision.

Move down a row, then enter "11111111111111111111111111111111": you get
"1.11111E+31", and that fits, so the width stays the same.

All of that is fine, so far. It's fine only because there's no noticeable
delay for column resizing.

Move to the "123,456,789" cell above; copy it; move to a blank column;
paste it. Result: "########", with column width unchanged. That may be
the "standard" behavior, but I have always found it inconvenient: what
good is a GUI that doesn't show you the data? I almost expect 'excel'
to prompt me to change the width. It would be better if it changed the
width for me automatically. I suppose they're thinking of a complicated
scenario like pasting a large block of cells, in circumstances where
autosizing would be even worse than replacing all data with '#'; but I
can't imagine such a scenario.

And that's all that happens for numbers, AFAICS. For lmi, duplicating
the behaviors above would be just fine. It would also be fine to widen
a column to fit a pasted single-cell value--in fact, I think that would
be much better, even if 'excel' disagrees, because I find their behavior
jarring. Maybe the lmi case is simpler in that only single-cell values
are pastable within the grid (census paste from spreadsheet is an exotic
special case). Even for census paste, I'd say it's best to adjust the
column widths to fit all the data, provided that it's fast enough.

Then we come to strings. Strings are different, because they are of
potentially limitless length....

> wraps
> longer strings by making the cell containing them taller if possible. I

That's yicky IMO.

> wonder if we should consider doing this too?

IIRC, lmi's census manager has absolutely uniform fixed row height, and
I think that's appropriate.

Let's test it in 'excel'. In some row of a blank column, I type:
  Once upon a midnight dreary, while I pondered, weak and weary
and the cell's height and width don't change. I can see the whole
line, because there are empty cells to its right. (Yet there are no
empty cells in lmi's census manager.)

Now I move to the same row of the next column and type:
  Over many a quaint and curious volume of forgotten lore
and, again, no cell's height or width changes. The new cell's text
is all visible because it sprawls into empty space to its right.
The "Once upon" cell is truncated.

In these experiments, row height never changes. Therefore, maintaining
a uniform fixed row height is "standard", and lmi's behavior in that
regard cannot be faulted.

I've observed the goofy behavior you mention:

  make the
  cell
  taller so
  that
  everything
  fits, like
  columns in
  a
  newspaper
  with poor
  typography

in some unremembered circumstance, but it's wrong and horrid.

And text is relatively unimportant in an lmi census. The numbers
are what's most important. Names can be truncated.

Where does this lead us?

Ideally, I think we'd automatically widen columns as needed, to make all
numbers fit as they're entered--whether by typing their digits, or pasting
a scalar value, or even pasting a census...and maybe even when the case or
class defaults are modified and applied to all cells, as long as that
doesn't introduce a painful delay. Entering strings shouldn't cause the
column width to change at all.

It's useful to have a resize-column-width-to-fit verb as a menu command.
I suppose that a grid control inherently must store the current width of
each column in an array, as that is a prerequisite for rendering. When a
column's width changes, the corresponding array element changes. I don't
think we want each column to have a resize-on-every-change flag: the
resize-to-fit verb changes the evanescent state, and is then forgotten.
'excel' has an "AutoFit Column Width" menu command that seems to work
that way, AFAICT. It widens (or shrinks) columns to make strings fit,
too, and that's the right behavior.

> GC> >  Now there are other complications, e.g. the operation would still be
> GC> > O(N), where N is the number of rows, if we wanted to also reduce the
> GC> > column width, but IMO this is much less important and we could avoid
> GC> > doing this.

Most of the time, we're just changing one cell, so it's O(1):
 - calculate w_new, the width needed for the new value
 - compare to w_stored, the current width of the column
 - if w_new <= w_stored, do nothing; else w_stored = w_new, and re-render
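
A minimal C++ sketch of those three steps, under stated assumptions:
measure_width() is a hypothetical stand-in that charges 8 pixels per
character (a real grid would ask its toolkit for the text extent), and
widen_to_fit() is an illustrative name, not an lmi or wxWidgets function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical measurement: 8 pixels per character. A real grid would
// measure the rendered text with its toolkit instead.
int measure_width(std::string const& text)
{
    return 8 * static_cast<int>(text.size());
}

// Widen column 'col' if 'new_value' does not fit; never narrow it.
// Returns true if the stored width changed, i.e. a re-render is needed.
bool widen_to_fit
    (std::vector<int>&  column_widths
    ,std::size_t        col
    ,std::string const& new_value
    )
{
    assert(col < column_widths.size());
    int const w_new = measure_width(new_value);
    if(w_new <= column_widths[col])
        {
        return false;            // fits already: do nothing
        }
    column_widths[col] = w_new;  // w_stored = w_new
    return true;
}
```

The cost is one measurement and one comparison, independent of the
number of rows, which is why it should be too fast to notice.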

> GC> I don't think it's very important to treat narrowing as a special case,
> 
>  I believe it's much more important to fully show the newly entered string,
> i.e. widen the column enough to fit it, than to narrow the column back if
> it becomes unnecessarily wide. Not doing the latter just wastes some space,
> while not doing the former prevents you from fully seeing your data, which
> seems much worse.

Okay, then we want the excel behavior, i.e.:
  automatically widen column to fit newly-entered data

>  In fact, if we don't do this, I think it would be reasonable to expect
> users to often hit Ctrl-] to adjust all column widths after entering such
> an overlong string, just to see it fully. And the problem is that doing full
> auto-size would take much longer[*] than just checking the single cell
> length, as I wrote above.
> 
> [*] To be pedantic, we could also cache the best row and column widths and
>     only recompute those that really changed, but this would be even more
>     complex.

Okay, the O(1)
  automatically widen column to fit newly-entered data
built-in behavior will relieve end users of the need to hit Ctrl-]
repeatedly. And it should be too fast to notice, as opposed to the
Ctrl-] verb which is O(#rows * #columns).

I'm not sure what you mean by caching "best" widths. Above, I had
convinced myself that the wxGrid control knows the present width of
each column. How does the "best" width differ from that? Does it
differ only in that it may be narrower than the present width?
And if so, couldn't it so easily have become stale that we couldn't
rely on it?

> GC> > There are also operations affecting all cells, such as "Edit class/case",
> GC> > but those are presumably less frequent than editing individual cells, so
> GC> > perhaps it's not as annoying that they take so long.
> GC> 
> GC> Instead of restricting the annoyance to certain cases, we should
> GC> eliminate it globally.
> 
>  As I tried to explain it above, I'm not so sure about this logic. You
> might train your muscle memory to hit Ctrl-] after entering long strings as
> it would work just fine for small censuses and then still be annoyed when
> you hit it without thinking in a big one and spend the next minute or two
> waiting until the program comes back to its senses.
> 
>  The more I think about this, the more sure I become that we really should
> put some time limit on auto-sizing wxGrid, whatever else we do, as it just
> shouldn't be possible for it to take arbitrarily long...

Rhetorical question: what should it do if the time limit is exceeded?

> GC> But auto-re-sizing is simply a misfeature, and that's all we need to
> GC> know.
> 
>  I'm not arguing for arguing's sake, but I don't really understand this. If
> it's a misfeature, it should be eliminated entirely. But I don't think it
> is, and you don't seem to be committed to eliminating it either, as you'd
> still like to allow auto-resizing the columns by pressing Ctrl-].

Here's what I think lmi does today, which I consider a misfeature:
 - maintain a boolean global state: autofit, or not
 - if "not", do nothing (so far, so good)
 - if "autofit", then whenever the grid's contents change, perform
     an all-encompassing O(#rows * #columns) resize (too costly),
     for all future changes, until "autofit" is turned off

Instead, I think we should adjust column widths in this way:
 - if one string cell changes, do nothing: O(0)
 - if one numeric cell changes, widen its column if needed: O(1)
   do that whether the new value is typed in or pasted in
 - census pasting and applying case or class changes seem to be
   O(#rows * #columns), so that's the tough case that requires
   more investigation and thought--a full resize would be nice,
   but might be too costly
 - when the user gives the autofit command: well, that could
   just take a while, and the best we could do is to make it as
   fast as possible
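
The single-cell part of that policy can be sketched in C++ as follows.
This is only an illustration under assumptions: grid_sketch, cell_kind,
measure_width() (8 pixels per character), and on_single_cell_changed()
are hypothetical names, not lmi or wxGrid code. The point is that there
is no global "autofit" flag: the decision depends only on the kind of
column being edited.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class cell_kind {text, numeric};

// Hypothetical per-column state: what the column holds, and its width.
struct grid_sketch
{
    std::vector<cell_kind> kinds;
    std::vector<int>       widths;
};

// Hypothetical measurement: 8 pixels per character.
int measure_width(std::string const& text)
{
    return 8 * static_cast<int>(text.size());
}

// Called for a typed or pasted single-cell change.
// Returns true if the column was widened.
bool on_single_cell_changed
    (grid_sketch&       g
    ,std::size_t        col
    ,std::string const& value
    )
{
    assert(g.kinds.size() == g.widths.size());
    assert(col < g.widths.size());
    if(cell_kind::text == g.kinds[col])
        {
        return false;          // strings never change the width
        }
    int const w_new = measure_width(value);
    if(w_new <= g.widths[col])
        {
        return false;          // the number already fits
        }
    g.widths[col] = w_new;     // widen only; never narrow
    return true;
}
```

Census paste and the explicit autofit command would be separate entry
points, so their cost never leaks into ordinary cell edits.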

Instead of the autofit-all-rows-and-columns command, should we
follow the 'excel' behavior? AFAICS, that is:
 - if no column is selected, gray it out (actually, that's not
   what 'excel' does--it offers the command and pretends to
   execute it, without effect--but that's plainly a defect)
 - else, for all selected columns (and no others), perform an
   autofit: O(#rows * #selected) rather than O(#rows * #columns)
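
A C++ sketch of an autofit restricted to selected columns, again under
assumptions: the grid is modeled as rows of strings, measure_width() is
a hypothetical 8-pixels-per-character stand-in, and minimum_width is an
illustrative parameter that keeps an empty column from collapsing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using row_t = std::vector<std::string>;

// Hypothetical measurement: 8 pixels per character.
int measure_width(std::string const& text)
{
    return 8 * static_cast<int>(text.size());
}

// Fit each selected column to its widest cell: O(#rows * #selected).
// Unselected columns are untouched; an empty selection is a no-op
// (the UI would presumably gray the command out in that case).
void autofit_selected
    (std::vector<row_t> const&       rows
    ,std::vector<int>&               widths
    ,std::vector<std::size_t> const& selected
    ,int                             minimum_width
    )
{
    for(std::size_t col : selected)
        {
        assert(col < widths.size());
        int best = minimum_width;
        for(row_t const& row : rows)
            {
            assert(col < row.size());
            best = std::max(best, measure_width(row[col]));
            }
        widths[col] = best;   // may widen or shrink, strings included
        }
}
```

Because this is a verb rather than a mode, nothing is remembered
afterward: later edits fall back to the cheap single-cell rule.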

And then should we remove the "fixed width" menu command and
toolbar button in lmi? Perhaps it still has value: if you've
autosized all columns, and some string columns are more than
half a screen wide, is it important to have a button that gets
the grid back to a workable state with strings truncated?
I'm not so sure it's needed: 'excel' doesn't seem to have it.

> GC> It's curious that lmi today offers this choice:
> GC>  - change all column widths to fit
> GC>  - change all column widths to a fixed, hardcoded number
> GC> but not this one, which seems obviously desirable:
> GC>  - leave column widths as they are
> GC> I.e., you can toggle from auto-re-size-always to reset-to-fixed-width,
> GC> but you can't toggle from auto-re-size-always to stop-auto-resizing.
> 
>  To make this more consistent, we would need to have a checkbox in the menu
> and a togglable toolbar item, but we don't seem to want to do this, as you
> want to go away from "auto-resizing mode" as a noun. But if it should be
> "auto-resize" as a verb, then it would make sense for there to be just a
> single menu command/toolbar button called "Fit column sizes to their
> contents" (but shorter).

Yes. We might call it "Autofit column width", for instance.

>  At the risk of wandering off into metaphysical territory, it doesn't make
> sense to have a UI element corresponding to "leave column widths as they
> are" because this can be accomplished by simply not having any special UI
> element for this and not doing anything.

[This response is deliberately left blank.]

> GC> > GC> Does that answer all your questions?
> GC> > 
> GC> >  It almost does. But, to be sure that I understood correctly, let me
> GC> > confirm which UI operations precisely will still need to auto-size the
> GC> > columns:
> GC> > 
> GC> > - Toolbar/menu command "Varying column widths" will do it, of course.
> GC> 
> GC> Yes. And autosize_columns_ is false by default, so the speed penalty
> GC> never arises unless the end user explicitly changes it.
> GC> 
> GC> It's a desirable optional behavior. But if it's painfully slow,
> GC> we should either make it fast enough, or remove the option.
> 
>  Personally I think that making it fast is a much better option.

Okay, then I've changed my mind, as above:
 - O(0) behavior for strings
 - O(1) behavior for single-cell changes: if that's noticeably
   slow, then we're doing something wrong
 - O(#selected columns * #rows) behavior on explicit demand only
 - O(#columns * #rows) behavior for census paste etc.: maybe,
   but only if it's always really fast; otherwise, preserve widths
   of old columns that persist, and use default width for any new
   columns introduced
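
The fallback in the last item might look like this C++ sketch. It
assumes columns can be identified by name across the operation;
carry_over_widths() and default_width are illustrative, not lmi code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// After a census paste or similar, compute widths for the new set of
// columns: a column that persists keeps its old width, and a column
// that is newly introduced gets 'default_width'. No cell is measured.
std::vector<int> carry_over_widths
    (std::vector<std::string> const& old_names
    ,std::vector<int> const&         old_widths
    ,std::vector<std::string> const& new_names
    ,int                             default_width
    )
{
    assert(old_names.size() == old_widths.size());
    std::map<std::string,int> known;
    for(std::size_t i = 0; i < old_names.size(); ++i)
        {
        known[old_names[i]] = old_widths[i];
        }
    std::vector<int> result;
    result.reserve(new_names.size());
    for(std::string const& name : new_names)
        {
        auto const it = known.find(name);
        result.push_back(known.end() == it ? default_width : it->second);
        }
    return result;
}
```

The work depends only on the number of columns, never on the number of
rows, so it cannot introduce a delay on a large census.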

> GC> > - Editing/adding/deleting cells will not do it, according to your
> GC> >   answer, so there is no need to optimize doing it.
> GC> 
> GC> Agreed.
> GC> 
> GC> Unless, of course, you find a way to make it fast as lightning.
> 
>  For editing/adding, yes, we definitely can make it O(1). For deleting,
> this is obviously not possible in general, i.e. if we want to make it 100%
> precise (== "correct" in my initial message), but we could approximate it
> or just not do anything in this particular case.

AFAICT, if you delete an 'excel' column, the widths of the remaining
columns are unaffected. That sounds just right.

> GC> > - But what about the operations affecting all cells (and so already
> GC> >   taking O(N) time), such as editing class/case or pasting census
> GC> >   from clipboard, should they still resize the columns to fit their
> GC> >   contents?
> GC> 
> GC> No.
> 
>  OK, thanks.

As mentioned a few paragraphs above, I now think we should probably
preserve the present widths of previous columns that remain after
such a change, and default the widths of any new columns introduced
by such an operation. Resizing all columns is likely to take too long.

> GC> AFAICS, CensusView::autosize_columns_ is just a handle for the
> GC> wxDVC control's wxCOL_WIDTH_AUTOSIZE flag. A brand-new wxGrid-based
> GC> census manager could do something different.
> 
>  Yes, of course. Right now it behaves in the same way as wxDVC version
> does, but I'll change it not to do auto-sizing automatically at all for
> now. I hope we can return to this when the initial PR is merged, however,
> to implement this, IMHO quite useful, functionality in a reasonably
> efficient way.
> 
>  So, to summarize: in the initial version there will be no automatic
> auto-sizing at all. I'll still change wxGrid to not spend too long in its
> AutoSize() because this can be too annoying even if it's triggered
> manually, but this won't affect lmi code complexity in any way.

For an initial version, it's perfectly okay to go with simple behavior.

>  Of course, please let me know if you disagree with anything here. TIA!

I guess I've disagreed with much of what I'd previously written, but
such is the unity and interpenetration of opposites.

