lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] [lmi-commits] [6323] Use unique_filepath() for all PDF and spr


From: Greg Chicares
Subject: Re: [lmi] [lmi-commits] [6323] Use unique_filepath() for all PDF and spreadsheet output
Date: Wed, 07 Oct 2015 01:17:39 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.3.0

On 2015-10-06 20:57, Vadim Zeitlin wrote:
> On Tue, 06 Oct 2015 15:17:19 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> Modified: lmi/trunk/emit_ledger.cpp
[...ctor initializer list became...]
> GC> +    :case_filepath_              (case_filepath)
> GC> +    ,tsv_ext_                    
> (configurable_settings::instance().spreadsheet_file_extension())
> GC> +    ,case_filepath_spreadsheet_  (unique_filepath(case_filepath,         
>     tsv_ext_))
> GC> +    ,case_filepath_group_roster_ (unique_filepath(case_filepath, 
> ".roster" + tsv_ext_))
> GC> +    ,case_filepath_group_quote_  (unique_filepath(case_filepath, 
> ".quote.pdf"        ))
> GC> +    ,emission_                   (emission)
> GC> +{
> GC> +    LMI_ASSERT(!case_filepath_.empty());
> GC> +}
> 
>  I have a small problem with this change as it always initializes
> case_filepath_xxx_ objects, even when we don't need any of them -- and we
> never need all of them at once. This is not actually harmful, of course,

Yes, that's why I thought it acceptable.

> but it's still a gratuitous performance pessimization and while it probably
> isn't normally noticeable in practice, unique_filepath() is still an
> expensive function to call as it does at least one and possibly 3 file
> operations.

In the usual case, boost::filesystem::exists() is called once for each
of the three uniquified elements in the initializer-list. The CLI can
potentially require all three at once, but the GUI needs only one (or
zero) of those emission types at any invocation (and the other two (or
all three) are "wasted").

I'm not sure we should worry much about the less common case where one
(or more) of these three files is open in a viewer that locks it. In
that case, we make three filesystem calls:
  boost::filesystem::exists()
  boost::filesystem::remove()
  boost::filesystem::exists()
How slow can they be? Do they really add more overhead than the
(unfortunately necessary) try-catch?

> And I could see it becoming noticeable e.g. when using a slow
> network disk[*].

We distribute lmi on a LAN, but home-office users copy everything to
their local HDDs for performance reasons. Of course, the last time I
personally ran lmi in an office, I ran it on a "computer", whereas
nowadays they're "devices", and perhaps they've virtualized each
user's HDD so that it really is on a LAN somewhere.

OTOH, I think the broker for whom we wrote 'custom_io_1.?pp' might
actually run lmi from its LAN. But they never use either of the
spreadsheet outputs, though they occasionally write a PDF file.

OTTH, if you're viewing an output file in a way that locks it, and
the file's on the LAN, then...you loaded it into your viewer from
the LAN, and you're in a world of hurt anyway, so...should we worry
about even three times three accesses to filesystem metadata? Given
that, I guess I'm more concerned about three filesystem calls on an
invocation that needs none of them.

>  So ideally I'd only initialize these paths on demand, i.e. when they're
> first needed. Would it be useful to make a patch doing this?

Let's see...we could move everything into the body of the ctor,
and initialize each path conditionally:

    if(emission_ & mce_emit_spreadsheet)
        {
        case_filepath_spreadsheet_ = unique_filepath(case_filepath, tsv_ext_))
        }
    if(emission_ & something else ...)
        { ... and so on

If that's all, then...okay, I could sacrifice that much terseness in
return for removing even a theoretical problem, and I'd apply a patch
that does only that and little to nothing more. But I certainly don't
want to ask you to prepare a more complicated patch that I won't wish
to apply because from my POV it makes the code much more complex; so
if you have something more ambitious in mind, let's discuss it first.

> [*] I'm afraid I'm showing my age here, it's supposed to be called "the
>     cloud" now I believe.

Plus ça change. As soon as HDDs become large, fast, and cheap, we
forsake them for slow LANs. As soon as we have fast fiber-optic LANs,
we forsake them for the next new thing: wifi I guess. But you're right
to bring this up, because someday end users may run lmi "in the cloud",
and there's no reason not to think ahead.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]