Re: [lmi] Calculation summary speed
From: Greg Chicares
Subject: Re: [lmi] Calculation summary speed
Date: Tue, 31 Oct 2006 01:21:16 +0000
User-agent: Thunderbird 1.5.0.4 (Windows/20060516)
On 2006-10-31 0:15 UTC, Evgeniy Tarassov wrote:
>
[speedups include...]
> some modifications applied to xsl templates used to generate html
> and TSV calculation summary output
>
> 1) I have slightly changed the code to read for every ledger value the
> 'calculation_summary' flag from 'format.xml'. The optimisation in C++
> consisted of skipping the most time-costly place -- the method that
> formats double vectors into a string vector. The method generates the
> output for a column only if:
> - either the value is explicitly marked by calculation_summary in
> 'format.xml'
> - or the column is in the supplemental_report value set
I think it would be better to use only columns specified in a new
'calculation_summary_columns' entity in 'configurable_settings.xml'.
See:
http://lists.gnu.org/archive/html/lmi/2006-10/msg00064.html
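A minimal sketch of that selection step, assuming the ledger's vector
columns are held in a map from column name to a vector of doubles. The
names 'format_column' and 'format_selected' are hypothetical and are
not lmi functions; the point is only that the costly double-to-string
formatting runs for the selected columns and for nothing else.

```cpp
#include <iomanip>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the costly step: format each double as text.
std::vector<std::string> format_column
    (std::vector<double> const& values
    ,int                        decimals
    )
{
    std::vector<std::string> result;
    result.reserve(values.size());
    for(double v : values)
        {
        std::ostringstream oss;
        oss << std::fixed << std::setprecision(decimals) << v;
        result.push_back(oss.str());
        }
    return result;
}

// Format only the columns whose names appear in 'selected';
// every other column is skipped without being formatted.
std::map<std::string, std::vector<std::string>> format_selected
    (std::map<std::string, std::vector<double>> const& ledger_columns
    ,std::set<std::string> const&                      selected
    ,int                                               decimals = 2
    )
{
    std::map<std::string, std::vector<std::string>> formatted;
    for(auto const& column : ledger_columns)
        {
        if(selected.count(column.first))
            {
            formatted[column.first] = format_column(column.second, decimals);
            }
        }
    return formatted;
}
```

With this shape, the cost of formatting grows with the number of
selected columns rather than with the size of the whole ledger.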
> All these values are inserted into the output xml if we generate a
> full version of xml (for pdf generation).
> The single double values and string vector values are still included in
> the xml.
Are there so few string-vector values, that conditionally excluding
them wouldn't make it noticeably faster?
> 2) XSL template optimisation
Okay--filtering out unneeded nodes first, before looking up the
few nodes actually needed to generate the final table, was the
key to the optimization. Thanks for the detailed explanation.
> Some other comments:
> From the beginning we were discussing the inclusion of
> supplemental_report into ledger xml data vis-a-vis passing
> supplemental_report columns as parameters to the libxslt
> transformation engine.
Almost--the original idea was to use a 'configurable_settings.xml'
entity
<calculation_summary_columns>
some_column_name
some_other_column_name
...
</calculation_summary_columns>
as the set of parameters.
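A sketch of how the text of such an entity might be split into column
names, assuming the element's content has already been extracted as a
single string by whatever xml library is in use. The function name
'parse_column_names' is hypothetical. A vector is used rather than a
set so that the order in which the columns are listed is preserved.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split whitespace-separated column names, keeping their original order.
// operator>> skips spaces, tabs, and newlines alike, so names may be
// written one per line or several on a line.
std::vector<std::string> parse_column_names(std::string const& element_text)
{
    std::istringstream iss(element_text);
    std::vector<std::string> names;
    std::string name;
    while(iss >> name)
        {
        names.push_back(name);
        }
    return names;
}
```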
> It turns out that in general passing
> information as parameter is discouraged in the xslt community and is
> generally used to debug/tweak xsl templates (for example pass an
> optional debug parameter to make template print more debug information
> in case some information is missing or in some sort of unusual
> situation).
> If I understand it correctly, lmi as it is now does not have such a
> global switch configurable at runtime (via
> configurable_settings.xml?). Do you think that it could be interesting
> to add such a feature (a new 'debug' flag in
> 'configurable_settings.xml'), which will make xslt print a warning in
> red in case any column is missing or something unexpected is going on
> with ledger xml code?
That could be a valuable idea for the future. I have done so
little work with xslt myself that I can't say anything very
insightful.
> Since we are putting supplemental_report columns into ledger xml, we
I think we should move away from that idea, for the calculation
summary, as discussed above.
> have to regenerate xml data every time the user changes
> supplemental_report columns,
That's okay.
> no (simple) caching could be done in
> ledger_text_formats.?pp.
That's okay. Now, I guess that you were using lazy evaluation
to implement caching; as you point out, that's not really
beneficial any more. I suppose it would have helped if an end
user reran an illustration after changing only the selection
of calculation-summary columns; that would actually be rare in
practice, and need not be optimized. Typically, they'll change
other input fields between runs, so that the calculations will
produce a different result--and the xml will need to be
regenerated anyway.
But it's good to understand that. When I saw lazy evaluation,
I originally guessed that it somehow simplified the code to
write in a 'functional' idiom--e.g., that you could write
unconditional statements for generating the data, and then
only the necessary data would actually get generated. So I
figured you were thinking in a functional language while
writing C++, which is perfectly okay. Here:
'input_harmonization.cpp' [Input::DoHarmonize()]
is a lot of code that *wants* to be declarative, and I've
toyed with the idea of using FC++
http://www-static.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.html
for that but never had the time.
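For reference, the lazy-evaluation idiom discussed above can be
sketched in a few lines of C++. This is an illustration only, not the
code that was removed; the class name 'lazy' and its members are
hypothetical. The generator is written unconditionally, but it runs
only if and when the value is first requested, and the result is
cached for later requests.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Holds a generator and runs it at most once, on first access.
template<typename T>
class lazy
{
  public:
    explicit lazy(std::function<T()> generator)
        :generator_(std::move(generator))
    {}

    // Compute the value on first use; afterwards return the cached copy.
    T const& value() const
    {
        if(!cache_)
            {
            cache_.reset(new T(generator_()));
            }
        return *cache_;
    }

    // True once the generator has actually been run.
    bool evaluated() const {return static_cast<bool>(cache_);}

  private:
    std::function<T()>         generator_;
    mutable std::unique_ptr<T> cache_;
};
```

The cache pays off only when the same value is requested repeatedly
without the underlying data changing, which is the rare case described
above.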
> I have removed the cache, making the 'Prepare'
> phase useless. This means that in the latest results Greg has posted:
>
>> >             total  calculate  prepare  format
>> > modified:     430        150       30     250  "coarse sketch"
>> > new':         549        154      299      96  xsl optimization
>> > the winner:   278        152       30      96
>>
>> new'':          322        157       20     145  20061030T1836Z cvs
Again, thanks for the explanation.
> the third column ('prepare') does not measure the xml preparation time
> -- it is now included in 'format' phase, therefore the slight
> degradation in 'format' phase speed.
>
> Greg--do you want me to remove the 'prepare' phase timer from
> illustration_view.cpp?
Well, it still measures something: SetLedger(). I'd say it
does no harm to leave it there, and maybe the information
will be useful somehow, someday; if not, we can always
remove it later.
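A sketch of per-phase timing of the kind reported in the table above,
assuming each phase can be wrapped in a callable. The class
'phase_timings' is hypothetical and is not the timer used in
'illustration_view.cpp'; it only shows that keeping a phase such as
'prepare' costs little, and that the total is simply the sum of
whatever phases were recorded.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <string>

// Accumulates elapsed wall-clock milliseconds per named phase.
class phase_timings
{
  public:
    // Run 'work' and add its elapsed time to the named phase.
    void run(std::string const& phase, std::function<void()> const& work)
    {
        auto const start = std::chrono::steady_clock::now();
        work();
        auto const stop = std::chrono::steady_clock::now();
        elapsed_ms_[phase] +=
            std::chrono::duration<double, std::milli>(stop - start).count();
    }

    // Elapsed milliseconds for one phase; zero if it was never run.
    double phase(std::string const& name) const
    {
        auto const i = elapsed_ms_.find(name);
        return elapsed_ms_.end() == i ? 0.0 : i->second;
    }

    // Sum over all recorded phases.
    double total() const
    {
        double sum = 0.0;
        for(auto const& entry : elapsed_ms_)
            {
            sum += entry.second;
            }
        return sum;
    }

  private:
    std::map<std::string, double> elapsed_ms_;
};
```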