Re: [lmi] Calculation summary speed


From: Evgeniy Tarassov
Subject: Re: [lmi] Calculation summary speed
Date: Tue, 31 Oct 2006 01:15:51 +0100

> Here are some precise timings--the average of five runs:
>
>            total  calculate prepare format
> old':        915        158     301    456 before latest change
> new':        549        154     299     96 current cvs
>
>            total  calculate prepare format
> modified:    430        150      30    250 "coarse sketch"
> new':        549        154     299     96 xsl optimization
> the winner:  278        152      30     96
>
> then we get 278 milliseconds: quadruple the speed of the
> initial implementation, and only a tenth of a second slower
> than what's in production.

new'':        322        157      20    145 20061030T1836Z cvs

It's fast enough that I can feel the unmetered overhead of
something completely extraneous that I haven't gotten around
to speeding up yet in the MVC Model. IMO it's okay to pay
one hundred fifty ms for the flexibility that's been gained,
and we don't need to make this xml stuff any faster.

I should post some comments on the calculation summary
formatting speed. I should have done it before, but
better late than never.

Two speedups were applied in the submitted changes:
1) C++ code: the approach Greg took to reduce the xml preparation
time (http://lists.gnu.org/archive/html/lmi/2006-10/msg00037.html)

2) some modifications applied to xsl templates used to generate html
and TSV calculation summary output

1) I have slightly changed the code to read, for every ledger value, the
'calculation_summary' flag from 'format.xml'. The optimisation in C++
consists of skipping the most time-costly step -- the method that
formats vectors of doubles into vectors of strings. The method generates
the output for a column only if:
- either the value is explicitly marked by calculation_summary in 'format.xml'
- or the column is in the supplemental_report value set
All of these vector values are inserted into the output xml if we
generate a full version of xml (for pdf generation).
The single double values and string vector values are still included in the xml.
I forgot to mention the special optimisation for 'IrrCsv' and
'IrrDb', which are calculated only for a full version of xml output
(e_xml_full flag to Ledger::do_write method).
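
The rule in 1) can be sketched roughly as follows. This is only an
illustration, not the code in cvs: the function name, the enum, and the
use of std::set are assumptions made for the example (the real flag is
e_xml_full, passed to Ledger::do_write).

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the output mode; lmi's actual flag is e_xml_full.
enum class xml_mode {calculation_summary, full};

// Decide whether a vector-of-doubles column should go through the costly
// doubles-to-strings formatting step.
bool needs_formatting
    (std::string const&           column
    ,std::set<std::string> const& summary_columns      // marked in 'format.xml'
    ,std::set<std::string> const& supplemental_columns // supplemental_report set
    ,xml_mode                     mode
    )
{
    if(xml_mode::full == mode)
        {
        return true; // Full xml (for pdf generation) includes every column.
        }
    return 0 != summary_columns.count(column)
        || 0 != supplemental_columns.count(column);
}
```

Every column for which this returns false is never formatted when only
the calculation summary is needed, and that is where the time is saved.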

2) XSL template optimisation, which consisted of preparing and
arranging the nodeset of columns we need to insert into the data
table.

Originally tables were generated using the straightforward approach
(simplified code, and thus invalid -- only to demonstrate the idea):

<xsl:for-each select="$Outlay_column/duration">
<tr>
<xsl:for-each select="$selected_columns">
 <td>
   <xsl:value-of
select="$illustration/address@hidden/@address@hidden/@basis]/duration[position()]/text()"
/>
 </td>
</xsl:for-each>
</tr>
</xsl:for-each>

The bottleneck is the inner XPath expression
'$illustration/address@hidden/@name]/duration[position()]/text()'
which scans all the nodes of '/illustration' looking for the one that
corresponds to the selected_column. This search was done for every
cell in the table.

The optimized version of the xsl templates prefilters the nodes to
output. First it creates a nodeset consisting of the double_vectors or
string_vectors that are referenced in $selected_columns (the columns to
print in the table), and only then does it generate the table. The
preparation is thus done only once, and each cell requires a search
through a much smaller set of nodes. If this explanation is cumbersome
and you want to understand it, I'll prepare a simplified example with
comments.
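
To show the idea outside of xsl, here is a rough C++ analogy. It is not
the template code itself: the 'column' struct and every function name
are invented for the illustration. The per-cell lookup is the same in
both cases; the only difference is how large a node set it has to scan.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one vector node in the ledger xml.
struct column
{
    std::string              name;
    std::vector<std::string> durations; // One formatted value per duration.
};

using node_set = std::vector<column const*>;
using table    = std::vector<std::vector<std::string>>;

// The unfiltered node set: every column in the document.
node_set all_nodes(std::vector<column> const& all)
{
    node_set nodes;
    for(auto const& c : all) {nodes.push_back(&c);}
    return nodes;
}

// Done once: keep only the columns named in 'selected'.
node_set prefilter
    (std::vector<column> const&      all
    ,std::vector<std::string> const& selected
    )
{
    node_set kept;
    for(auto const& c : all)
        {
        if(selected.end() != std::find(selected.begin(), selected.end(), c.name))
            {
            kept.push_back(&c);
            }
        }
    return kept;
}

// Linear search, playing the role of the XPath predicate evaluated per cell.
column const* find_column(node_set const& nodes, std::string const& name)
{
    for(auto const* c : nodes)
        {
        if(c->name == name) {return c;}
        }
    return nullptr;
}

// One search per cell; the cost depends on the size of 'nodes'.
table build_table
    (node_set const&                 nodes
    ,std::vector<std::string> const& selected
    ,std::size_t                     rows
    )
{
    table t(rows);
    for(std::size_t r = 0; r < rows; ++r)
        {
        for(auto const& name : selected)
            {
            column const* c = find_column(nodes, name);
            t[r].push_back(c ? c->durations.at(r) : std::string());
            }
        }
    return t;
}
```

Calling build_table(all_nodes(all), ...) corresponds to the original
templates, and build_table(prefilter(all, selected), ...) to the
optimized ones. Both give the same table, but the second scans only the
selected columns for each cell.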

Some other comments:
From the beginning we have been discussing whether to include
supplemental_report in the ledger xml data or to pass the
supplemental_report columns as parameters to the libxslt
transformation engine. It turns out that passing information as
parameters is generally discouraged in the xslt community and is
mostly used to debug or tweak xsl templates (for example, passing an
optional debug parameter to make a template print more debug information
when some information is missing or in some other unusual
situation).
If I understand correctly, lmi as it is now does not have such a
global switch configurable at runtime (via
configurable_settings.xml?). Do you think it would be interesting
to add such a feature (a new 'debug' flag in
'configurable_settings.xml'), which would make xslt print a warning in
red if any column is missing or something unexpected is going on
with the ledger xml code?

Since we are putting supplemental_report columns into ledger xml, we
have to regenerate the xml data every time the user changes the
supplemental_report columns, so no (simple) caching can be done in
ledger_text_formats.?pp. I have removed the cache, which makes the
'prepare' phase useless. That means that in the latest results Greg
has posted:

>            total  calculate prepare format
> modified:    430        150      30    250 "coarse sketch"
> new':        549        154     299     96 xsl optimization
> the winner:  278        152      30     96
>
> then we get 278 milliseconds: quadruple the speed of the
> initial implementation, and only a tenth of a second slower
> than what's in production.

new'':        322        157      20    145 20061030T1836Z cvs


the third column ('prepare') no longer measures the xml preparation
time. That time is now included in the 'format' phase, hence the slight
degradation in 'format' phase speed.

Greg--do you want me to remove the 'prepare' phase timer from
illustration_view.cpp?


Eugene



