[lmi] Number formatting patterns

From:	Evgeniy Tarassov
Subject:	[lmi] Number formatting patterns
Date:	Tue, 14 Nov 2006 16:52:19 +0100

I think this should be posted as a new topic. The original message
thread can be found here:
http://lists.gnu.org/archive/html/lmi/2006-09/msg00008.html

On 10/29/06, Greg Chicares <address@hidden> wrote:

On 2006-10-29 14:46 UTC, Evgeniy Tarassov wrote:
>        <xs:restriction base="xs:string">
>           <!-- update this regular expression for every change in formatting rules -->
>           <xs:pattern value="(-?[1-9][0-9]{0,2}(,[0-9]{3})*)|0" /><!-- f0 -->
>           <xs:pattern value="(-?[1-9][0-9]{0,2}(,[0-9]{3})*)|0" /><!-- f1 -->
>           <xs:pattern value="((-?[1-9][0-9]{0,2}(,[0-9]{3})*)|0)\.[0-9]{2}" /><!-- f2 -->
>           <xs:pattern value="((-?[1-9][0-9]{0,2}(,[0-9]{3})*)|0)%" /><!-- f3 -->
>           <xs:pattern value="((-?(([1-9][0-9]{0,2}(,[0-9]{3})*)|0)\.[0-9]{2}))%" /><!-- f4 -->
>           <xs:pattern value="((-?(([1-9][0-9]{0,2}(,[0-9]{3})*)|0)\.[0-9]{2}))bp" /><!-- bp -->
>        </xs:restriction>

At a glance, I don't see anything obviously wrong with that
translation of the original hardcoded formats to regexes. To
guarantee that would require rigorous testing.

However, I think hardcoding the formats was a poor idea in the


[stripped]

Instead of using schema validation to enforce an obsolescent
paradigm that was too restrictive and inexpressive--and updating
the schema every time we need to be more expressive--I had hoped
we could use a new paradigm that's designed for the flexibility
we'd really like to have:

[http://lists.gnu.org/archive/html/lmi/2006-09/msg00005.html]
On 2006-9-14 16:19 UTC, Greg Chicares wrote:
| On 2006-9-13 14:21 UTC, Vadim Zeitlin wrote:
|> 3. Create another XML file containing the information about the columns
|>    (their names, titles) and format of the data. Currently it would have
|>    to be modified manually if the required report format changes but the
|>    plan is to allow customizing it from the GUI in the future (i.e. add
|>    a "Report format" dialog which would generate this XML file dynamically)


Let me restate what we want for number formatting:

For each double value (or vector of double values) in the ledger, we
want to specify a formatting pattern in the 'format.xml' file.
We want the number-formatting code in C++ to be generic: it must
support at least the existing formats (F1-F4 and BP), and ideally it
should accept other formatting patterns as well, so that we are not
forced to recompile the software to add a new formatting pattern or
change an existing one.

I think a good candidate is the pattern syntax used by the XSLT
format-number function.
The W3C specification of the function:
http://www.w3.org/TR/xslt#function-format-number
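
To give a feeling for what such a pattern means, here is a minimal
sketch that handles only a tiny subset of the syntax: a ',' in the
pattern turns on grouping by three digits, and the number of '0'
characters after '.' fixes the number of fraction digits. The name
format_subset is hypothetical, and this is not the libxslt
implementation; everything else in the real syntax ('#' optional
digits, '%', ';' subpatterns, and so on) is simply ignored here.

```cpp
#include <cmath>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper, neither lmi nor libxslt code: formats 'value'
// using a very small subset of the format-number pattern syntax.
//   ','            -> group the integer digits by three
//   '0's after '.' -> that many mandatory fraction digits
std::string format_subset(double value, std::string const& pattern)
{
    bool const grouping = pattern.find(',') != std::string::npos;

    // Count the '0' characters that directly follow the decimal point.
    int decimals = 0;
    std::string::size_type const dot = pattern.find('.');
    if (dot != std::string::npos) {
        for (std::string::size_type i = dot + 1;
             i < pattern.size() && pattern[i] == '0'; ++i) {
            ++decimals;
        }
    }

    // Let the stream round the absolute value; use the classic locale
    // so that the decimal separator is always '.'.
    std::ostringstream oss;
    oss.imbue(std::locale::classic());
    oss << std::fixed << std::setprecision(decimals) << std::fabs(value);
    std::string text = oss.str();

    if (grouping) {
        // Insert ',' every three digits, right to left, in the integer part.
        std::string::size_type int_end = text.find('.');
        if (int_end == std::string::npos) {
            int_end = text.size();
        }
        for (std::string::size_type pos = int_end; pos > 3; pos -= 3) {
            text.insert(pos - 3, ",");
        }
    }

    return value < 0.0 ? "-" + text : text;
}
```

With this sketch, format_subset(1234567.891, "#,##0.00") gives
"1,234,567.89". The point of reusing the libxslt code is precisely
that we would not have to write and test the rest of the syntax
ourselves.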

Pros and cons of the approach.

Pros:
+ Flexibility. These patterns could be used directly in the xsl
templates, which gives us more freedom in the future. We could even
consider again (someday) pushing the number formatting into libxslt,
and that would not require any changes to the format.xml file, which
could already reside on the client side and be customized to the
client's needs.
+ The format is well known and already tested very well.
+ We already have an implementation of formatting that uses this
pattern syntax. The file numbers.c from libxslt provides a C
implementation:
http://cvs.gnome.org/viewcvs/libxslt/libxslt/numbers.c?view=markup (search for
xsltFormatNumberConversion).
+ The format originates from Java's java.text.DecimalFormat class,
where it is also implemented and well tested by the Java community.
(http://java.sun.com/j2se/1.4.2/docs/api/java/text/DecimalFormat.html)
This link contains the format specification, which the libxslt
implementation follows carefully.
+ Stability. If you look at the libxslt/numbers.c file in GNOME CVS,
you'll see that the last change to the function is 16 months old,
which suggests that the probability of a bug in the implementation is
small.
+ We don't reinvent the wheel. That means no need for rigorous testing
of our own, and no time spent implementing the thing.

Cons:
- The format is much more powerful than what lmi could possibly need.
This means that the same (possibly complex) pattern will be
copied and pasted throughout the entire format.xml file.
Possible workaround: use an indirect format specification, like so
(snippet from format.xml):
<columns>
   <format id="f1">#,###.</format>
   <format id="f4">#,###.##</format>
...
   <column name="Age" format_id="f1" />
</columns>
This adds some processing at startup, but it completely solves the
readability problem for format.xml and avoids the errors that
copy/pasting could introduce.
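
The startup step could look roughly like the sketch below. It assumes
format.xml has already been parsed into a map from format id to
pattern and a list of columns; the names column_spec and
resolve_pattern are hypothetical and only illustrate the lookup.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical in-memory form of one <column> element of format.xml.
struct column_spec
{
    std::string name;
    std::string format_id;
};

// Look up the pattern a column refers to. An unknown id is reported
// at startup instead of silently producing unformatted output.
std::string resolve_pattern
    (std::map<std::string, std::string> const& formats
    ,column_spec const& column
    )
{
    std::map<std::string, std::string>::const_iterator const it =
        formats.find(column.format_id);
    if (it == formats.end()) {
        throw std::runtime_error
            ("Column '" + column.name
            + "' refers to unknown format '" + column.format_id + "'."
            );
    }
    return it->second;
}
```

For the snippet above, resolving the "Age" column yields the pattern
stored under "f1", so each pattern is written exactly once in the
file.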

- It is a code snippet from another project. I have read the license;
I'm not 100% sure, but I think we can freely use and modify the code
as long as we retain the libxslt copyright notice in the file. So this
should not be a problem.

Currently we use hardcoded formatting patterns for double values.
Adopting the syntax described above will require some (not-so-complex)
modifications to the existing code, and we will also need to integrate
the xsltFormatNumberConversion function from libxslt/numbers.c into
the project (preserving the original libxslt copyright notices).

What do you think about this?
