Re: [lmi] Replace XSL with wxPdfDocument?


From: Greg Chicares
Subject: Re: [lmi] Replace XSL with wxPdfDocument?
Date: Wed, 04 Nov 2015 02:15:37 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.3.0

On 2015-11-03 23:12, Vadim Zeitlin wrote:
> On Tue, 03 Nov 2015 19:51:46 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> Our experimental use of wxPdfDocument for group quotes seems to have
> GC> been a great success. I think I spent plenty of time myself working
> GC> in the code that generates PDFs, and I don't remember anything PDFish
> GC> about it, so I went back and examined it again...and it really is
> GC> transparent. It looks like the way we used to draw things on the
> GC> screen in msw-1.0 thirty years ago--picking brushes and drawing into
> GC> device contexts. I couldn't have written the group-quote code myself,
> GC> but I can maintain it easily--it all seems so clear as to be obvious.

This is how to recognize superior programmers: you praise their work, and
in reply they tell you all their misgivings.

>  I have to say that I hesitated for quite some time between using
> wxPdfDocument API directly and using wxPdfDC, which implements wxDC API on
> top of wxPdfDocument. I finally chose the latter because it's higher level
> and so meant I could do the work faster and also because it allowed me to
> use the existing wxHTML code to render HTML directly into PDF -- which was
> another productivity boost. Finally, a potential, although not currently
> realized advantage of using wxPdfDC, is that the same code could be used
> to preview the PDF on screen and I thought it could be useful to allow
> doing this in the future.

Let me say a big "amen" to that.

Just today, we (here in the US) were talking about a terrible problem we had
years ago. Someone's msw box wasn't configured to view PDF files automatically
(it was missing whatever registry key wxMimeTypesManager uses). This box was
in a satellite office of a broker whose head office was a thousand miles away,
and that extra degree of separation made long-distance debugging even worse
than it usually is. That cost us many hours of effort. We don't just have an
external dependency on a pdf viewer; we also have a dependency on the msw
registry. Those dependencies can change in unimaginable ways: often these days
I see messages on the Cygwin mailing list that say msw-10 updates have broken
something that worked fine for decades, so the msw universe is in accelerated
decay. Removing external dependencies is a good thing.

You mention speed and responsiveness below as further advantages.

And if we have this capability inside lmi, we may find uses that are difficult
even to imagine today. For example, perhaps it becomes fast enough that we can
offer it as an alternative to the present calculation summary.

>  But there are drawbacks in using wxDC-compatible API instead of native PDF
> one as well. Mainly they're due to the fact that wxDC API is ancient and
> doesn't support many features of modern graphics APIs, such as (but not
> limited to)
> 
> - Arbitrary affine coordinate system transformations.
> - Alpha channel (transparency).
> - Non integer coordinates.
> - Paths.
> 
> All of those happen to be supported by the PDF format and while I am not
> sure if we need any of them right now, it might be unwise to restrict
> ourselves to a couple of decades old API without any support for them.

Our needs are permanently so limited that at one point I considered using
wxHTML for all reporting. Our output is tables of numbers, with text headers
and footers and an occasional bitmap like a company logo thrown in. This is
never going to change.

I can't imagine we'll ever need transparency. We don't do charts and graphs;
that's a job for spreadsheets, to which we already export data in enough
different ways. We'll never do a better job than spreadsheets; we shouldn't
even think of trying.

"Paths": do you mean the stuff that resembles turtle graphics--start at
point (100,100), draw a line segment to (50,50), and so on? We've never done
anything more complicated in that vein than the box you drew around the
group-quote "Summary".

As for non-integer coordinates, and coordinate transformations, I'll defer
to you; I don't see why we'd need them. We don't turn squares into trapezoids;
we just print tables of numbers along with a little text. Scaling the company
logo for group quotes was the most ambitious affine transformation we've ever
aspired to.
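
To make "scaling the logo" concrete, here is a minimal sketch of the arithmetic, assuming we only ever want a uniform scale that preserves the aspect ratio. The names Size, fit_scale, and scaled are hypothetical, not taken from lmi or wx:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type: dimensions in any consistent unit.
struct Size
{
    double width;
    double height;
};

// Largest uniform scale factor at which 'image' still fits inside 'box'.
double fit_scale(Size image, Size box)
{
    assert(0.0 < image.width && 0.0 < image.height);
    return std::min(box.width / image.width, box.height / image.height);
}

// Apply the same factor to both dimensions, preserving aspect ratio.
Size scaled(Size image, double factor)
{
    return {image.width * factor, image.height * factor};
}
```

A 400x200 logo placed in a 100x100 box gets a factor of 0.25 and comes out as 100x50. Nothing here needs non-integer device coordinates unless we want subpixel placement.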

>  OTOH I still think that the possibility to use the same code to generate
> PDFs and show them on screen for previewing is quite nice. Of course, lmi
> currently lets us "postview" the generated PDF files which is not that
> different but I think previewing them before the generation could be even
> nicer. If nothing else, it should be much faster for big PDFs as we
> wouldn't have to generate the entire PDF at once but could start previewing
> immediately (well, after generating the first page).

Responsiveness is good, and the speedup could be dramatic.

>  So it's not totally clear to me whether the new PDF generation code should
> use wxDC-compatible API or directly work at PDF level. There is also
> another possibility: implement a wxGraphics backend using the low-level PDF
> API. This would retain the advantage of letting us use the same code for
> PDF generation and previewing while giving us access to all the features
> above as they're supported by wxGraphics. This would, however, require extra
> work as such wxPdfGraphics doesn't exist (although wxPdfDocument
> documentation says it's planned...).

Let me summarize:

wxPdfDC advantages over wxPdfDocument
  higher level: faster delivery, smaller code, less cost, easier maintenance
  lets us render HTML directly into PDF, leveraging wxHTML
  could be used to preview PDF inside lmi
wxPdfDC disadvantages
  wxDC API is ancient
  wxDC API lacks some nice stuff that lmi will probably never need

wxPdfDC seems to win by a landslide. When you say "ancient", that does worry
me, as it may connote "not well understood" and "unmaintained". Let me ask
you some questions about this and wxPdfGraphics off the list.

> GC> Most of our PDFs are created with XSL-FO, which, in comparison, is
> GC> incomprehensible. Can we translate all that old XSL-FO junk to clean
> GC> wxPdfDocument code?
> 
>  We definitely could, the question is just about how exactly to do it. One
> strategic question is whether we want to put everything in the XSL files in
> C++ code or if we still want the PDF format/layout to be configurable via
> the external files? In principle, the idea of not hard coding all this
> seems quite attractive to me, IMHO it's just the awfulness of the XSL
> syntax that makes it unappealing in practice and I guess we could come up
> with some much simpler and saner external format defining the document to
> output. Or do you think this would just lead us to reinvent XSL, poorly (a
> truly horrifying thought)?

I was about to say that this would just lead us to reinvent XSL.

PDF illustrations certainly do have a structure. They're composed of tables.
Each table is essentially a numeric matrix with so many rows that it won't
fit on a single physical page, so we break it into chunks (with a blank line
after every five years, e.g.) that, with repeated headers and footers, are
written to physical pages. All the physical pages are numbered, e.g., as
"Page 3 of 7 pages"; that's a legal requirement. And we add a logo to, e.g.,
the first or last page. Create some samples and take a look, or we'll send
you a representative set of sample PDFs if you prefer; I don't think there's
any more to it than that (today or ever).
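
The page-numbering part is plain arithmetic. A rough sketch, assuming every physical page holds the same number of data rows (a whole number of five-year groups, say); page_count and page_footer are made-up names for illustration only:

```cpp
#include <cassert>
#include <string>

// Number of physical pages needed for 'rows' data rows when each page
// holds at most 'rows_per_page' of them (ceiling division).
int page_count(int rows, int rows_per_page)
{
    assert(0 <= rows && 0 < rows_per_page);
    return (rows + rows_per_page - 1) / rows_per_page;
}

// Footer text in the "Page 3 of 7 pages" form.
std::string page_footer(int page, int pages)
{
    return
          "Page " + std::to_string(page)
        + " of "  + std::to_string(pages)
        + " pages"
        ;
}
```

With 100 rows at 15 rows per page, that gives seven pages. The total has to be known before the first footer is written, so the count is computed up front rather than discovered while writing.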

My inclination would be to map this structure onto imperative C++ code:
  WriteAllTables
    WriteOneTableSpanningMultiplePhysicalPages
      WriteOnePhysicalPageOfOneTable
Comprehensible. Flexible. Extensible. Maintainable. Transparent.
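
Here is a minimal sketch of that shape, only to show how little there is to it. It writes plain text to a std::ostream rather than drawing into a wxPdfDC, and the Table struct, the two constants, and PageCount are assumptions made up for the illustration; only the three Write* names come from the outline above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one illustration table.
struct Table
{
    std::string header;
    std::vector<std::string> rows; // one preformatted line per year
};

constexpr std::size_t rows_per_group = 5;  // blank line after every five years
constexpr std::size_t rows_per_page  = 10; // assumed fixed page capacity

std::size_t PageCount(Table const& t)
{
    return (t.rows.size() + rows_per_page - 1) / rows_per_page;
}

void WriteOnePhysicalPageOfOneTable
    (std::ostream& os, Table const& t
    ,std::size_t first_row, std::size_t page, std::size_t pages)
{
    os << t.header << '\n'; // header repeated on every page
    std::size_t const last_row =
        std::min(first_row + rows_per_page, t.rows.size());
    for(std::size_t i = first_row; i < last_row; ++i)
    {
        os << t.rows[i] << '\n';
        if(0 == (i + 1) % rows_per_group) os << '\n';
    }
    os << "Page " << page << " of " << pages << " pages\n";
}

void WriteOneTableSpanningMultiplePhysicalPages
    (std::ostream& os, Table const& t, std::size_t& page, std::size_t pages)
{
    for(std::size_t first = 0; first < t.rows.size(); first += rows_per_page)
    {
        WriteOnePhysicalPageOfOneTable(os, t, first, ++page, pages);
    }
}

void WriteAllTables(std::ostream& os, std::vector<Table> const& tables)
{
    // The total page count must be known before the first footer is written.
    std::size_t pages = 0;
    for(auto const& t : tables) pages += PageCount(t);
    std::size_t page = 0;
    for(auto const& t : tables)
    {
        WriteOneTableSpanningMultiplePhysicalPages(os, t, page, pages);
    }
}
```

Swapping the ostream for a device context would change the bodies of the innermost function but not the structure.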



