Re: [lmi] Embed {{MST}} and <html> in product database


From: Vadim Zeitlin
Subject: Re: [lmi] Embed {{MST}} and <html> in product database
Date: Fri, 26 Jul 2019 00:53:33 +0200

On Wed, 24 Jul 2019 18:10:26 +0000 Greg Chicares <address@hidden> wrote:

[..snip my bad idea...]
GC> For example, today we have a <ProductDescription> element whose content
GC> varies from one proprietary product to another. It's neatly split into
GC> paragraphs in the source code that generates the '.policy' files. But
GC> that formatting doesn't come through into PDF output--instead, the
GC> paragraphs all run together into an unattractive giant blob of text.
GC> 
GC> How can blobs like that be paragraphed as intended?
[...]
GC> Thus, embedding pilcrows in '.policy' files does, in a
GC> sense, to some degree, violate an ideal separation of concerns;

 Actually, I think that it doesn't, because, as you convincingly explained,
nobody thinks about the "product description" as 4 paragraphs of text, but
rather as a single multi-paragraph text. So in this case I would be
perfectly happy to just handle line breaks in the value intelligently and
convert them either to "</p><p>" or, maybe, just "<br>" in HTML.
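
 A minimal sketch of the "</p><p>" variant, only to make the idea concrete.
The function name paragraphs_to_html is hypothetical (nothing like it is
claimed to exist in lmi), and the sketch assumes that every '\n' in the value
separates two paragraphs and that the text has already been HTML-escaped.

```cpp
#include <string>

// Hypothetical helper, not part of lmi: wrap each newline-separated
// paragraph of a '.policy' string in <p>...</p>.
// Assumption: the text is already HTML-escaped; empty paragraphs
// (consecutive newlines) are silently skipped.
std::string paragraphs_to_html(std::string const& text)
{
    std::string html;
    std::string::size_type begin = 0;
    while(begin <= text.size())
        {
        std::string::size_type end = text.find('\n', begin);
        if(std::string::npos == end)
            {
            end = text.size();
            }
        if(end != begin)
            {
            html += "<p>" + text.substr(begin, end - begin) + "</p>";
            }
        begin = end + 1;
        }
    return html;
}
```

 The "<br>" variant would simply replace each '\n' with "<br>" instead of
closing and reopening a paragraph.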

GC> But let's try looking at this in a different way--not as two levels
GC>   {content, formatting}
GC> but as three:
GC>   {content, structure, presentation}

 I'm not sure if this is really helpful because content+structure still
remain intertwined together in the .policy files.

GC> Alternatively, maybe we could use '_' and '||': i.e.,
GC>   _strong_
GC> which is in broad general usage already, and
GC>   end of old paragraph||beginning of new paragraph
GC> which is apparently what Sanskrit uses for a pilcrow. And this
GC> "structural" markup could be translated in a distinct C++ function,
GC> so we're never writing html markup in '.policy' files.

 You're basically proposing to use Markdown in policy files: it honours the
explicit line breaks between paragraphs (while still soft-wrapping the
paragraphs themselves) and provides basic markup with "_", "*" and "`" as
special characters. I do agree that Markdown is much better than HTML, even
though it's still not as simple as I'd like (and while we could implement
support for just a very limited Markdown subset, of course, I'd be
surprised if people didn't start complaining about the missing parts).

GC> > GC> How can we move forward now, without that much labor?
GC> > GC> 
GC> > GC> Of the two ideas presented, for strings in '.policy' files:
GC> > GC>  (1) allow mustache substitutions
GC> > GC>  (2) allow markup for boldface and paragraphing
GC> > GC> it seems that:
GC> > GC> 
GC> > GC>  - You don't strenuously object to (1)...so can we decide now how to
GC> > GC> implement it? E.g., invoke a function twice, as in my experimental
GC> > GC> patch; or rewrite the function so that it doesn't need to be called
GC> > GC> twice?
GC> > 
GC> >  It looks like we should use Mustache partials for this: they're exactly
GC> > what is used for including strings in Mustache syntax from elsewhere. But
GC> > this would make sense only if the string came from the policy file
GC> > directly, we could then (easily) implement something like
GC> > {{<policy:field}} to get it from there and expand.
GC> 
GC> Such is not the case. Recall that product parameters are embodied
GC> principally in two types of files:
GC>  - '.database': numeric data
GC>  - '.policy': string data

 Just to be clear, I only suggested using partials for .policy file fields,
and even then only for those for which it's necessary to do it. I.e. for
simple fields, not requiring Mustache interpolation of their contents, we'd
still continue to use just {{field}} syntax and {{<policy:field}} would be
available in addition to it.

 Do you still object to doing it even so?
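
 To make the distinction concrete, here is a rough sketch under stated
assumptions: {{field}} inserts the value verbatim, while {{<policy:field}}
inserts the value after interpolating it in turn. The function name, the
lookup callback and the depth limit are all hypothetical and are not taken
from lmi's actual interpolation code.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical callback returning the value of a named field.
using lookup_function = std::function<std::string(std::string const&)>;

// Sketch only: expand "{{name}}" verbatim and "{{<policy:name}}"
// recursively. An unterminated "{{" is left in the output as-is.
std::string interpolate
    (std::string const&     s
    ,lookup_function const& lookup
    ,int                    depth = 0
    )
{
    if(10 < depth) // Arbitrary guard against self-referential fields.
        {
        throw std::runtime_error("Partials nested too deeply.");
        }
    std::string const prefix = "<policy:";
    std::string out;
    std::string::size_type pos = 0;
    for(;;)
        {
        std::string::size_type const open = s.find("{{", pos);
        if(std::string::npos == open) break;
        std::string::size_type const close = s.find("}}", open + 2);
        if(std::string::npos == close) break;
        out += s.substr(pos, open - pos);
        std::string const name = s.substr(open + 2, close - open - 2);
        if(0 == name.compare(0, prefix.size(), prefix))
            {
            // Partial: the field's value may itself contain {{...}}.
            std::string const value = lookup(name.substr(prefix.size()));
            out += interpolate(value, lookup, depth + 1);
            }
        else
            {
            // Simple field: its value is inserted without expansion.
            out += lookup(name);
            }
        pos = close + 2;
        }
    return out + s.substr(pos);
}
```

 With a field "Footnote" whose value is "Issued by {{Company}}.", the
template "{{Footnote}}" would yield that value unchanged, whereas
"{{<policy:Footnote}}" would also expand {{Company}} inside it.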

GC> It's not a choice of starting to do this now versus at some time in
GC> the future. The point is that we want to avoid doing this at all.
GC> To explain why, let's step back and start a sidebar discussion here.
[...snip sidebar...]

 Thanks for the explanation, I'm convinced now that what you propose is
indeed the only possible way forward and, moreover, I don't find it so bad
knowing that it's used for things like emphasizing some strings in the
running text.

 I still don't feel that great about encoding list item markup in the
policy files, but I guess I'll just have to live with it.

GC> Some existing work, like 'InforceNonGuaranteedFootnote[0-3]' above,
GC> effects such a separation of concerns, but that cure is worse than
GC> the disease: the price we pay for that rigidity is fragmenting the
GC> content, and constructing channels to pass an unwieldy number of
GC> fragments individually along a chain
GC>   product files --> ledger classes --> mustache templates --> PDF
GC> with distinct variables for all the fragments. That's just unworkable.

 Yes, agreed.

GC> Yes, no matter what particular characters we choose, the more I think
GC> about this, the clearer it becomes that pilcrows and guillemets are
GC> sufficient and less offensive than other approaches. I'm thinking
GC> that "||", "<<", and ">>" might be our best option because they're
GC> pure ASCII, and should never arise in descriptions of life insurance
GC> (which never use "||" for logical OR, or "<<" for "much less than").

 I really think we should go with the standard conventions (used by
Markdown but really predating it by a couple of decades in common use) and
use "_" and/or "*" for emphasis. And I think even some lawyers would be
familiar with this use, unlike the use of "«" or "¶".

GC> No map (see above); but I'm glad you wrote this anyway, because it
GC> points out that the extra step of processing whatever guillemets,
GC> pilcrows, and {{partials}} may be contained in '.policy' strings is,
GC> conceptually at least, a distinct step--so should it be a physically
GC> distinct step, e.g.
GC>   left  guillemet --> <strong>  (or  <em>, or  <b>)
GC>   right guillemet --> </strong> (or </em>, or </b>)
GC> applied only to strings taken from '.policy' files?

 Yes, I definitely still see value in not hardcoding HTML tags in them
directly.
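
 As an illustration of such a physically distinct step, here is a minimal
sketch using the ASCII markers mentioned above ("<<", ">>" and "||"). The
function name structural_to_html is hypothetical, the choice of <strong>
over <em> or <b> is arbitrary, and the sketch assumes that the string needs
no other HTML escaping.

```cpp
#include <string>

// Hypothetical translation step applied only to '.policy' strings:
//   "<<" --> <strong>
//   ">>" --> </strong>
//   "||" --> paragraph break
// The string is scanned once, so generated tags are never rescanned.
// Assumption: no other HTML escaping is required.
std::string structural_to_html(std::string const& s)
{
    std::string html = "<p>";
    for(std::string::size_type i = 0; i < s.size(); ++i)
        {
        std::string const two = s.substr(i, 2);
        if     ("<<" == two) {html += "<strong>";  ++i;}
        else if(">>" == two) {html += "</strong>"; ++i;}
        else if("||" == two) {html += "</p><p>";   ++i;}
        else                 {html += s[i];}
        }
    return html + "</p>";
}
```

 Keeping the marker-to-tag mapping in one such function means that changing
the presentation later would not require touching any '.policy' file.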

GC> I therefore change my original proposal accordingly, withdrawing the
GC> <p> and <b> suggestion in favor of non-html, structural-only (i.e.,
GC> non-presentational) markup. To use any html at all would be to take
GC> the first step down a slippery slope, so we'll just rule that out
GC> categorically.
GC> 
GC> Are we ready to proceed to the implementation details?

 There is still the question of whether to use Mustache partials or not.
This, of course, affects only the use of MST in the .policy files, but not
HTML/markup, so if this has higher priority (as I guess it might), then the
answer to the question above is "yes".

 Moreover, I think this discussion could be rather short: let's just
implement support for a _very_ minimal Markdown subset right now. The only
real question I have is about error checking/reporting: what would be the
best way to deal with things like "This is _important, even if the closing
underscore has been forgotten"? The worst would be to generate an unclosed
<i> (or whatever) tag, but we won't do this. But should we ignore this lone
underscore (i.e. drop it), preserve it verbatim, warn the user about it
(but they don't control the .policy files' contents, so what good would
this warning do them?) or try to fix it automatically (tempting, but
potentially dangerous)?
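
 For the sake of discussion, here is a sketch of just one of these options:
preserve the lone underscore verbatim and report it to the caller, which
can then decide whether to warn, log or ignore. All the names are
hypothetical, and the sketch assumes that every underscore in the text is
meant as an emphasis marker (so identifiers like snake_case would be
misinterpreted).

```cpp
#include <string>

// Hypothetical result type: the generated HTML plus a flag telling
// the caller whether an underscore without a partner was found.
struct emphasis_result
{
    std::string html;
    bool        has_unmatched_underscore;
};

// Sketch only: convert "_text_" to "<em>text</em>". If an opening
// underscore has no closing one, the rest of the string is kept
// verbatim, so an unclosed tag is never generated.
emphasis_result underscores_to_em(std::string const& text)
{
    emphasis_result r {"", false};
    std::string::size_type pos = 0;
    while(pos < text.size())
        {
        std::string::size_type const open = text.find('_', pos);
        if(std::string::npos == open)
            {
            r.html += text.substr(pos);
            break;
            }
        std::string::size_type const close = text.find('_', open + 1);
        if(std::string::npos == close)
            {
            // Lone underscore: keep the remainder as is and flag it.
            r.html += text.substr(pos);
            r.has_unmatched_underscore = true;
            break;
            }
        r.html += text.substr(pos, open - pos);
        r.html += "<em>" + text.substr(open + 1, close - open - 1) + "</em>";
        pos = close + 1;
        }
    return r;
}
```

 Dropping the lone underscore instead would be a small change in the
branch handling the missing closing marker; automatic fixing would require
guessing where the emphasis was meant to end.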

 Please let me know what you think about this, thanks in advance,
VZ
