Re: [lmi] Embed {{MST}} and <html> in product database


From: Vadim Zeitlin
Subject: Re: [lmi] Embed {{MST}} and <html> in product database
Date: Sat, 27 Jul 2019 20:32:32 +0200

On Sat, 27 Jul 2019 17:53:35 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2019-07-27 00:29, Vadim Zeitlin wrote:
[...]
GC> But let's be clear about who sees such markup, whether it be
GC> 
GC>      '\n', "¶", or "||" or "{{paragraph}}" for <br>
GC> 
GC>      "_abc_", "«abc»", "「abc」", or "{{em}}abc{{/em}}" for <strong>

 Yes, I realize that it's not seen by the end users. But I'm still in
favour of using clear and more or less self-documenting formats rather than
hermetically confusing ones when I have a choice. It looks like this battle
is lost, however, so I'll just give up and hope that Kim is not as eager to
use weird Unicode symbols as you are, and that this might still affect your
decision.

GC> so could you reimplement interpolate_string() to accept configurable
GC> (single- or multiple-character, ASCII or multibyte) tokens, e.g.
GC>   std::string const strong_open   = "<<";
GC>   std::string const strong_close  = "<<";
GC>   std::string const new_paragraph = "||";
GC> so that Kim and I can experiment with alternatives and pick the one
GC> we find most convenient (which could be hardcoded later)?

 Sure, except I don't think this belongs in interpolate_string() at all.
It should be some other function in the PDF generation code, as this layer
is separate from string interpolation.
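
 As an illustration only, such a separate function could look roughly like
the sketch below. The names markup_to_html() and token_map are hypothetical
and do not exist in lmi; the tokens themselves are passed in by the caller,
so different candidates can be tried without touching interpolation. The
string is scanned once, left to right, so that HTML already emitted is never
rescanned for tokens.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: pairs of (markup token, HTML replacement).
using token_map = std::vector<std::pair<std::string, std::string>>;

// Replace configurable markup tokens with HTML in a single pass.
// Tokens are compared bytewise, so they may be single- or
// multiple-character, ASCII or multibyte. If several tokens match at
// the same position, the first one listed in 'tokens' wins.
std::string markup_to_html(std::string const& s, token_map const& tokens)
{
    std::string out;
    std::string::size_type pos = 0;
    while(pos < s.size())
    {
        bool matched = false;
        for(auto const& t : tokens)
        {
            if(!t.first.empty() && 0 == s.compare(pos, t.first.size(), t.first))
            {
                out += t.second;
                pos += t.first.size();
                matched = true;
                break;
            }
        }
        if(!matched)
        {
            out += s[pos++];
        }
    }
    return out;
}
```

 Hardcoding the chosen tokens later would then only mean fixing the
contents of the table passed to this function.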


GC> >  I'm even more sure we don't need the pilcrows because we can just use
GC> > raw C++11 strings instead:
GC> > 
GC> >   std::string const s = R"(
GC> >           This is one paragraph.
GC> > 
GC> >           And this is a separate paragraph.
GC> >           )";
GC> > 
GC> > (and here spaces insignificance in HTML actually plays for us because we
GC> > can indent the string in any way we want -- or not).
GC> 
GC> Interesting idea, but it's not going to work for us.

 I gave up above, but I'm still going to try to argue this one: I think it
will work just fine.

GC> Consider the effect at various steps:
GC>   (a) source code that generates '.policy' files
GC>   (b) '.policy' files themselves
GC>   (c) internal variables used to generate PDF output
GC>   (d) PDF output files
GC> Such indentation is erased (by HTML) before step (d). Step (c)
GC> involves only transient data, where extra spaces can only be
GC> seen with a debugger. In step (a), it might not seem to matter.
GC> But step (b) involves physical files, where extra spaces matter.
GC> Very often these days we're making changes in (a) that are
GC> carefully designed to have no effect in (b), which makes
GC> acceptance testing trivial; it wouldn't be trivial if (b) could
GC> change in a way that would be equivalent but not identical. And
GC> of course a GUI product editor operates on the (b) files, so it
GC> would show all the superfluous whitespace.

 I'm sorry, but I don't follow this argument at all. Indentation is
immaterial here: you may use it or not, it doesn't matter (just as I wrote
above). So if you don't want any indentation in the .policy files, let's
not have it there; there is absolutely no problem with this.

 The only important thing here is the blank line between the paragraphs.
And this line shouldn't disappear in any of the intermediate steps.
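
 To make this concrete, here is a minimal sketch of splitting such a string
into paragraphs. The helper names trim() and split_paragraphs() are
hypothetical, not existing lmi functions. Only blank lines separate
paragraphs; indentation is discarded, so an indented and an unindented
version of the same text give identical results.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(std::string const& s)
{
    auto const first = s.find_first_not_of(" \t\r");
    if(std::string::npos == first)
    {
        return std::string();
    }
    auto const last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Split text into paragraphs separated by one or more blank lines.
// Lines within a paragraph are joined with a single space.
std::vector<std::string> split_paragraphs(std::string const& text)
{
    std::vector<std::string> paragraphs;
    std::string current;
    std::istringstream iss(text);
    std::string line;
    while(std::getline(iss, line))
    {
        std::string const t = trim(line);
        if(t.empty())
        {
            // A blank (or whitespace-only) line ends the paragraph.
            if(!current.empty())
            {
                paragraphs.push_back(current);
                current.clear();
            }
        }
        else
        {
            if(!current.empty())
            {
                current += ' ';
            }
            current += t;
        }
    }
    if(!current.empty())
    {
        paragraphs.push_back(current);
    }
    return paragraphs;
}
```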

 Do you still think there is a problem here? Either I'm missing something
very obvious or there was some misunderstanding here, because I just don't
see any problem at all.

GC> > GC> >  Just to be clear, I only suggested using partials for .policy files fields
GC> > GC> > and even then only for those for which it's necessary to do it. I.e. for
GC> > GC> > simple fields, not requiring Mustache interpolation of their contents, we'd
GC> > GC> > still continue to use just {{field}} syntax and {{<policy:field}} would be
GC> > GC> > available in addition to it.
GC> > GC> > 
GC> > GC> >  Do you still object to doing it even so?
GC> > GC> 
GC> > GC> Yes, because I see no advantage to doing so.
GC> > 
GC> >  One immediate advantage is that we avoid double interpolation that you've
GC> > been (understandably) unhappy about.
GC> 
GC> That's an aesthetic shortcoming, but I'm not sure whether
GC> it's of any practical importance.

 Well, simple code is usually preferable to more complicated code, so it's
more than just aesthetic. And there is another advantage in doing it for
recursive expansion, as described in the parallel "Empty paragraphs in
HTML-MST" thread.

GC> > With partials, this interpolation will
GC> > be done only when necessary inside interpolate_string() itself.
GC> > 
GC> > GC> If 'sample2xyz.policy' contains
GC> > GC>   NameOfPolicy = "group insurance certificate";
GC> > GC>   Footnote = "Read your {NameOfPolicy} carefully";
GC> > GC> then that's already as simple as possible and as powerful as necessary.
GC> > 
GC> >  It's confusing though, as you never know whether your string is going to
GC> > be interpolated or not and how many times.
GC> 
GC> Restating the example above because I failed to double the braces:
GC>   NameOfPolicy = "group insurance certificate";
GC>   Footnote = "Read your {{NameOfPolicy}} carefully";
GC> After substitution, that needs to become:
GC>   "Read your group insurance certificate carefully"
GC> in which case it must have been interpolated exactly once, leaving
GC> nothing further to interpolate. I don't see any confusion here.

 I thought that we would interpolate fields coming from the .policy files
only (and not the numeric fields computed on the fly, for example). I guess
there is no confusion if everything is always interpolated.

GC> Maybe you're thinking of recursive, multi-level interpolation:
GC> 
GC>   NameOfPolicy = "certificate";
GC>   PolicyNumber = "12345";
GC>   name_and_number = "{{NameOfPolicy}} number {{PolicyNumber}}";
GC>   Footnote = "Read your {{name_and_number}} carefully";
GC>   result: "Read your certificate number 12345 carefully"
GC> 
GC> The first non-comment line of interpolate_string() is
GC>     if(100 <= recursion_level)
GC> so it's already recursive.

 No, it isn't for plain fields. It's only recursive for partials.

GC> We've never cared about its actual recursion level in the past, so why
GC> should we care about it now?

 I'm afraid that there might be a fundamental misconception about the
difference between "fields" and "partials" in standard Mustache and the
current implementation, which I've failed to explain clearly, so let me do
this now:

 The difference between fields and partials is not just where their value
comes from (the program for the former and a disk file for the latter), but
also that the former values are used as is while the latter ones are
interpolated further.

 This is why I keep suggesting that we use partials instead of fields. And
if we do want to have recursive expansion at all, then the only alternative
to using partials is to make field expansion recursive by default. This is
a possibility, of course, but this is _not_ how the current code works and
I don't like this idea.
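
 The distinction can be shown with a toy expander. This is a simplified
sketch and not the real interpolate_string(): the function expand(), the
dictionary type and the use of two separate maps are assumptions made for
illustration. It uses the standard Mustache ">" marker for partials. A
field value is inserted verbatim; a partial value is expanded again.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical type: variable name -> value.
using dictionary = std::map<std::string, std::string>;

// Expand "{{name}}" (field: value used as is) and "{{>name}}"
// (partial: value expanded again). Unknown names make map::at()
// throw std::out_of_range; an unmatched "{{" is copied verbatim.
std::string expand
    (std::string const& s
    ,dictionary  const& fields
    ,dictionary  const& partials
    ,int                depth = 0
    )
{
    if(100 <= depth)
    {
        throw std::runtime_error("Nesting level too deep");
    }
    std::string out;
    std::string::size_type pos = 0;
    for(;;)
    {
        auto const open = s.find("{{", pos);
        auto const close =
            std::string::npos == open ? open : s.find("}}", open + 2);
        if(std::string::npos == close)
        {
            out.append(s, pos, std::string::npos);
            break;
        }
        out.append(s, pos, open - pos);
        std::string const name = s.substr(open + 2, close - open - 2);
        if(!name.empty() && '>' == name[0])
        {
            // Partial: its value is itself interpolated.
            out += expand(partials.at(name.substr(1)), fields, partials, depth + 1);
        }
        else
        {
            // Field: its value is inserted without further processing.
            out += fields.at(name);
        }
        pos = close + 2;
    }
    return out;
}
```

 With name_and_number defined as "{{NameOfPolicy}} number {{PolicyNumber}}",
"{{name_and_number}}" leaves the inner braces in the output, while
"{{>name_and_number}}" yields the fully expanded text.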

GC> Or maybe you're thinking of using exactly two namespaces:
GC>   {{<file:FileToBeIncluded.mst}}
GC>   {{<all_other:NameOfSomeVariable}}
GC> in order to overload '{{<'.

 Yes, exactly.

GC> That would add confusion IMO: the example above
GC> would still be written the same way as always in an '.mst' file:
GC>   NameOfPolicy = "group insurance certificate";
GC>   Footnote = "Read your {{NameOfPolicy}} carefully";
GC>   result: "Read your group insurance certificate carefully"
GC> but if we moved that footnote into a '.policy' file, we'd have to change it:
GC>   Footnote = "Read your {{<policy:NameOfPolicy}} carefully";
GC>                           ^^^^^^^^ we'd need to add these characters
GC> It would be simpler not to change it.

 Actually we'd need to change {{Footnote}} to {{<policy:Footnote}}. The
contents of Footnote itself would remain unchanged. But yes, this is indeed
an extra change.

GC> Maybe I'm missing something, because I perceive no advantage to using
GC> partials or namespaces in '.policy' files.

 I think you're saying this because you're under the impression that
recursive expansion will always happen by default anyhow, which is not the
case. And so to me using partials is just an obvious way of requesting the
recursive expansion.

GC> >  Note that if we don't use partials, we'll have to interpolate all the data
GC> > coming out of the policy files. This is probably not the end of the world,
GC> > but it does feel a bit strange to do it.
GC> 
GC> That's what I don't understand. To me, it seems perfectly natural,
GC> obvious, and elegant.
GC> 
GC> Are you concerned about speed? Of course, I haven't measured it, but
GC> it seems likely to me that the first pass or two will interpolate
GC> just about everything, and if the "multi-level interpolation" example
GC> above requires one additional pass, that additional pass should be
GC> cheap, because there's only one "{{" left to find and deal with.

 I'm not concerned about speed (although recursive interpolation will, of
course, be slower than doing it non-recursively), but about simplicity. As
you surely don't intend to write 3 (or 4, or 5, ...) nested calls to
interpolate_string() in the code, you clearly want (or assume) that we
change interpolation to be recursive by default, and this will make things
less simple _conceptually_, let alone at the code level. E.g. currently
interpolating {{Foo}} is very simple. Interpolating {{<Bar}} is simple
enough too, but "<"[*] is there to show that it's not exactly the same
thing. With recursive interpolation, if Foo={{Bar}} and Bar={{Foo}},
interpolating {{Foo}} is not so simple at all any more because it results
in a run-time error due to exceeding recursion depth. As I think that the
vast majority of expansions will continue to be simple, it just doesn't seem
right to me to make all of them recursive by default because we need a few
of them to be so. This is why GNU make has "=" and ":=" (and the latter is
preferred) and I think that this is why we should have simple "fields" and
recursively-expanded "partials" too.
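
 The Foo/Bar case can be reproduced with a toy version of
recursive-by-default expansion. Again, this is only a sketch under
assumptions: expand_recursively() and dictionary are hypothetical names, and
the depth limit of 100 merely mirrors the check quoted above. Every value is
expanded again, so a cycle can only be stopped by the depth limit.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical type: variable name -> value.
using dictionary = std::map<std::string, std::string>;

// Expand every "{{name}}" and then expand the resulting value again.
// Unknown names make map::at() throw std::out_of_range.
std::string expand_recursively
    (std::string const& s
    ,dictionary  const& vars
    ,int                depth = 0
    )
{
    if(100 <= depth)
    {
        // Reached only for pathologically deep or cyclic definitions.
        throw std::runtime_error("Nesting level too deep");
    }
    std::string out;
    std::string::size_type pos = 0;
    for(;;)
    {
        auto const open = s.find("{{", pos);
        auto const close =
            std::string::npos == open ? open : s.find("}}", open + 2);
        if(std::string::npos == close)
        {
            out.append(s, pos, std::string::npos);
            break;
        }
        out.append(s, pos, open - pos);
        std::string const name = s.substr(open + 2, close - open - 2);
        out += expand_recursively(vars.at(name), vars, depth + 1);
        pos = close + 2;
    }
    return out;
}
```

 With Foo = "{{Bar}}" and Bar = "{{Foo}}", expanding "{{Foo}}" throws once
the depth limit is reached, whereas the multi-level name_and_number example
expands normally.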

 I hope I've managed to explain myself more or less clearly this time, but
please let me know if I still haven't.

 Thanks,
VZ

[*] Actually I managed to mix up Mustache syntax from the very beginning
    and it uses "{{>partial}}" and not "{{<partial}}". But I thought it
    would be less confusing to stick to the wrong symbol being used at
    this stage of the discussion -- just don't be surprised if you look
    at the actual source, which uses ">".
