emacs-orgmode

Re: [O] "Smart" quotes


From: Mark E. Shoulson
Subject: Re: [O] "Smart" quotes
Date: Tue, 29 May 2012 20:51:39 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20120430 Thunderbird/12.0.1

On 05/29/2012 01:57 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson" <address@hidden> writes:
>
>> I guess it doesn't actually matter, but it starts to get weird if you
>> find yourself looking arbitrarily far back, and then you start
>> building in exceptions for crossing paragraph boundaries...
>
> True. I had the exporter in mind, where you always start at the
> beginning of the paragraph. It would be more difficult with search
> starting in the middle of the paragraph.

Maybe the on-screen stuff is no harder; will just have to see.

>> And then there's the fact that multi-paragraph quotes usually have an
>> open-quote for each paragraph but only one close-quote at the end...
>
> Some French typographers suggest using a close-quote at the beginning
> of the paragraph to avoid that confusion, or simply dropping them (since
> they are a pain to maintain anyway). I don't know about other languages
> but, if it's the same there, is it a good idea to bother implementing it?

I've never heard of it. But I think we may be overthinking this; we can drive ourselves crazy trying to compress a dozen different typographical traditions (and informal customs) into a few Elisp rules. On the other hand, I don't think we need to throw up our hands and give up either! :)

>> Actually keeping count of what level you're at, accurately, is
>> a classic example of a non-regular language; you need a push-down
>> automaton to keep count, and regular expressions don't cut it.
>
> This is limited to 2 levels.

True.

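For reference, a minimal sketch of what "keeping count" would involve if the writer typed only " at every level. It is in Python rather than Elisp for brevity, and `MARKS` and `nest_quotes` are hypothetical names, not anything in Org. The `depth` counter is the state a plain regexp substitution lacks; deciding whether a given " opens or closes is a crude heuristic here (look at the preceding character), and paragraph boundaries are ignored entirely.

```python
# Hypothetical sketch: choose quote marks by nesting depth.
# Outer level first, then inner level (English custom); deeper levels alternate.
MARKS = [("“", "”"), ("‘", "’")]

def nest_quotes(text, marks=MARKS):
    out = []
    depth = 0            # how many quotes are currently open
    last_opened = False  # was the most recent " treated as an opener?
    for i, ch in enumerate(text):
        if ch != '"':
            out.append(ch)
            continue
        prev = text[i - 1] if i > 0 else " "
        # Heuristic: a quote opens after whitespace, an opening bracket,
        # or directly after another opening quote; otherwise it closes.
        opening = prev.isspace() or prev in "([" or (prev == '"' and last_opened)
        if opening:
            out.append(marks[depth % len(marks)][0])
            depth += 1
        else:
            depth = max(depth - 1, 0)  # tolerate an unmatched close-quote
            out.append(marks[depth % len(marks)][1])
        last_opened = opening
    return "".join(out)
```

With only two sets of marks the counter never needs to distinguish more than "outer" and "inner", which is the sense in which the problem is limited to 2 levels.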
>> I'm rambling.  In sum, I'm going to start off /not/ trying to solve
>> that problem, and assume the writer is going to use alternating " and
>> ' as typography requires and not try to second-guess what level we're
>> at.
>
> You are right, the problem will be easier to solve with both " and '.
>
> Though, "as typography requires" is not true. In France, the /Imprimerie
> Nationale/ suggests using guillemets at both levels. Remember that
> typography is localized, which is the main difficulty of the
> implementation.

Also a good point.

All right, bottom line, this is sort of what I'm seeing. I'm not 100% sure which files should house these things, but something like this:

1) a variable containing, for each language, a regexp for each of: open double-quote, close double-quote, open single-quote, close single-quote, and maybe mid-word apostrophe. Odds are these regexps are going to be the same for just about all languages (the regexps detecting them, mind you), so there should probably be some sort of default that the alist can just reference. A language should also be allowed to define other quote regexps in its list too. We need these to be ordered, with a standard set, so that we can have...

2) for each *exporter* (including on-screen display), a variable that defines, for each language, what the *substitution* will be for open-double-quote, close-double-quote, etc. Other extras can be defined too. That way we can have an exporter-independent way to detect quotes to be smartified, but each exporter has its own way to smartify them.

3) Since most exporters will probably handle the process in approximately the same way (match the regexp, stick in the associated substitution), org-export.el should have a generic function that does this, which each exporter *may* call in (or as) the quote-smartifier in its text translator, unless it needs something more specific, which it can provide itself.
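
The three pieces above could fit together roughly as follows. This is only a sketch under stated assumptions: it is written in Python rather than Elisp for brevity, every name in it (`QUOTE_REGEXPS`, `HTML_QUOTES`, `LATEX_QUOTES`, `smartify_quotes`) is hypothetical, and the detection regexps are illustrative guesses rather than a tested rule set. The French table simply follows the remark about guillemets at both levels.

```python
import re

# Everything below is hypothetical; none of these names exist in Org.

# (1) Ordered detection rules, shared by every language unless overridden.
DEFAULT_QUOTE_REGEXPS = [
    ("open_double", r'(?<!\w)"(?=\S)'),
    ("close_double", r'(?<=\S)"(?!\w)'),
    ("apostrophe", r"(?<=\w)'(?=\w)"),
    ("open_single", r"(?<!\w)'(?=\S)"),
    ("close_single", r"(?<=\S)'(?!\w)"),
]
# A language with special needs would add its own entry here.
QUOTE_REGEXPS = {"default": DEFAULT_QUOTE_REGEXPS}

# (2) One substitution table per exporter, keyed by language.
HTML_QUOTES = {
    "en": {"open_double": "&ldquo;", "close_double": "&rdquo;",
           "open_single": "&lsquo;", "close_single": "&rsquo;",
           "apostrophe": "&rsquo;"},
    "fr": {"open_double": "&laquo;&nbsp;", "close_double": "&nbsp;&raquo;",
           "open_single": "&laquo;&nbsp;", "close_single": "&nbsp;&raquo;",
           "apostrophe": "&rsquo;"},
}
LATEX_QUOTES = {
    "en": {"open_double": "``", "close_double": "''",
           "open_single": "`", "close_single": "'", "apostrophe": "'"},
}

# (3) Generic helper an exporter may call from its text translator.
def smartify_quotes(text, language, substitutions, fallback="en"):
    rules = QUOTE_REGEXPS.get(language, QUOTE_REGEXPS["default"])
    table = substitutions.get(language, substitutions[fallback])
    # One alternation, tried in rule order, so that text already
    # substituted (e.g. LaTeX's '') is never rescanned for quotes.
    combined = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in rules))
    return combined.sub(lambda m: table[m.lastgroup], text)
```

For example, `smartify_quotes("\"don't\"", "en", LATEX_QUOTES)` returns ``` ``don't'' ```. Doing the substitution in a single pass, rather than one regexp at a time, is a design choice of this sketch: it keeps one rule's output from being matched by a later rule.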

In terms of what is handled, the idea in my head is that we would expect the writer to be using " or ' to surround their quotes, regardless of what their native custom is (if they're already typing their language-specific quote-marks, we don't need to bother with all this anyway). The goal is to handle either "quotes" or 'quotes' in either nesting (or no nesting, if someone does "quote' for some reason), and with any luck not get too confused by other uses of the apostrophe.

It makes sense to me, but I bet I explained it badly and people are going to have all kinds of issues with it. :)

No telling when (if?) I'll be able to produce something along these lines, but it's something to start thinking about anyway.

~mark


