Re: Emacs i18n
From: Mattias Engdegård
Subject: Re: Emacs i18n
Date: Thu, 28 Mar 2019 12:03:26 +0100
On 27 March 2019 at 22:22, Juri Linkov <address@hidden> wrote:
>
> I tried ‘regexp-opt’ and it generates a ready-to-use regexp:
>
> (replace-regexp-in-string
> "%d" "\\\\([0-9]+\\\\)"
> (regexp-opt '("finished with %d match found"
> "finished with %d matches found"
> "finished with no matches found")))
>
> ⇒ "\\(?:finished with \\(?:\\(?:\\([0-9]+\\) match\\(?:es\\)?\\|no matches\\) found\\)\\)"
Well now. There is no guarantee that regexp-opt won't split the %d: it factors
common prefixes and suffixes character by character and knows nothing about
format directives. Format strings must be parsed left-to-right for
correctness¹. I'm still skeptical, but if you really want to give this a try,
then first segment the format string:
"Today %d little piggies built %03o houses and said '%s'."
"Today %d little piggy built %o house and said '%s'."
=>
("Today " ?d " little piggies built " ?o " houses and said '" ?s "'.")
("Today " ?d " little piggy built " ?o " house and said '" ?s "'.")
leaving the format placeholders as atomic entities (here shown as characters,
but you may need more information there).
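The segmentation step could be sketched roughly as follows. This is only an illustration: `my-i18n-format-spec-rx` and `my-i18n-segment-format` are hypothetical names, not existing Emacs functions, and the flag set in the regexp is my assumption about which flags need covering. Each directive is reduced to its conversion character, so width, precision and field number are dropped.

```elisp
(require 'rx)

;; Hypothetical helper names; nothing here is part of Emacs itself.
(defconst my-i18n-format-spec-rx
  (rx "%"
      (opt (1+ digit) "$")            ; field number, as in %2$s
      (0+ (any ?- ?+ ?\s ?# ?0))      ; flags (assumed set)
      (0+ digit)                      ; width
      (opt "." (0+ digit))            ; precision
      (group (any "%sdioxXefgcS")))   ; conversion character
  "Regexp matching one format directive; group 1 is the conversion.")

(defun my-i18n-segment-format (fmt)
  "Split FMT into literal strings and conversion characters.
Scans left to right, so a directive is never cut in the middle."
  (let ((case-fold-search nil)        ; keep %S distinct from %s
        (pos 0)
        (segments '()))
    (while (string-match my-i18n-format-spec-rx fmt pos)
      (let ((start (match-beginning 0))
            (end (match-end 0))
            (conv (aref fmt (match-beginning 1))))
        ;; Literal text between the previous directive and this one.
        (when (> start pos)
          (push (substring fmt pos start) segments))
        (push conv segments)
        (setq pos end)))
    ;; Trailing literal text after the last directive.
    (when (< pos (length fmt))
      (push (substring fmt pos) segments))
    (nreverse segments)))

(my-i18n-segment-format
 "Today %d little piggies built %03o houses and said '%s'.")
;; ⇒ ("Today " ?d " little piggies built " ?o " houses and said '" ?s "'.")
```

Note that "%%" comes out as the atom ?%, so whatever consumes the list has to turn it back into a literal percent sign.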
Then run your favorite diff algorithm on the result. Most important for
performance is prefix merging; anything else is just to make the regexp smaller.
Here, prefix and suffix merging would leave you with (still in abstract form)
("Today " ?d " little pigg"
(("ies built " ?o " houses")
("y built " ?o " house"))
" and said '" ?s "'.")
From there you can either recursively try to find more common subsequences, or
call it a day and render it into a regexp:
"Today -?[0-9]+ little pigg\\(?:ies built -?[0-7]+ houses\\|y built -?[0-7]+ house\\) and said '\\(?:.\\|\n\\)*'\\."
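For two segmented strings, the merging and rendering steps might look like the sketch below. All `my-i18n-*` names are hypothetical, only a single prefix/suffix pass is done (no recursive search for further common subsequences), and the mapping from conversion character to regexp fragment is a rough assumption: anything other than the integer conversions is simply matched as arbitrary text. Literal text goes through `regexp-quote`, so the final period is escaped.

```elisp
;; All my-i18n-* names are hypothetical helpers, not part of Emacs.

(defun my-i18n--explode (segments)
  "Turn SEGMENTS into tokens: one-character strings and placeholder chars."
  (mapcan (lambda (seg)
            (if (stringp seg)
                (mapcar #'char-to-string seg)
              (list seg)))
          segments))

(defun my-i18n--implode (tokens)
  "Join adjacent one-character strings in TOKENS back into literals."
  (let ((out '()))
    (dolist (tok tokens)
      (if (and (stringp tok) (stringp (car out)))
          (setcar out (concat (car out) tok))
        (push tok out)))
    (nreverse out)))

(defun my-i18n--common-length (a b)
  "Return the number of leading tokens that lists A and B share."
  (let ((n 0))
    (while (and a b (equal (car a) (car b)))
      (setq n (1+ n) a (cdr a) b (cdr b)))
    n))

(defun my-i18n-merge (segs-a segs-b)
  "Factor the common prefix and suffix out of SEGS-A and SEGS-B.
The nested list in the result holds the two remaining alternatives."
  (let* ((a (my-i18n--explode segs-a))
         (b (my-i18n--explode segs-b))
         (p (my-i18n--common-length a b))
         (rest-a (nthcdr p a))
         (rest-b (nthcdr p b))
         ;; The suffix is searched only in what the prefix left over.
         (s (my-i18n--common-length (reverse rest-a) (reverse rest-b))))
    (append (my-i18n--implode (butlast a (- (length a) p)))
            (list (list (my-i18n--implode (butlast rest-a s))
                        (my-i18n--implode (butlast rest-b s))))
            (my-i18n--implode (last rest-a s)))))

(defun my-i18n--spec-regexp (conv)
  "Return a regexp fragment for conversion character CONV (assumed mapping)."
  (pcase conv
    ((or ?d ?i) "-?[0-9]+")
    (?o "-?[0-7]+")
    ((or ?x ?X) "-?[0-9a-fA-F]+")
    (?% "%")
    (_ "\\(?:.\\|\n\\)*")))           ; %s and everything else: any text

(defun my-i18n-render (form)
  "Render the abstract FORM into a regexp string."
  (mapconcat
   (lambda (item)
     (cond ((stringp item) (regexp-quote item))
           ((integerp item) (my-i18n--spec-regexp item))
           ;; Otherwise ITEM is a list of alternatives, each a form.
           (t (concat "\\(?:"
                      (mapconcat #'my-i18n-render item "\\|")
                      "\\)"))))
   form ""))

(my-i18n-render
 (my-i18n-merge
  '("Today " ?d " little piggies built " ?o " houses and said '" ?s "'.")
  '("Today " ?d " little piggy built " ?o " house and said '" ?s "'.")))
;; ⇒ "Today -?[0-9]+ little pigg\\(?:ies built -?[0-7]+ houses\\|y built -?[0-7]+ house\\) and said '\\(?:.\\|\n\\)*'\\."
```

Tokens are compared one character at a time, but a placeholder is always a single token, so the merge can stop inside "piggies"/"piggy" without ever cutting a directive.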
All this will need to be done at run-time, since it is run on translated
strings.
¹ To match format parameters, try something like
(rx "%"
    (opt (1+ digit) "$")          ; field number
    (0+ (any ?- ?+ ?\s ?# ?0))    ; flags
    (0+ digit)                    ; width
    (opt "." (0+ digit))          ; precision
    (any "%sdioxXefgcS"))         ; conversion