emacs-devel

Re: html2text


From: Reiner Steib
Subject: Re: html2text
Date: Tue, 09 Nov 2004 23:44:24 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)

On Mon, Nov 08 2004, Reiner Steib wrote:

> [ The suggested patch from Jari's original message was: ]
>
> --8<---------------cut here---------------start------------->8---
> --- html2text.el.7.10 2004-11-06 17:20:46.000000000 +0200
> +++ html2text.el      2004-11-06 17:41:12.000000000 +0200
> @@ -42,8 +42,42 @@
>  (defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
>
>  (defvar html2text-replace-list
> -  '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
> -    ("&amp;" . "&") ("&apos;" . "'"))
> +  '(("&acute;" . "`")

This should be "´".

> +    ("&amp;" . "&")
> +    ("&apos;" . "'")
> +    ("&brvbar;" . "|")
> +    ("&cent;" . "c")
> +    ("&circ;" . "^")
> +    ("&copy;" . "(C)")
> +    ("&curren;" . "¤")
> +    ("&deg;" . "degree")
> +    ("&divide;" . "/")
> +    ("&euro;" . "e")
> +    ("&frac12;" . "½")
[...]

It seems strange to use Latin-1 characters for some entities but not
for all of those that Latin-1 can encode.

On second thought, it looks like there are already more or less
complete lists[1], e.g. in `mm-url-html-entities' (from Gnus),
`sgml-char-names', `sgml-char-names-table', `iso-iso2sgml-trans-tab'
(Emacs) or `w3m-entity-alist' (emacs-w3m).

Probably one of these could be used.  Hm, maybe the function
`iso-sgml2iso' could be used in `html2text.el'?
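
To illustrate the table-driven replacement being discussed, here is a
minimal sketch.  The names `my-html-entity-alist' and
`my-html-replace-entities' are hypothetical and not part of
`html2text.el'; the table is only a short excerpt, with Latin-1
characters used consistently.  It does not reuse any of the existing
tables named above, whose formats may differ from this alist.

```elisp
;; Hypothetical sketch; none of these names exist in html2text.el.
(defvar my-html-entity-alist
  '(("&nbsp;" . " ")
    ("&gt;" . ">")
    ("&lt;" . "<")
    ("&quot;" . "\"")
    ("&apos;" . "'")
    ("&acute;" . "´")
    ("&curren;" . "¤")
    ("&frac12;" . "½")
    ;; Keep "&amp;" last so that "&amp;lt;" becomes "&lt;" and not "<".
    ("&amp;" . "&"))
  "Excerpt of an entity table mapping entity strings to replacements.")

(defun my-html-replace-entities (string)
  "Return STRING with each entity in `my-html-entity-alist' replaced."
  (with-temp-buffer
    (insert string)
    ;; Entity names are case-sensitive, so do not fold case.
    (let ((case-fold-search nil))
      (dolist (pair my-html-entity-alist)
        (goto-char (point-min))
        (while (search-forward (car pair) nil t)
          ;; FIXEDCASE and LITERAL: insert the replacement verbatim.
          (replace-match (cdr pair) t t))))
    (buffer-string)))
```

Swapping in a more complete table would only change the alist, not
the replacement loop.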

Bye, Reiner.

[1] Might be checked with
    http://www.w3.org/TR/REC-html40/sgml/entities.html or other
    tables.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/



