Re: gutenberg-coding.el -- coding system for Project Gutenberg files
From: Kevin Ryde
Subject: Re: gutenberg-coding.el -- coding system for Project Gutenberg files
Date: Fri, 28 Oct 2005 09:57:59 +1000
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)
"Richard M. Stallman" <address@hidden> writes:
>
> We could do this only for files called .txt, I suppose.
> That would not eliminate false matches, but would limit them.
I realized that the only files needing to be matched are the non-ascii
ones with a charset spec. Doh. So I think the test can be for a file
starting with one of
"Project Gutenberg "
"Project Gutenberg's "
"The Project Gutenberg "
"**This is a COPYRIGHTED Project Gutenberg "
optionally preceded by the bytes
0xEF 0xBB 0xBF
(the UTF-8 byte order mark), which appear in some (but not all) utf-8
files. This is tighter than just "Project Gutenberg" anywhere in the
first line. New diff below.
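As an illustration only, the test described above can be sketched outside Emacs in Python. This is not the code in the attached diff; the names `UTF8_BOM`, `GUTENBERG_PREFIXES` and `looks_like_gutenberg` are hypothetical, and only the four prefixes and the three marker bytes come from this message.

```python
# Hypothetical names; the prefixes and marker bytes are the ones listed above.
UTF8_BOM = b"\xef\xbb\xbf"

GUTENBERG_PREFIXES = (
    b"Project Gutenberg ",
    b"Project Gutenberg's ",
    b"The Project Gutenberg ",
    b"**This is a COPYRIGHTED Project Gutenberg ",
)


def looks_like_gutenberg(head: bytes) -> bool:
    """Return True if the raw start of a file matches the proposed test."""
    # Skip the optional UTF-8 marker bytes before comparing.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    # bytes.startswith accepts a tuple and succeeds if any prefix matches.
    return head.startswith(GUTENBERG_PREFIXES)
```

Working on raw bytes rather than decoded text mirrors the situation in the message, where the check has to run before the coding system is known.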
I put just "..." to match the three marker bytes. I'd like to put
those exactly, but it will be matched against a unibyte buffer (if I'm
not mistaken), and I'm unsure how to give a unibyte string literal, or
a multibyte which will match correctly. (A couple of things I tried
didn't work.)
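To show what is lost by using "..." rather than the exact bytes, here is a rough Python comparison on byte strings. It says nothing about how to write the literal in Emacs Lisp, which is the open question above. The pattern names are hypothetical, and it assumes the three-byte prefix is optional in both patterns.

```python
import re

# The four prefixes from this message, written as one bytes regexp.
PREFIXES = (
    rb"(?:Project Gutenberg(?:'s)? "
    rb"|The Project Gutenberg "
    rb"|\*\*This is a COPYRIGHTED Project Gutenberg )"
)

# Loose form: any three bytes (other than newline) may precede the prefix.
LOOSE = re.compile(rb"\A(?:...)?" + PREFIXES)

# Exact form: only the bytes 0xEF 0xBB 0xBF may precede the prefix.
EXACT = re.compile(rb"\A(?:\xef\xbb\xbf)?" + PREFIXES)

# Three arbitrary bytes before the prefix: accepted by LOOSE only.
sample = b"abcProject Gutenberg Etext"
loose_hit = LOOSE.match(sample) is not None
exact_hit = EXACT.match(sample) is not None
```

The loose form admits a few extra false matches, but since the prefix itself is still anchored right after those three bytes, the difference is small in practice.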
Attachment: mule.el.gutenberg-2.diff (text document)