
Re: [Bug-gnupedia] Peace In Our Time


From: Joakim Ziegler
Subject: Re: [Bug-gnupedia] Peace In Our Time
Date: Wed, 24 Jan 2001 22:55:40 -0600

On Wed, Jan 24, 2001 at 08:30:23PM -0700, Mike Warren wrote:
> "Dan Geiser" <address@hidden> writes:
>> David Tanzer wrote:

>>>I like the decision to use XML as a submission format.

>> I can't reiterate this enough but, I truly dislike this idea.
>> Assuming we honestly want to have an open submission policy the
>> easiest thing for me to keep in mind when following that policy
>> concept is "Can my Mom do this?".
 
> We can always make conversion programs, which is (I suspect) what
> David meant. Submitting in plain text and passing it to a Python
> script to massage into XML/TEI shouldn't be too much work.

Adding detailed structural markup to plain text programmatically is almost
impossible (I'd say a zero percent error rate is outright impossible). There
are simply too many ambiguities to resolve without a huge set of heuristics,
and even then some cases will be guessed wrong. I point you to the page "A
Short Example" in the TEI consortium's tutorial on TEI for an enlightening
example of this (in fact, they specifically state that TEI should be used
since plain text is impossible for a computer to read properly):

http://www.tei-c.org/Lite/U5-eg.html

So it's most likely necessary to make a tradeoff here between the quality of
the information and the threshold of competence required to submit articles.
Of course, there's nothing keeping us from receiving articles in plain text
and marking them up by hand, provided we have volunteers to do this. With
decent tools, that's certainly not too much work. But it can't be done
automatically.
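
To make the ambiguity concrete, here is a minimal Python sketch of the kind of
conversion script suggested above. Everything in it is an assumption made for
illustration: the function name naive_markup, the sample text, and the single
heading rule are invented here and are not taken from the TEI guidelines, and
the <head>/<p> output is only TEI-like.

```python
import re
from xml.sax.saxutils import escape


def naive_markup(text):
    """Tag each blank-line-separated block as <head> or <p> with one crude rule."""
    elements = []
    for block in re.split(r"\n\s*\n", text.strip()):
        # Collapse internal whitespace so a wrapped paragraph becomes one line.
        line = " ".join(block.split())
        # Hypothetical heuristic: a short, single-line block that does not end
        # in a period is treated as a heading.
        is_head = (
            "\n" not in block.strip()
            and len(line) < 40
            and not line.endswith(".")
        )
        tag = "head" if is_head else "p"
        elements.append(f"<{tag}>{escape(line)}</{tag}>")
    return elements


# Made-up sample: two real headings, one sentence, one line of dialogue.
sample = """Chapter 1

It was a dark and stormy night.

Who goes there?

The End"""

for element in naive_markup(sample):
    print(element)
```

The third block is a line of dialogue, yet the rule tags it as a heading just
like "Chapter 1" and "The End". Tightening the rule to catch this case tends
to break another one, which is why I'd still want a human pass over the result.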

-- 
Joakim Ziegler - Ximian web monkey - address@hidden - address@hidden
  FIX sysop - free software coder - FIDEL & Conglomerate developer
         http://www.avmaria.com/ - http://www.ximian.com/


