octave-maintainers

Re: XML tools for Octave


From: Andy Adler
Subject: Re: XML tools for Octave
Date: Thu, 29 Jun 2006 15:20:48 -0400 (EDT)

On Thu, 29 Jun 2006, John W. Eaton wrote:

> On 29-Jun-2006, Andy Adler wrote:
>
> | So I don't think that pushing all the complexity to the user is right.
> | Somehow it should be easy to do easy things, but possible to do
> | correct things.
>
> I really don't know much about XML, but maybe XML is just the wrong
> solution (generally, not just for Octave)?  But I guess it is probably
> too late for that discussion as so many jumped for XML because it had
> the right buzz to it and now there are many XML things that we would
> like to be able to handle with Octave.

I did a tongue-in-cheek talk about this in 2003 (for Perl)
http://www.site.uottawa.ca/~adler/talks/2003/YAPC-CA-2003-XML-talk.pdf

For Octave to be a good tool for munging data, we need to be
able to load data from lots of different sources. The fact is that
today much data is in XML. Whether it was a good idea to put it
in XML is irrelevant.

OTOH, I would certainly rather hack data out of an
XML file than out of a proprietary binary format.

> Or, if XML is the right solution, then can you (briefly, one or two
> sentences) explain why it is the right solution, and also why it seems
> to be so difficult to use correctly?

My best quote about what's good about XML is (from slashdot):

  Re:But XML is great for computers... (Score:5, Insightful) by Ed Avis (5917)
  <address@hidden> on Tuesday March 18, @08:48AM (#5535893)
  >You mean like most other non-xml config files in /etc, like say hosts, DNS
  >zone files, named.conf, passwd/shadow, hosts.allow/deny, sendmail.mc or
  >resolv.conf (etc. etc.)? These have standard layouts, text-based, can be
  >edited by hand and can be easily parsed.

  You just gave the best argument for adopting XML as widely as possible.
  Yes, all these can be parsed (with the possible exception of sendmail's
  config files which may be Turing-complete) but they all require *different*
  code for each config file. If they were in XML you'd still need different
  semantic code, of course, but a whole wodge of syntax issues (how do I
  quote strings, how do I escape newlines, how do I mark nested scopes,
  what happens when the string delimiter character occurs inside a string,
  how do I deal with comments, what is the character set, is there a formal
  grammar for the document, etc etc) would be dealt with. But they
  would be dealt with *once*. No need to learn a new or
  almost-the-same-but-slightly-different set of syntactic conventions for
  every single config file.
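
As a rough illustration of the "dealt with once" point, here is a small sketch in Python (not Octave) using the standard xml.etree.ElementTree module. The `dump` and `load` helpers and the field names are made up for this example; the point is only that quoting, escaping and newlines are handled by the library rather than by per-format code.

```python
import xml.etree.ElementTree as ET


def dump(tag, fields):
    """Serialize a flat dict as <tag><name>value</name>...</tag>.

    Hypothetical helper: the library does all escaping of &, < and >.
    """
    root = ET.Element(tag)
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def load(text):
    """Parse the output of dump() back into a flat dict."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


# Values that would need ad hoc quoting rules in a hand-made format.
host = {"alias": 'a "quoted" <name> & more', "note": "line one\nline two"}
zone = {"comment": "# not a comment here", "owner": "René <admin>"}

# The same two functions round-trip both "config files" unchanged.
assert load(dump("host", host)) == host
assert load(dump("zone", zone)) == zone
```

The semantic code (what an `alias` or an `owner` means) would still differ per file, as the quote says; only the syntax layer is shared.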

XML is difficult because:
  - It was invented to mark up text, not data. This forced lots of syntactic
    choices that do not map onto programming languages.

  - XML looks easy, but has many subtleties. Thus many people write hand
    parsers that mostly, but not completely, work.

  - Most raw parser APIs (DOM, SAX) suck. I mean that they are way too
    complicated and too low level.

  - Many XML-based standards are bloated and buggy. For example SOAP, the
    XML standard for RPC in .NET, *requires* invalid XML.
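
To make the "hand parsers that mostly work" point concrete, here is a minimal sketch in Python (again, not Octave). The sample document and the function names `naive_values` and `parsed_values` are invented for illustration; the regex stands in for a typical quick hand-written parser and is compared against the standard xml.etree.ElementTree parser.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample: a comment, entities, a CDATA section, and an attribute.
DOC = """<data>
  <!-- <value>commented out</value> -->
  <value>1 &lt; 2 &amp; 3</value>
  <value><![CDATA[a <b> c]]></value>
  <value unit='m'>42</value>
</data>"""


def naive_values(text):
    """Hand-rolled 'parser': grab whatever sits between <value> and </value>."""
    return re.findall(r"<value>(.*?)</value>", text)


def parsed_values(text):
    """Real parser: skips comments, decodes entities and CDATA, allows attributes."""
    root = ET.fromstring(text)
    return [v.text for v in root.findall("value")]


print(naive_values(DOC))
print(parsed_values(DOC))
```

On this input the regex version returns the commented-out value, leaves the entities and the CDATA wrapper undecoded, and misses the element that carries an attribute. The library parser returns the three real values, "1 < 2 & 3", "a <b> c" and "42".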

--
Andy Adler <address@hidden> 1(613)562-5800x6218


