Re: [lmi] Unknown fields in table input text files
From: Greg Chicares
Subject: Re: [lmi] Unknown fields in table input text files
Date: Sat, 20 Feb 2016 04:12:25 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0

On 2016-02-20 03:16, Vadim Zeitlin wrote:
>
> I decided to extend my tests checking that all tables in qx_ins and qx_cso
> databases survive the round trip through the new table code to also do the
> same for the tables in qx_ann and got several failures due to the presence
> of unknown "fields" in some of the tables here.
If it's not too difficult, could you share that with me, so that I can
run the same test against all proprietary tables? It doesn't have to be
polished; I would only need to run it once.
> One of them looks like a real field as it's present in several files: it's
> the "Editor: " one. I don't know at all what to do about it as there is no
> corresponding field in the binary format, so there doesn't seem to be any
> way to store the value of this field in it.
Please tell me the number of a 'qx_ann' table that has this field so that
I can examine it. I don't remember ever seeing "Editor:" in these files.
> Another one is not a field at all, but just something that looks like one: a
> couple of tables have lines starting with "WARNING: " in their
> "Construction method" description. I'm not sure what to do about this one
> either: should I specifically make an exception for this word? Or ignore
> any unknown "fields"? The latter seems dangerous, as typos in the field
> names could go unnoticed. The ideal would be to have some way to escape the
> colon, e.g. by doubling it, but even if I introduced support for this in
> the new code, it still wouldn't be able to deal with the text files
> produced by the old version.
I have two suggestions:
(1) Build a whitelist of header names, and reject anything not on the list.
I imagine that this list will be short; I thought they were enumerated
in the 1990s code, and perhaps also in the HLP or GID documentation.
(2) Use a regex like /[A-Za-z0-9]* *[A-Za-z0-9]*:/ on the assumption that
header names consist of one or two words followed by a colon. Deem any
colon that occurs later in the line to be content rather than markup.
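
The two suggestions above can be sketched roughly as follows. This is only
an illustration in Python, not lmi code: the function names are made up,
the whitelist contains just "Construction method" from this thread plus two
placeholder names, and the regex is a slightly tightened variant of the one
above (anchored at the start of the line, and requiring at least one
character in the name), which is an assumption on my part.

```python
import re

# Hypothetical whitelist. Only "Construction method" comes from this
# thread; the other two names are placeholders for illustration.
KNOWN_HEADERS = {"Construction method", "Table number", "Table name"}

# One or two alphanumeric words followed by a colon, at the start of the
# line. Everything after that first colon is captured as content, so any
# later colon is treated as text rather than markup.
HEADER_RE = re.compile(r"^([A-Za-z0-9]+(?: [A-Za-z0-9]+)?):\s*(.*)$")


def split_header(line):
    """Return (name, content) if the line looks like a header, else None."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_known_header(name):
    """Suggestion (1): accept only names that are on the whitelist."""
    return name in KNOWN_HEADERS


if __name__ == "__main__":
    for text in (
        "Construction method: see note: below",
        "WARNING: these rates are illustrative",
        "an ordinary line of description",
    ):
        parsed = split_header(text)
        known = parsed is not None and is_known_header(parsed[0])
        print(repr(text), parsed, known)
```

Note that "WARNING: " still matches the header pattern, so the regex by
itself does not settle that case; it is the whitelist that distinguishes it
from a real field. What to do with a header-like name that is not on the
list (report an error, or treat the line as content) remains a policy
decision for the caller.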