bug-recutils

Re: Additional "in" operator for fields being lists of strings


From: Marcin Szewczyk
Subject: Re: Additional "in" operator for fields being lists of strings
Date: Fri, 31 Jul 2020 11:35:25 +0200
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Jul 30, 2020 at 10:43:59PM +0200, Jose E. Marchesi wrote:
> Marcin Szewczyk <marcin.szewczyk@wodny.org> wrote:
> > One question comes to mind. Should the != operator mean:
> > 1. at least one enum value different from the specified value, or
> > 2. none of enum tokens may be equal to the specified value.
> > [...]
> > Should a normalization step be taken?
> > Like:
> >
> >     Device: plumbus
> >     Tag: plubus
> >     Tag: dinglebop fleeb
> >     Tag: grumbo
> >
> > to (only for enum fields):
> >
> >     Device: plumbus
> >     Tag: plubus dinglebop fleeb grumbo
> 
> I would say we clearly want 2. for the semantics of != when applied to
> enumerated fields.  Normalizing is indeed necessary.
> 
> > Currently, I cannot see any exclusion operator. For multi-field strings
> > neither 'Y!="y3"' nor '!(Y="y3")' will exclude a record if there is any
> > Y field that matches these conditions. So the second semantic variant
> > would give something new and interesting but also incompatible with the
> > current string semantics. [...]
> 
> Hmm, I don't think we need to keep the existing string semantics for
> enums, because in properly conformed data each Tag is restricted to
> have only one of the valid values, i.e.:
> 
> --- foo.rec ---
> %rec: Device
> %type: Tag enum dinglebop fleeb plubus grumbo
> 
> Device: plumbus
> Tag: dinglebop fleeb plubus grumbo
> --- end of foo.rec ---
> 
> $ recfix foo.rec
> foo.rec:5: error: invalid enum value.

But if the user has always used the properly structured variant
(accepted by recfix), i.e.:

--- foo.rec ---
%rec: Device
%type: Tag enum dinglebop fleeb plubus grumbo

Device: plumbus
Tag: dinglebop
Tag: fleeb
Tag: plubus
Tag: grumbo
--- end of foo.rec ---

then, under semantics 2, executing `recsel -e 'Tag != "fleeb"' foo.rec`
would change its output from returning the record to returning nothing
(assuming that both expanded and non-expanded forms should mean the same
thing).
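
To make the difference between the two readings of `!=` concrete, here
is a minimal sketch in Python. It is not recutils code: the names
`tokens`, `ne_any` and `ne_none` are made up for illustration, and a
record's Tag fields are simply modelled as a list of strings.

```python
def tokens(fields):
    """Flatten field values into single enum tokens (split on whitespace)."""
    return [tok for value in fields for tok in value.split()]

def ne_any(fields, value):
    """Variant 1: at least one token differs from value."""
    return any(tok != value for tok in tokens(fields))

def ne_none(fields, value):
    """Variant 2: none of the tokens equals value."""
    return all(tok != value for tok in tokens(fields))

# The same data in expanded (MFSV) and non-expanded (SFMV) form.
expanded = ["dinglebop", "fleeb", "plubus", "grumbo"]
collapsed = ["dinglebop fleeb plubus grumbo"]

print(ne_any(expanded, "fleeb"))    # variant 1 keeps the record
print(ne_none(expanded, "fleeb"))   # variant 2 excludes it
```

Because both variants work on the normalized token list, the expanded
and non-expanded forms give the same answer in this model.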

Which form of using multiple enum values should be canonical:
- SFMV: single field with multiple values (non-expanded) or
- MFSV: multiple fields with single values (expanded)?

Implementing SFMV would probably be quite easy and based on strtok() in
the ops switch.
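
Roughly what I have in mind for SFMV, sketched in Python rather than C
for brevity: `str.split()` stands in for the strtok() loop, and
`sfmv_contains` is a hypothetical helper name, not an existing librec
function.

```python
def sfmv_contains(field_value, token):
    """True if token is one of the whitespace-separated values of the field."""
    # Tokenize first; a plain substring search would wrongly match
    # "bop" inside "dinglebop".
    return token in field_value.split()

print(sfmv_contains("plubus dinglebop fleeb grumbo", "fleeb"))
print(sfmv_contains("plubus dinglebop fleeb grumbo", "bop"))
```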

The MFSV form would probably require a substantial change to the
`rec_sex_eval` implementation[1] to give the `!=` operator semantics 2.

Do you think that normalization should be:
- explicit and applied permanently, e.g. by recfix, or
- implicit and calculated just for SEX evaluation?

Or maybe a trick should be implemented:
- official representation of serialized records should be MFSV (explicit
  normalization) but
- for ease of implementation internal representation used for SEX
  evaluation should be SFMV (implicit normalization)?
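
The trick amounts to converting between the two forms. A minimal sketch
in Python, assuming a record is modelled as a list of (name, value)
pairs and that the set of enum field names is known; `to_mfsv` and
`to_sfmv` are hypothetical names, not part of recutils.

```python
def to_mfsv(fields, enum_names):
    """Expand enum fields so that each holds exactly one token."""
    out = []
    for name, value in fields:
        if name in enum_names:
            out.extend((name, tok) for tok in value.split())
        else:
            out.append((name, value))
    return out

def to_sfmv(fields, enum_names):
    """Collapse each enum field into one field at its first position."""
    merged = {}  # enum field name -> list of tokens
    order = []   # output slots: a (name, value) pair or an enum field name
    for name, value in fields:
        if name in enum_names:
            if name not in merged:
                merged[name] = []
                order.append(name)
            merged[name].extend(value.split())
        else:
            order.append((name, value))
    return [(slot, " ".join(merged[slot])) if isinstance(slot, str) else slot
            for slot in order]

record = [("Device", "plumbus"), ("Tag", "plubus"),
          ("Tag", "dinglebop fleeb"), ("Tag", "grumbo")]
print(to_sfmv(record, {"Tag"}))
print(to_mfsv(record, {"Tag"}))
```

Only fields listed in `enum_names` are touched, so ordinary string
fields containing spaces are left alone.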

One more thing that comes to mind: should access to enum tokens
(multiple values per field), if allowed, be implemented on top of
`rec_record_get_field_by_name`, or should `rec_record_get_field_by_name`
itself be capable of indexing in the following implicit normalization
manner:

    Tag: plubus
    Tag: dinglebop fleeb
    Tag: grumbo

    Tag[0] = plubus
    Tag[1] = dinglebop
    Tag[2] = fleeb
    Tag[3] = grumbo
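
The indexing above could behave like the following Python model. This is
only a sketch of the proposed semantics, not the actual librec function:
`get_token_by_name` is a hypothetical name and the record is again a
list of (name, value) pairs.

```python
def get_token_by_name(fields, name, index):
    """Return the index-th token among all fields called name, or None."""
    seen = 0
    for field_name, value in fields:
        if field_name != name:
            continue
        # Each whitespace-separated token counts as its own indexed value.
        for tok in value.split():
            if seen == index:
                return tok
            seen += 1
    return None

record = [("Tag", "plubus"), ("Tag", "dinglebop fleeb"), ("Tag", "grumbo")]
for i in range(4):
    print("Tag[%d] = %s" % (i, get_token_by_name(record, "Tag", i)))
```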



[1]: Not all combinations checked when more than one multiple-value
     field present
     https://lists.gnu.org/archive/html/bug-recutils/2020-07/msg00004.html

-- 
Marcin Szewczyk
http://wodny.org


