Re: [bug-recutils] Pending stuff to add to the manual


From: Julio Matus
Subject: Re: [bug-recutils] Pending stuff to add to the manual
Date: Mon, 12 Nov 2012 23:26:29 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

I understand your point, and would normally agree with you. But just as
you were taught "data are" at school, I've heard "data is" all my life.
I don't think it's fair to call it a mistake, as it's in the dictionary:
http://oxforddictionaries.com/definition/english/data?q=data (quote from
here)
> In modern non-scientific use, however, it is generally not treated as a
> plural. Instead, it is treated as a mass noun, similar to a word like 
> information, which takes a singular verb. Sentences such as
> data was collected over a number of years
> are now widely accepted in standard English.

It's just modern English (not "wrong" English). As Wikipedia points out,
"Data is most often used as a singular mass noun in educated everyday
usage", and it is used like that in newspapers too.
I'm sorry we disagree on this point.

> I'll leave it to the recutils maintainer to make his decision.
Agreed. More feedback from other developers would also be helpful.

Regards,
--
  Julio

John Darrington <address@hidden> writes:

> Language is a dynamic thing of course, and there are countless instances where
> formerly incorrect usage has since become canonicalised.  To me however, "data
> is" sounds just wrong, and when I read it, breaks my concentration.  Like you
> say, however, it is a common mistake and may well be on the way to becoming
> "standard" English.  I'm a conservative however and prefer what I was taught
> at school unless I see a good reason otherwise.
>
> I'll leave it to the recutils maintainer to make his decision.
>
> J'
>
> A similar mistake which has almost become canonicalised in computer manuals
> is the word "informations".  "Information", of course, is a collective noun,
> and therefore has no plural, but one often reads "these informations are used
> ...", - logical for a German - but hurts the ears of those who had to grow up
> on BBC English.
>
>
>
> On Mon, Nov 12, 2012 at 08:34:04PM +0900, Julio Matus wrote:
>      
>      Hello John,
>      
>        Thank you for taking the time to read my patch, and giving me your
>      opinion.
>      I respect you very much as a hacker, and native English speaker, but I'm
>      afraid I'll have to disagree with you on this one...
>      
>      For one, "data is" returns about 201,000,000 results, and "data are"
>      56,100,000, in google search at least. So I guess "data is" is more
>      widely used.
>      Although I do understand your point of "data" being the plural form of
>      "datum" in Latin, and traditional English. For me, and I guess at least
>      around 4/5 of the people reading the documentation (from the search
>      results above), "data is" sounds more natural.
>      
>      There's some discussion about this on the wikipedia page, and this
>      English forum for example:
>      
>      http://www.englishforums.com/English/WhatCorrectDataEnteredDataEntered/kvmdv/post.htm
>      Apparently both are acceptable, but "data is" is preferred when talking
>      about computer related data.
>      
>      So, as long as the documentation isn't written in a very strict
>      scientific fashion, I'll have to vote for "data is", as I think it fits
>      this documentation better, and sounds more natural in standard English.
>      
>      I'm sorry I can't agree with you on this one, but more discussion, or
>      other comments are most welcome.
>      --
>        Julio
>      
>      John Darrington <address@hidden> writes:
>      
>      > On Mon, Nov 12, 2012 at 12:49:24AM +0900, Julio Matus wrote:
>      >      
>      >      Hello Jose,
>      >      
>      >        I'll try to give you some help with these issues.
>      >      For the time being, I'm attaching a patch with some minor English
>      >      grammar / rephrasing changes to the current documentation.
>      >
>      > Sorry to be awkward, but this patch would actually cause the English grammar
>      > to be incorrect:
>      >
>      >      address@hidden The stored data are definitely not directly writable by humans.
>      >      address@hidden The stored data is not directly human readable.
>      >
>      > because "data" is the plural of "datum", and the conjugation of the verb "to be"
>      > in the 3rd person plural is "are".    So "the data are" is correct, "the data is" is not.
>      > (Think: one would *not* say "the words is not directly readable")
>      >
>      > Regards,
>      >
>      > John
>      >
>      >      
>      >
>      >      >From 23359bb28b8123106a2ce95e49e5e0cd304b7b16 Mon Sep 17 00:00:00 2001
>      >      From: Julio Claudio Matus Ramirez <address@hidden>
>      >      Date: Mon, 12 Nov 2012 00:27:02 +0900
>      >      Subject: [PATCH] English grammar and more natural sentences suggestions (till "Scalar types" description node)
>      >      
>      >      ---
>      >       doc/recutils.texi | 122 +++++++++++++++++++++++++++--------------------------
>      >       1 files changed, 62 insertions(+), 60 deletions(-)
>      >      
>      >      diff --git a/doc/recutils.texi b/doc/recutils.texi
>      >      index f13d95e..d961f7d 100644
>      >      --- a/doc/recutils.texi
>      >      +++ b/doc/recutils.texi
>      >      @@ -96,7 +96,7 @@ Indexes
>      >       @chapter Introduction
>      >       
>      >       GNU recutils is a set of tools and libraries to access human-editable,
>      >      -text-based databases called @emph{recfiles}.  The data are stored as a
>      >      +text-based databases called @emph{recfiles}.  The data is stored as a
>      >       sequence of records, each record containing an arbitrary number of
>      >       named fields.  Advanced capabilities usually found in other data
>      >       storage systems are supported: data types, data integrity (keys,
>      >      @@ -111,9 +111,9 @@ requirements.  Big systems having complex data storage requirements
>      >       will probably make use of some full-fledged relational system such as
>      >       MySQL or address@hidden  Less demanding applications, or applications
>      >       with special deployment requirements, may find it more convenient to
>      >      -use a simpler system such as SQLite, where the data are stored in a
>      >      +use a simpler system such as SQLite, where the data is stored in a
>      >       single binary file.  XML files are often used to store configuration
>      >      -settings for programs, and to encode data to be transmitted through
>      >      +settings for programs, and to encode data for transmission through
>      >       networks.
>      >       
>      >       So it looks like all the needs are covered by the existing
>      >      @@ -121,8 +121,8 @@ solutions @dots{} but consider the following characteristics of the
>      >       data storage systems mentioned in the previous paragraph:
>      >       
>      >       @table @minus
>      >      address@hidden The stored data are not directly readable by humans.
>      >      address@hidden The stored data are definitely not directly writable by humans.
>      >      address@hidden The stored data is not directly human readable.
>      >      address@hidden The stored data is definitely not directly writable by humans.
>      >       @item They are program dependent.
>      >       @item They are not easily managed by version control systems.
>      >       @end table
>      >      @@ -138,10 +138,10 @@ readable than address@hidden  The problem with YAML is that it was designed as a
>      >       usually found in programming languages.  That makes it too complex for
>      >       the simple task of storing plain lists of items.
>      >       
>      >      -Recfiles are human-readable, human-writable and still they are easy to
>      >      +Recfiles are human-readable, human-writable and still easy to
>      >       parse and to manipulate automatically.  Obviously they are not
>      >       suitable for any task (for example, it can be difficult to manage
>      >      -hierarchies in recfiles) and performance is somewhat sacrificed in
>      >      +hierarchies in recfiles) and performance is somewhat sacrified in
>      >       favor of readability.  But they are quite handy to store small to
>      >       medium simple databases.
>      >       
>      >      @@ -380,9 +380,9 @@ Age: 969
>      >       Any line having an @code{#} (ASCII 0x23) character in the first column
>      >       is a comment line.
>      >       
>      >      -Comment may be used to insert information that
>      >      -is not part of the database but useful otherwise.
>      >      -They are completely ignored by processing tools and can only ever be seen by
>      >      +Comments may be used to insert information that
>      >      +is not part of the database but useful in other ways.
>      >      +They are completely ignored by processing tools and can only be seen by
>      >       looking at the recfile itself. 
>      >       
>      >       It is also quite convenient to comment-out information from the
>      >      @@ -418,7 +418,7 @@ kind of markers:
>      >       
>      >       Unlike some file formats, comments in recfiles must be complete lines.
>      >       You cannot start a comment in the middle of a line.
>      >      -For example, in the following, the @code{#} does @emph{not} start a comment:
>      >      +For example, in the following record, the @code{#} does @emph{not} start a comment:
>      >       @example
>      >       Name: Peter the Great # Russian Tsar
>      >       Age: 53   
>      >      @@ -430,7 +430,7 @@ Age: 53
>      >       @cindex descriptor
>      >       Certain properties of a set of records can be specified by preceding
>      >       them with a @dfn{record descriptor}.  A record descriptor is itself a
>      >      -record, and uses fields with some predefined names to store the
>      >      +record, and uses fields with some predefined names to store
>      >       properties.  The most basic property that can be specified for a set
>      >       of records is their @dfn{type}.  The special field name @code{%rec} is
>      >       used for that purpose:
>      >      @@ -454,10 +454,10 @@ The effect of a record descriptor ends when another descriptor is
>      >       found in the stream of records.  This allows you to store different kinds
>      >       of records in the same database.  For example, consider you have to
>      >       maintain a depot.  You will need to keep records of both the current
>      >      -stockage and the movements.
>      >      +articles and their stock.
>      >       
>      >       The following example shows the usage of two record descriptors to
>      >      -store both kind of records: articles and movements.
>      >      +store both kind of records: articles and stock.
>      >       
>      >       @example
>      >       %rec: Article
>      >      @@ -468,14 +468,14 @@ Title: Article 1
>      >       Id: 2
>      >       Title: Article 2
>      >       
>      >      -%rec: Movement
>      >      +%rec: Stock
>      >       
>      >       Id: 1
>      >       Type: sell
>      >       Date: 20 April 2011
>      >       
>      >       Id: 2
>      >      -Type: acquisition
>      >      +Type: stock
>      >       Date: 21 April 2011
>      >       @end example
>      >       
>      >      @@ -483,12 +483,12 @@ Date: 21 April 2011
>      >       @cindex special fields
>      >       @cindex key, primary key
>      >       @cindex primary key
>      >      -Besides determining the type of the records that follows in the
>      >      +Besides determining the type of record that follows in the
>      >       stream, record descriptors can be used to describe other properties of
>      >      -those records.  That can be done by using the so-called @dfn{special
>      >      -fields}, having special names from a predefined set.  Consider for
>      >      -example the following database, where the descriptor is used to
>      >      -specify a primary key and a mandatory field:
>      >      +those records.  This can be done by using @dfn{special
>      >      +fields}, which have special names from a predefined set.  
>      >      +Consider for example the following database, where record descriptors
>      >      +are used to specify a primary key and a mandatory field:
>      >       
>      >       @cindex @code{%mandatory}
>      >       @cindex mandatory fields
>      >      @@ -559,7 +559,7 @@ Title: Loan
>      >       @end example
>      >       
>      >       @noindent
>      >      -Only one @code{%rec} field shall appear in a record descriptor.  If
>      >      +Only one @code{%rec} field should be in a record descriptor.  If
>      >       there are more it is an integrity violation.  It is highly
>      >       recommended (but not enforced) to place this field in the first
>      >       position of the record descriptor.
>      >      @@ -634,7 +634,7 @@ schema supported by @code{libcurl} will work.
>      >       @cindex restricting fields from records
>      >       @cindex field, forbidden fields
>      >       @cindex prohibited fields
>      >      -Those special field names are used to restrict the fields that can
>      >      +These special field names are used to restrict the fields that can
>      >       appear in the records stored in a database.  Their usage is:
>      >       
>      >       @example
>      >      @@ -643,12 +643,12 @@ appear in the records stored in a database.  Their usage is:
>      >       @end example
>      >       
>      >       @noindent
>      >      -In both cases the list of field names are separated by one or more
>      >      +In both cases the lists of field names are separated by one or more
>      >       blank characters.
>      >       
>      >       @cindex field, compulsory fields
>      >       @cindex field, mandatory fields
>      >      -The fields listed in some @code{%mandatory} entry are
>      >      +The fields listed in a @code{%mandatory} entry are
>      >       mandatory; @ie{}, at least one field with this name shall be present
>      >       in any record of this kind.
>      >       @cindex integrity problems
>      >      @@ -659,10 +659,10 @@ a data integrity failure.
>      >       Consider for example an ``address book'' database where each record
>      >       stores the information associated with a contact.  The records will be
>      >       heterogeneous, in the sense they won't feature exactly the same
>      >      -fields: the contact of an internet shop will probably have an
>      >      address@hidden field, while the entry for our grandmother probably won't.
>      >      -We still want to make sure that every entry has at a field: the name
>      >      -of the contact.  In that case we could use @code{%mandatory} as
>      >      +fields: the contact of an internet shop will probably have a
>      >      address@hidden field, while the entry for our grandmother probably won't.
>      >      +We still want to make sure that every entry has a field with the name
>      >      +of the contact.  In this case, we could use @code{%mandatory} as
>      >       follows:
>      >       
>      >       @example
>      >      @@ -678,8 +678,8 @@ Phone: +98 43434433
>      >       @end example
>      >       
>      >       @noindent
>      >      -Similarly, the fields listed in some @code{%prohibit} entry are
>      >      -forbidden; @ie{}, no field with this name shall be present
>      >      +Similarly, the fields listed in a @code{%prohibit} entry are
>      >      +forbidden; @ie{}, no field with this name should be present
>      >       in any record of this kind.  Again, records violating this restriction
>      >       are invalid.
>      >       
>      >      @@ -721,16 +721,16 @@ usage is:
>      >       @end example
>      >       
>      >       @noindent
>      >      -The list of field names are separated by one or more blank characters.
>      >      +The lists of field names are separated by one or more blank characters.
>      >       
>      >       @cindex unique fields
>      >      -The @code{%unique} special field allows one to declare fields as unique,
>      >      +The @code{%unique} special field allows us to declare fields as unique,
>      >       meaning there cannot exist more than one field with the same name in a
>      >       single record.
>      >       
>      >       For example, an entry in an address book database could contain an
>      >       @code{Age} field.  It does not make sense for a single person to be of
>      >      -several ages, so that field could be declared as ``unique'' in the
>      >      +several ages. So, a field could be declared as ``unique'' in the
>      >       corresponding record descriptor as follows:
>      >       
>      >       @example
>      >      @@ -744,13 +744,13 @@ Several @code{%unique} fields can appear in the same record
>      >       descriptor.  The set of unique fields is the union of all the entries.
>      >       
>      >       @code{%key} makes the referred field the primary key of the record
>      >      -set.  Its effect is that any field with that name must be both unique
>      >      -and mandatory, and additionally the values of those fields shall be
>      >      +set.  As effect, any field with that name must be both unique
>      >      +and mandatory, and additionally, the values of those fields shall be
>      >       unique in the context of the record set.  This closely corresponds to
>      >       the notion of ``primary key'' usually implemented in relational
>      >       systems.
>      >       
>      >      -Consider for example a database of items in a stockage.  Each item is
>      >      +Consider for example a database of items in stock.  Each item is
>      >       identified by a numerical @code{Id} field.  No item may have more than
>      >       one @code{Id}, and no items may exist without an associated
>      >       @code{Id}.  Additionally, no two items may share the same @code{Id}.
>      >      @@ -770,12 +770,12 @@ Title: Sticker big
>      >       @end example
>      >       
>      >       @noindent
>      >      -It would not make sense to have several primary keys in a record set,
>      >      -and thus it is not allowed to have several @code{%key} fields in the
>      >      +It would not make sense to have several primary keys in a record set.
>      >      +Thus, it is not allowed to have several @code{%key} fields in the
>      >       same record descriptor.  
>      >       @cindex integrity problems
>      >      -That is a data integrity
>      >      -violation and will be reported by a checking tool.
>      >      +This would be a data integrity
>      >      +violation, and will be reported by a checking tool.
>      >       
>      >       @node %doc
>      >       @section %doc
>      >      @@ -783,14 +783,14 @@ violation and will be reported by a checking tool.
>      >       @cindex @code{%doc}
>      >       @cindex documentation fields
>      >       This field contains documentation about the record.   It is similar to a
>      >      -comment (@pxref{Comments}), but this field can be managed in a programmatic
>      >      -way easier.
>      >      +comment (@pxref{Comments}), but it can be managed easier 
>      >      +in a programmatic way.
>      >       
>      >       Unlike a comment, @code{%doc} fields are recognized by tools such as
>      >      address@hidden (@pxref{recinf}) which process record descriptors.
>      >      address@hidden (@pxref{recinf}) which processes record descriptors.
>      >       It is a good idea to use the @code{%doc} field to provide a description
>      >      -of the records; typically a description more verbose than the name provided
>      >      -by the  @code{%rec} field.
>      >      +of the records; typically a description more verbose than the name provided
>      >      +by the @code{%rec} field.
>      >       For example, you might have two record sets with @code{%rec} and @code{%doc}
>      >       fields as follows:
>      >       
>      >      @@ -837,7 +837,7 @@ person. @code{Name} will never use several lines. @code{Age} will
>      >       typically be in the range @code{0..120}, and there are only a few
>      >       valid values for @code{MaritalStatus}: single, married and widow.
>      >       Phones may be restricted to some standard format as well to be valid.
>      >      -All those restrictions (and many others) can be enforced by using
>      >      +All these restrictions (and many others) can be enforced by using
>      >       @dfn{field types}.
>      >       
>      >       There are two kind of field types: @dfn{anonymous} and @dfn{named}.  Those are
>      >      @@ -888,7 +888,7 @@ it is a good idea to consistently follow some convention to help
>      >       distinguishing type names from field names.  For example, the
>      >       @code{_t} suffix could be used for types.
>      >       
>      >      -A type can be declared to be a synonym of another type.  The syntax
>      >      +A type can be declared to be an alias for another type.  The syntax
>      >       is:
>      >       
>      >       @example
>      >      @@ -907,8 +907,9 @@ descriptions.  For example, consider the following example:
>      >       @end example
>      >       
>      >       @noindent
>      >      -Both @code{Item_t} and @code{Transaction_t} are synonyms for the type
>      >      address@hidden  They all are numeric identifiers.
>      >      +Both @code{Item_t} and @code{Transaction_t} are aliases for the type
>      >      address@hidden Which is in place an alias for the type @code{int}.
>      >      + So, they are both numeric identifiers.
>      >       
>      >       The order of the @code{%typedef} fields is not relevant.  In
>      >       particular, a type definition can reference other type that is defined
>      >      @@ -922,10 +923,10 @@ below.  The previous example could have been written as:
>      >       
>      >       @noindent
>      >       @cindex integrity problems
>      >      -Integrity checks will complain if undefined types are referenced, and
>      >      -if there are loops (direct or indirect) in type declarations.  For
>      >      -example, the following set of declarations contains a loop and are
>      >      -thus invalid:
>      >      +Integrity check will complain if undefined types are referenced. As well as when any aliases up referencing back (looping back
>      >      +directly or indirectly) in type declarations.  For
>      >      +example, the following set of declarations contains a loop.
>      >      +Thus, it's invalid:
>      >       
>      >       @example
>      >       %typedef: A_t B_t
>      >      @@ -981,7 +982,7 @@ without having to use a @code{%typedef} in the following way:
>      >       @subsection Scalar types
>      >       
>      >       The rec format supports the declaration of fields of the following
>      >      -scalar types: integer numbers, ranges and reals.
>      >      +scalar types: integer numbers, ranges and real numbers.
>      >       
>      >       @cindex integers
>      >       Signed @dfn{integers} are supported by using the @code{int}
>      >      @@ -994,9 +995,9 @@ declaration:
>      >       @cindex hexadecimal
>      >       @cindex octal
>      >       @noindent
>      >      -Given that declaration, Fields of type @code{Id_t} must contain
>      >      -integers, that may be negative.  Hexadecimal values can be written
>      >      -using the @code{0x} prefix, and octal values use an extra
>      >      +Given the declaration above, fields of type @code{Id_t} must
>      >      +contain integers, and they may be negative.  Hexadecimal values can be written
>      >      +using the @code{0x} prefix, and octal values using an extra
>      >       @code{0}. Valid examples are:
>      >       
>      >       @example
>      >      @@ -1011,7 +1012,7 @@ Id: 020
>      >       @cindex ranges
>      >       @noindent
>      >       Sometimes it is desirable to reduce the @dfn{range} of integers allowed in a
>      >      -field.  That can be achieved by using a range type declaration:
>      >      +field.  This can be achieved by using a range type declaration:
>      >       
>      >       @example
>      >       %typedef: Percentage_t range 0 100
>      >      @@ -1039,7 +1040,8 @@ ten, like for example:
>      >       @cindex fractions
>      >       @cindex floating point numbers
>      >       @noindent
>      >      address@hidden fields can be declared with the @code{real} type specifier.
>      >      address@hidden number fields can be declared with the @code{real} type
>      >      +specifier.
>      >       A wide range of real numbers can be represented this way, only limited
>      >       by the underlying floating point representation.
>      >       @cindex decimal separator  
>      >      -- 
>      >      1.7.2.5



