Re: [h5md-user] units module


From: Konrad Hinsen
Subject: Re: [h5md-user] units module
Date: Thu, 17 Oct 2013 15:16:40 +0200

Pierre de Buyl writes:

 > So, to get back to Peter's message:
 > 
 > I propose that we follow the udunits grammar, restricted in the same way
 > as Mosaic.
 > For reference, Mosaic's definition is
 > """
 > The value of the units field is a text string in ASCII encoding. It contains a
 > sequence of unit factors separated by a space. A unit factor is a unit symbol
 > optionally followed by a non-zero integer which indicates the power to which
 > this factor is taken.
 > """
 > 
 > I would remove the constants defined ("c" and "Nav"), however.

The current unit list is a first draft, to be revised before version
1.0 of Mosaic. You are completely right about "Nav", which is the same
as "mol" and thus redundant. However, "c" frequently occurs in derived
units, e.g. "cm-1 c" for frequency, which is heavily used in
spectroscopy.
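
The grammar quoted above is small enough to sketch as a parser. The
following is only an illustration, not Mosaic's or H5MD's actual code: the
name parse_units and the restriction of symbols to ASCII letters are
assumptions made for this example.

```python
import re

# One unit factor: a symbol, optionally followed by a non-zero integer power.
# Restricting symbols to ASCII letters is an assumption of this sketch.
_FACTOR = re.compile(r"([A-Za-z]+)(-?[1-9][0-9]*)?")


def parse_units(text):
    """Parse a unit string into a list of (symbol, power) pairs."""
    if not text.isascii():
        raise ValueError("unit string must be ASCII")
    factors = []
    # Factors are separated by exactly one space.
    for token in text.split(" "):
        match = _FACTOR.fullmatch(token)
        if match is None:
            raise ValueError(f"invalid unit factor: {token!r}")
        symbol, power = match.groups()
        # A missing power means the factor is taken to the power 1.
        factors.append((symbol, int(power) if power else 1))
    return factors
```

With this sketch, parse_units("cm-1 c") yields [("cm", -1), ("c", 1)],
while a zero power such as "m0" is rejected.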

 > We may want to add "a unit string must be parseable by udunits"?

The problem with that statement is that we don't control udunits. In
general, it's not a good idea to define a data format by the
capabilities of a piece of software. It's fine to have such a comment as
a statement of intention, of course.


Felix Höfling writes:

 > I find udunits' grouping into SI-base units, SI-derived units etc. very  
 > reasonable. Let's keep it for H5MD rather than introducing a different  
 > subset.

That was my original idea for Mosaic, but I changed my mind for the
following reasons:

1) The point of having a restricted set of units is to permit error
   checking. Allowing a unit that is more likely to be a typo than
   a deliberate choice is ultimately of no benefit. A general-purpose library
   such as udunits can't limit the allowed units, but a domain-specific
   format such as Mosaic can.

2) The distinction between SI-base and SI-derived is logical for a
   metrologist, but irrelevant for practical use. I don't expect the
   SI base units alone to be sufficient for much of molecular data, if
   only because of the lack of a dedicated energy unit.

3) Fewer units means a reduced risk of errors if automatic conversion
   is attempted (see below).
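
Point 1) can be illustrated with a short check against a restricted set.
This is a sketch under stated assumptions: ALLOWED_UNITS is a hypothetical
list chosen for the example, not the actual Mosaic or H5MD unit list, and
unknown_symbols is a made-up helper name.

```python
# Hypothetical restricted set; NOT the actual Mosaic or H5MD unit list.
ALLOWED_UNITS = {"nm", "ps", "K", "kJ", "mol", "cm", "c"}


def unknown_symbols(unit_string, allowed=ALLOWED_UNITS):
    """Return the unit symbols in unit_string that are not in the allowed set."""
    unknown = []
    for factor in unit_string.split(" "):
        # Drop the trailing integer power, e.g. "mol-1" -> "mol".
        symbol = factor.rstrip("-0123456789")
        if symbol not in allowed:
            unknown.append(symbol)
    return unknown
```

Here "nm pm-1", a plausible typo for "nm ps-1", is flagged because "pm" is
not in the set. A parser that accepts every SI prefix would read "pm" as
picometre and let the mistake pass.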

 > Actually, whether a reader can "understand" a small or large set of units
 > is mainly a matter of the database defining the units. Am I overlooking
 > something here? Why not copy the full list from udunits?

See 1) above.

 > BTW, a more advanced functionality that discriminates between "simple" and  
 > "advanced" readers is automatic conversion between units ...

Indeed, but conversion is a very tricky business. SI has two traps for
unit converters:

 - Dimensionless units: rad, sr, and mol

   Is pi dimensionless or measured in rad? Both make sense, and automatic
   conversion needs to know which convention was used.

   I am actually considering removing "rad" from the allowed units in
   Mosaic and making "deg" a dimensionless constant equal to pi/180.
   That's much closer to the reality of unit use in computational chemistry
   than the SI system.

 - Dimensionally equal but incompatible units: 1/s, Hz, Bq

   It's OK to convert Hz and Bq to 1/s, but not into each other.
   Converting 1/s to Hz or Bq is in general not allowed, because 1/s
   alone does not say whether it counts cycles or decays. The problem
   disappears if Hz and Bq are not allowed.
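
Both traps can be sketched in a few lines. This is an illustration only:
the names DEG, RATE_UNITS, ALLOWED_CONVERSIONS and convert_rate are
hypothetical and do not come from Mosaic, H5MD or udunits.

```python
import math

# With "rad" dropped, "deg" is a plain dimensionless number: 1 deg = pi/180.
DEG = math.pi / 180

# The three units are dimensionally equal; the numerical factor is always 1.
RATE_UNITS = {"1/s", "Hz", "Bq"}

# Conversions are one-way: a specific unit may be widened to 1/s, never back.
ALLOWED_CONVERSIONS = {("Hz", "1/s"), ("Bq", "1/s")}


def convert_rate(value, source, target):
    """Convert a value between 1/s, Hz and Bq, refusing unsafe conversions."""
    if source not in RATE_UNITS or target not in RATE_UNITS:
        raise ValueError("unknown rate unit")
    if source == target or (source, target) in ALLOWED_CONVERSIONS:
        return value
    raise ValueError(f"conversion from {source} to {target} is not allowed")
```

In this sketch 90 * DEG equals pi/2, convert_rate(50.0, "Hz", "1/s")
returns the value unchanged, and both Hz to Bq and 1/s to Hz raise an
error. If a format does not allow Hz and Bq at all, the conversion table
is not needed.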

Konrad.
-- 
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: http://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
---------------------------------------------------------------------


