h5md-user

Re: [h5md-user] Pre-averaged observables


From: Felix Höfling
Subject: Re: [h5md-user] Pre-averaged observables
Date: Wed, 29 May 2013 12:01:23 +0200
User-agent: Opera Mail/12.15 (Linux)

On 29 May 2013 at 10:04, Pierre de Buyl <address@hidden> wrote:

Hi,

On Wed, May 15, 2013 at 12:10:13PM +0200, Felix Höfling wrote:
Sometimes one wants to store pre-averaged observables, i.e.,
observables accumulated over a certain time span. For example, compute
the pressure every 1000 steps and take the mean of 10 values, i.e.,
write the data only every 10000 steps. Such functionality is
provided by LAMMPS and, more recently, by HALMD.

http://lammps.sandia.gov/doc/fix_ave_time.html
http://halmd.org/modules/observables/utility/accumulator.html
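
As a rough illustration of the pre-averaging described above, here is a minimal Python sketch. It is not the LAMMPS or HALMD implementation; the names `Accumulator` and `pre_average` are hypothetical, and the running statistics use Welford's algorithm as an assumed choice.

```python
import math


class Accumulator:
    """Running mean and variance via Welford's algorithm (hypothetical helper)."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def error_of_mean(self):
        # Standard error of the mean, based on the unbiased sample variance.
        if self.count < 2:
            return math.nan
        variance = self._m2 / (self.count - 1)
        return math.sqrt(variance / self.count)


def pre_average(samples, every=10):
    """Yield (mean, error_of_mean, count) for each full block of `every` samples."""
    acc = Accumulator()
    for x in samples:
        acc.add(x)
        if acc.count == every:
            yield acc.mean, acc.error_of_mean(), acc.count
            acc.reset()
```

With one sample taken every 1000 steps and `every=10`, each yielded triple corresponds to one record written every 10000 steps.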

The idea is interesting.

Now my question: how should such data be stored in the H5MD
observables group? Along with the mean value, one would also like to
store the standard error (or the variance) and the number of
accumulated values. One scheme would be to distribute this
information over several groups under the roof of the observable's
name:

obs1
  \-- mean
  |    +-- count
  |    \-- value
  |    \-- step
  |    \-- time
  |
  \-- error_of_mean
  |    +-- count
  |    \-- value
  |    \-- step
  |    \-- time
  |
  \-- count
       +-- count
       \-- value
       \-- step
       \-- time

The obvious drawback is that the structure is pretty nested and that
pre-averaged observables have a disjoint structure from plain
observables, e.g., the mean value is obs1/mean/value in one case and
obs1/value in the other. Further, the step/time fields show up
repeatedly (although they may be linked to one another).

A second scheme would extend the existing value/step/time triple to
include the error and the number:

obs1
  +-- count
  \-- value
  \-- error
  \-- count/number/samples ???
  \-- step
  \-- time

This scheme appears more natural to me and I would prefer it. In
addition, one may add "variance" and "standard_deviation". There is,
however, a naming clash between the attribute or dataset "count" for
the number of particles and the number of accumulated
values/samples.
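
To make the second scheme concrete, here is a minimal sketch that models the observable as parallel arrays, one entry per written record. Plain Python lists stand in for HDF5 datasets; the helper `append_block` and the dataset name `count` for the number of samples are assumptions for illustration, not part of any H5MD release.

```python
import math
import statistics


def append_block(obs, step, time, samples):
    """Append one pre-averaged record to a scheme-two observable (hypothetical helper)."""
    n = len(samples)
    obs["step"].append(step)
    obs["time"].append(time)
    obs["value"].append(statistics.mean(samples))
    # Standard error of the mean; undefined for fewer than two samples.
    error = statistics.stdev(samples) / math.sqrt(n) if n > 1 else math.nan
    obs["error"].append(error)
    obs["count"].append(n)


# All datasets share the same leading index, as value/step/time already do.
obs1 = {name: [] for name in ("step", "time", "value", "error", "count")}
append_block(obs1, step=10000, time=10.0, samples=[1.0, 2.0, 3.0, 4.0])
```

Because every dataset is indexed by the same record number, the mean stays at obs1/value exactly as for a plain observable.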

Nicolas Höft noted on the halmd-devel mailing list that "count" for
the number of particles is not very descriptive. May we change it to
"size" or "number"?
http://article.gmane.org/gmane.science.simulation.halmd.devel/292

The whole issue may be beyond the current release candidate. I
would mainly like to hear your opinion at an early stage.

It seems premature to me, too. As far as early opinions are
concerned, I prefer the second scheme, in which all of this can be
optional: one can read step/time/value as usual and query the other
datasets where appropriate. I think that all "extra" features should
leave the basic organization untouched.
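
A minimal sketch of what such a reader could look like, assuming the optional datasets are named `error` and `count` (names from the second scheme, not from a released H5MD specification). The function `read_observable` is hypothetical; it only needs a mapping-like group, so plain dicts are used here in place of an HDF5 group.

```python
def read_observable(group):
    """Read the mandatory triple, then any optional datasets that are present."""
    data = {name: group[name][:] for name in ("step", "time", "value")}
    for name in ("error", "count"):  # optional, assumed names
        if name in group:
            data[name] = group[name][:]
    return data


# A plain observable and a pre-averaged one are read by the same code path.
plain = {"step": [0, 1000], "time": [0.0, 1.0], "value": [1.2, 1.3]}
averaged = dict(plain, error=[0.05, 0.04], count=[10, 10])

plain_data = read_observable(plain)
averaged_data = read_observable(averaged)
```

A reader that knows nothing about pre-averaging can ignore the extra keys and still obtain step, time, and value unchanged.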

Regards,

Pierre


Hi Pierre,

Thanks for the feedback. So we will keep the optional fields "error",
"count", etc. in mind until a later revision and may test/use them
privately until then. To avoid the name clash, I will change the
attribute "count" to "particles".

Felix


