Re: [h5md-user] Pre-averaged observables


From: Pierre de Buyl
Subject: Re: [h5md-user] Pre-averaged observables
Date: Wed, 29 May 2013 10:04:23 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

On Wed, May 15, 2013 at 12:10:13PM +0200, Felix Höfling wrote:
> sometimes, one wants to store pre-averaged observables, i.e.
> accumulated over a certain time span. For example, compute the
> pressure every 1000 steps and compute the mean from 10 values, i.e.
> writing the data only every 10000 steps. Such a functionality is
> provided by LAMMPS and recently also by HALMD.
> 
> http://lammps.sandia.gov/doc/fix_ave_time.html
> http://halmd.org/modules/observables/utility/accumulator.html

The idea is interesting.
 
> Now my question: how shall such data be stored in the H5MD
> observables group? Along with the mean value, one would like to
> store also the standard error (or the variance) and the number of
> accumulated values. One scheme would be to distribute this
> information over several groups under the roof of the observable's
> name:
> 
> obs1
>   \-- mean
>   |    +-- count
>   |    \-- value
>   |    \-- step
>   |    \-- time
>   |
>   \-- error_of_mean
>   |    +-- count
>   |    \-- value
>   |    \-- step
>   |    \-- time
>   |
>   \-- count
>        +-- count
>        \-- value
>        \-- step
>        \-- time
> 
> The obvious drawback is that the structure is pretty nested and that
> pre-averaged observables have a disjoint structure from plain
> observables, e.g., the mean value is obs1/mean/value in one case and
> obs1/value in the other. Further, the step/time fields show up
> repeatedly (although they may link each other.)
> 
> A second scheme would extend the existing value/step/time triple to
> include the error and the number:
> 
> obs1
>   +-- count
>   \-- value
>   \-- error
>   \-- count/number/samples ???
>   \-- step
>   \-- time
> 
> This scheme appears more natural to me and I would prefer it. In
> addition, one may add "variance" and "standard_deviation". There is,
> however, a naming clash between the attribute or dataset "count" for
> the number of particles and the number of accumulated
> values/samples.
> 
> Nicolas Höft noted on the halmd-devel mailing list that "count" for
> the number of particles is not very descriptive, may we change it to
> "size" or "number"?
> http://article.gmane.org/gmane.science.simulation.halmd.devel/292
> 
> The whole issue may be beyond the current release candidate. I
> mainly would like to hear your opinion at an early stage.

It seems premature to me as well. Still, as far as early opinions go, I prefer
the second scheme: the additional datasets can all be optional, so one can read
step/time/value as usual and query the other datasets only when they are
present. I think that all "extra" features should leave the basic organization
untouched.
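
To illustrate the second scheme, here is a minimal sketch in plain Python. It
is not H5MD code: a dictionary stands in for the observable's group, and the
dataset names "error" and "samples" are only placeholders, since the naming is
still open. Stamping each block with the step of its last sample is also an
assumption.

```python
import math
import statistics


def pre_average(samples, steps, timestep, block):
    """Reduce raw samples to one mean per `block` consecutive values.

    The returned dict mimics the datasets of one observable group.
    "step", "time" and "value" form the usual triple; "error" and
    "samples" are the optional extras (placeholder names, not H5MD).
    """
    obs = {"step": [], "time": [], "value": [], "error": [], "samples": []}
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        # Assumption: a block is stamped with the step of its last sample.
        last_step = steps[start + block - 1]
        obs["step"].append(last_step)
        obs["time"].append(last_step * timestep)
        obs["value"].append(statistics.fmean(chunk))
        # Standard error of the mean: sample standard deviation / sqrt(n).
        obs["error"].append(statistics.stdev(chunk) / math.sqrt(block))
        obs["samples"].append(block)
    return obs


def read_basic(obs):
    """A reader that knows only step/time/value ignores the extras."""
    return list(zip(obs["step"], obs["time"], obs["value"]))


def read_error(obs):
    """Query the optional dataset; return None when it is absent."""
    return obs.get("error")


# Pressure sampled every 1000 steps, mean written every 10000 steps.
steps = [1000 * (i + 1) for i in range(20)]
pressure = [1.0 + 0.01 * (i % 10) for i in range(20)]  # made-up data
obs1 = pre_average(pressure, steps, timestep=0.001, block=10)
```

A reader written for plain observables calls only read_basic() and works
unchanged, while a reader that cares about the statistics checks for the
extra datasets first.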

Regards,

Pierre


