h5md-user

Re: [h5md-user] Writing vs reading


From: Felix Höfling
Subject: Re: [h5md-user] Writing vs reading
Date: Fri, 24 Jan 2014 09:58:29 +0100
User-agent: Opera Mail/12.16 (Linux)

On 23.01.2014, 19:29, Olaf Lenz <address@hidden> wrote:

Hi!

2014/1/14 Pierre de Buyl <address@hidden>

> The most important feature for me is that any given combination of
> data arrays has a clear and unambiguous meaning. That criterion still
> leaves a lot of freedom, where as you say the question is whose life
> we want to simplify most.


I'd like to add some thoughts. It is easiest to discuss this with an actual
example, so I'd like to review the discussion on how to store positions.

First of all, as Felix remarked, the most important criterion for h5md
seems to be **flexibility**: h5md should allow people to express everything
they could possibly want. We do not want to sacrifice any flexibility in
exchange for reader-friendliness. For example, in the case of the particle
positions we deemed it important that the number of particles can
change, even though this makes writing and reading them more complex.

What remains open is the question of what happens in those cases where
there is more than one way to express the same thing, i.e. where there are
several alternative representations of it. As an example, think
of writing the positions in periodic boundary conditions. Currently, we
allow for three methods to store the position: using the "image", not using
the image, forcing the positions to be within the central image, or not.
All of these variants are equivalent: one can always transform between the
different variants. This is the "writer-friendly" approach.
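To make the transformation between variants concrete, here is a minimal sketch in Python. It assumes an orthorhombic box with its origin at 0 and edge lengths given per component; the function names `to_absolute` and `to_folded` are hypothetical and not part of the H5MD spec or any library.

```python
import math

def to_absolute(folded, image, box):
    """Combine a folded position and its image counters into an absolute position."""
    return [x + n * length for x, n, length in zip(folded, image, box)]

def to_folded(absolute, box):
    """Split an absolute position into a position in [0, length) and image counters."""
    folded, image = [], []
    for x, length in zip(absolute, box):
        n = math.floor(x / length)      # number of box lengths to shift by
        folded.append(x - n * length)   # position inside the central image
        image.append(n)
    return folded, image

# Assumed example values: a cubic box of edge length 10.
box = [10.0, 10.0, 10.0]
folded, image = to_folded([12.5, -0.5, 3.0], box)
absolute = to_absolute(folded, image, box)
print(folded, image, absolute)
```

A reader that wants absolute positions would call something like `to_absolute` when the image is present and use the stored values unchanged otherwise. Without the image, a folded position cannot be turned back into the original absolute one.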

For a reader, this makes the file more complex to read. In a
"reader-friendly" approach, we would allow only one of these variants, e.g.
not using the "image" at all, and always storing the absolute position,
even if it is outside the central image. For a reader, that would
definitely be the simplest variant, as they can always transform it into
their own model. So I wonder whether it wouldn't be best to simply throw
out the "image" from the specs.

Olaf


The situation is indeed somewhat confusing because of the various
possibilities. Note that this arbitrariness is now clearly expressed in
the spec (since a recent update):

        "..., the data indicate for each particle the component k of the
        absolute position in space of an arbitrary periodic image of that
        particle."

As far as I remember, the main reason for adding the image is that the
simulation program can dump the data directly from memory, without sorting
or extending positions to absolute values. This is desirable because of:

i) Folding and unfolding particle positions comes with round-off errors,
in particular for large box sizes (which become more accessible with
progress in computing power). This means that the last digits are simply
thrown away. As a consequence, H5MD could not serve for storing simulation
snapshots that allow for faithfully resuming the simulation.
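The round-off in point i) can be illustrated with a small Python sketch. The numbers (coordinate 0.1, box length 1000, image 5) are assumed purely for illustration, and `unfold` and `fold` are hypothetical helper names for a one-dimensional box starting at 0.

```python
import math

def unfold(x_folded, image, box_length):
    """Absolute coordinate from a folded coordinate and its image counter."""
    return x_folded + image * box_length

def fold(x_abs, box_length):
    """Folded coordinate in [0, box_length) and image counter from an absolute one."""
    image = math.floor(x_abs / box_length)
    return x_abs - image * box_length, image

x = 0.1              # in-memory coordinate of the simulation (assumed value)
box_length = 1000.0  # assumed box edge length
image = 5            # assumed image counter

x_abs = unfold(x, image, box_length)          # rounded to the float spacing near 5000
x_back, image_back = fold(x_abs, box_length)  # what a restart would read back
error = x_back - x

print(repr(x), repr(x_back), error)
```

The image counter survives the round trip, but the folded coordinate comes back with its last bits changed, because the sum `x + image * box_length` has fewer bits left for the fractional part. Storing the folded position together with the image avoids this step entirely.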

ii) Performance when writing. I consider this aspect not important: disc
access is slow anyway and should be minimised. Further, it seems better to
let the (potentially parallelised) writer software do these operations
rather than a reader.

In general, I think that the spec has reached a relatively broad consensus
now. I would prefer to mainly fine-tune the wording and avoid drastic
changes. I feel that dropping the image field would be of the latter kind.

Felix


