On Fri, Dec 06, 2013 at 05:45:15PM +0100, Felix Höfling wrote:
> In conclusion, there seems to be a preference for fixed-length strings in
> h5py, but a preference for VL Strings in H5MD. Further, a novice user may
> easily mess up the string types, e.g., this issue is discussed in "Special
> topics" in the h5py manual ... not for beginners. This made me wonder
> again if forcing strings to be of VL type is a good idea, but I know that
> this is a majority opinion.

The novice user would probably store the wrong string type either way.

import h5py

f = h5py.File("/tmp/vlstring.h5", "w")
f.attrs["unit"] = "nm"  # scalar string
f.attrs["boundary"] = ["periodic", "periodic", "nonperiodic"]  # array of strings
f.close()  # flush to disk so that h5dump can read the file

HDF5 "vlstring.h5" {
GROUP "/" {
   ATTRIBUTE "boundary" {
      DATATYPE  H5T_STRING {
         STRSIZE 11;
         STRPAD H5T_STR_NULLPAD;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
      DATA {
      (0): "periodic\000\000\000", "periodic\000\000\000", "nonperiodic"
      }
   }
   ATTRIBUTE "unit" {
      DATATYPE  H5T_STRING {
         STRSIZE H5T_VARIABLE;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
      DATA {
      (0): "nm"
      }
   }
}
}

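As the dump shows, h5py stores a scalar string with a variable-length
string type, but an array of strings with a fixed-length string type
(here 11 bytes, the length of the longest element, with the shorter
elements null-padded). So if you are a novice user, you are better off
using pyh5md.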
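
For anyone who wants to stay with plain h5py, here is a minimal sketch of
how the "boundary" attribute could be forced to a variable-length string
type. It assumes that h5py.special_dtype(vlen=str) and attrs.create() with
an explicit dtype behave this way in your h5py version; the file name and
the in-memory "core" driver are only chosen for illustration, and the type
check goes through the low-level h5a API.

```python
import h5py
import numpy as np

# Assumption: special_dtype(vlen=str) describes a variable-length string
# type; newer h5py releases also provide string_dtype() for this purpose.
vlen_str = h5py.special_dtype(vlen=str)

# Illustrative in-memory file: the core driver with backing_store=False
# writes nothing to disk.
f = h5py.File("vlstring_explicit.h5", "w", driver="core", backing_store=False)

# Pass the strings as an object array and request the VL type explicitly,
# instead of letting h5py pick a type for the list.
boundary = np.array(["periodic", "periodic", "nonperiodic"], dtype=object)
f.attrs.create("boundary", data=boundary, dtype=vlen_str)

# Inspect the HDF5 type that was actually stored.
attr_type = h5py.h5a.open(f.id, b"boundary").get_type()
is_vlen = attr_type.is_variable_str()
print("boundary is variable-length:", is_vlen)

f.close()
```

If this works as assumed, h5dump on a file written this way should report
STRSIZE H5T_VARIABLE for "boundary" as well.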
Peter