Re: hdf5

From: Daniel Heiserer
Subject: Re: hdf5
Date: Thu, 17 Aug 2000 08:54:21 +0200

Hi address@hidden,

see all below please.

> >I have tons of matrices I want to store and retrieve from
> >the 'database'. Is hdf5 the right thing or can it only be
> >used for simple loading and saving data in a sequential way?
> HDF5 is certainly not limited to loading and saving data in a sequential
> way.  It supports essentially random read/write access, including efficient
> support for reading/writing subsets (hyperslabs) of a large matrix.  (In
> many ways, HDF5 is like a mini-filesystem.)  Not that you would want to
> replace a[x][y][z] with a single-element HDF5 read inside a critical loop.

Of course I do not want to do that.

> On the other hand, it is not the same as a real database if you want to do
> queries across millions of little objects, have fine-grained transaction
> atomicity, etcetera.  I personally don't have any experience in using HDF5
> for anything approaching that (it's not what it is designed for).
> If you want more details, a better place to ask is the HDF5 developers (who
> provide lots of documentation, and source code too)
>
> Cordially,
> Steven G. Johnson
> PS. Someone also mentioned Octave's native file format.  This is
> essentially just a straight binary dump of the data.  That has the
> advantage of being simple, but you have to read and write *everything*.  Even random
> reads are not possible (although a random-access directory could easily be
> compiled if necessary), but random writes are even more difficult (e.g. if
> some object is deleted or changes size).

That is exactly what I am looking for. I would like to add
a new 'out of core' datatype, just like the ones we have, but kept
entirely on disk. I am dealing with many huge matrices. I am happy if
one of them fits into core, or, to put it another way, if at least one
column fits in core!
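
To make the "at least one column fits in core" idea concrete, here is a
minimal sketch in Python (standard library only). It is not Octave or
HDF5 code. The file layout (a 16-byte header followed by column-major
doubles) and the names write_matrix and read_column are assumptions made
purely for illustration. The point is only that a single column can be
fetched with one seek and one read, without loading the whole matrix.

```python
import os
import struct
import tempfile

DOUBLE = struct.Struct("<d")  # one little-endian 8-byte float
HEADER = struct.Struct("<qq")  # hypothetical header: n_rows, n_cols


def write_matrix(path, rows):
    """Write a dense matrix (list of rows) to disk in column-major order."""
    n_rows, n_cols = len(rows), len(rows[0])
    with open(path, "wb") as f:
        f.write(HEADER.pack(n_rows, n_cols))
        for j in range(n_cols):
            for i in range(n_rows):
                f.write(DOUBLE.pack(rows[i][j]))


def read_column(path, j):
    """Read only column j by seeking past the header and earlier columns."""
    with open(path, "rb") as f:
        n_rows, n_cols = HEADER.unpack(f.read(HEADER.size))
        if not 0 <= j < n_cols:
            raise IndexError("column index out of range")
        f.seek(HEADER.size + j * n_rows * DOUBLE.size)
        data = f.read(n_rows * DOUBLE.size)
    return list(struct.unpack("<%dd" % n_rows, data))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "matrix.bin")
    write_matrix(demo_path, [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    print(read_column(demo_path, 1))
```

Column-major order is chosen here so that each column is contiguous on
disk. A sparse matrix would need a different layout (for example, one
record per column holding row indices and values), which this sketch
does not attempt.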

Octave is an excellent numerical package, the best I know.
But it lacks high-end sparse matrix support.
It is not that sparse matrices couldn't be added; Andy Adler
did a great job there. The thing is that I am talking about BIG
matrices, 1e7 x 1e7 sparse. Once it comes to this, you have to add
out-of-core capability, because no one wants to buy that much memory....
In this segment, efficient out-of-core handling is extremely
important. One thing you need is a database or something like it
to store your data. I would just like to know if there is already
some database-like software available that allows random access,
overwriting existing files, and deleting them.
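
As a rough illustration of the three operations asked for (random
access, overwriting, deleting), here is a toy sketch in Python using
only the standard library. The class name MatrixStore, its methods, and
the record format are all hypothetical. It is not an existing package
and it is far too simple for real use: the directory is kept in memory
only, and overwritten or deleted records leave dead space in the file
until someone compacts it.

```python
import os
import struct
import tempfile


class MatrixStore:
    """Toy append-only store: named arrays of doubles in a single file.

    The directory maps name -> (byte offset, element count). It lives in
    memory only; a real store would persist it and reclaim dead space.
    """

    def __init__(self, path):
        self.path = path
        self.directory = {}
        open(path, "ab").close()  # create the file if it is missing

    def put(self, name, values):
        """Store or overwrite an entry by appending a fresh record."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(struct.pack("<%dd" % len(values), *values))
        self.directory[name] = (offset, len(values))

    def get(self, name):
        """Random read: seek straight to the record and read only it."""
        offset, count = self.directory[name]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return list(struct.unpack("<%dd" % count, f.read(8 * count)))

    def delete(self, name):
        """Drop the directory entry; the bytes stay in the file."""
        del self.directory[name]


if __name__ == "__main__":
    store = MatrixStore(os.path.join(tempfile.mkdtemp(), "store.bin"))
    store.put("A", [1.0, 2.0])
    store.put("A", [9.0])  # overwrite: the old record becomes dead space
    print(store.get("A"))
```

Appending on overwrite keeps the code short and avoids moving data when
an entry changes size, at the cost of a file that only grows. That is
exactly the difficulty with random writes mentioned in the quoted PS.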

If someone is interested in that subject please let me know.

> PPS. Of course, the current Octave interface for writing HDF5 only supports
> writing the whole file, not appending to or changing a few matrices in an
> existing file...although this could be trivially changed, e.g. adding a
> "-a" flag.  Also, the way the Octave load function works (it assumes
> stream-based, non-random-access files), it actually reads every variable in
> the entire file and throws away the variables it doesn't want.  Again, this
> could be easily changed if desired for HDF5 loading since the format/API
> support random access.

Thanks, Daniel

Octave is freely available under the terms of the GNU GPL.