bug-gnu-utils

problem with GDBM 1.7.3/1.8


From: Tom Asprey
Subject: problem with GDBM 1.7.3/1.8
Date: Wed, 03 Apr 2002 13:41:12 MST

Gentlemen,

I do not have the time to put together a test case for this but thought
you might be interested anyway.

Since I was able to come up with a work-around to fix my problem with GDBM,
I do not need a fix, but thought my experience might give you some clues.
I'm hoping that if I describe what was happening and how I worked
around it, you might easily be able to come up with a test case for it.
This issue looks related to the ChangeLog entry (in the 1.7.3 ChangeLog) for
Sun Mar 13 22:56:10 1994 about the file bloating problem that the author
mentions but had not experienced.

My application uses the perl GDBM_File module to tie a hash to the GDBM DB.
While my record lengths are multiples of 512 bytes, they are variable length
and can vary widely.  The problem occurred in some specific cases where this
variation was extreme.

The situation I debugged the most was a case with ~2000 records, with one
record of about 300 KBytes and most of the other records at 512 or 1024
bytes.  The DB file ended up being over 100 MBytes.  This did not make sense
to me given the amount of data being stored.  After investigating, I
determined that data strings from the large record were being stored many
(~1700) times in the DB file.  This suggested to me some type of buffering
issue.

I knew that the long record was one of the first written.  So I tried a
work-around that simply cached all the data to be written to the DB in an
in-memory hash, sorted the records by length, and then wrote them out to the
DB from shortest to longest.  This shrank the 100+ MByte file to about
2 MBytes.
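
For anyone who wants to try the same idea, here is a minimal sketch of the
work-around described above.  It is written in Python rather than the perl
the application uses, and the function name and the mapping-like "db" argument
are hypothetical stand-ins, not part of GDBM or GDBM_File:

```python
def write_shortest_first(records, db):
    """Store every record in db, shortest value first.

    records: dict mapping key -> bytes, fully built in memory beforehand.
    db: any mapping-like store (hypothetical; e.g. an open dbm handle).
    Returns the keys in the order they were written.
    """
    # Sort keys by the length of their value so the largest record goes last.
    order = sorted(records, key=lambda k: len(records[k]))
    for key in order:
        db[key] = records[key]
    return order


# Example with a plain dict standing in for the database.
pending = {
    b"big": b"x" * (300 * 1024),
    b"small": b"x" * 512,
    b"medium": b"x" * 1024,
}
store = {}
written = write_shortest_first(pending, store)
```

The point is only that every record reaches its final size in memory and is
written to the DB exactly once, smallest first.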

In another case, I killed the DB creation after 24 hours and 600+ MBytes;
with the work-around it ran in minutes and the file was about 5 MBytes.  The
results were similar for the other cases as well.

So, I believe a test case can easily be created by generating records of
widely varying length, writing them out from longest to shortest, and
comparing the resulting file size to that from writing them out in the
opposite order.
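
A rough sketch of such a test, under stated assumptions: the record counts
and sizes are taken from the case described earlier, the helper names are
made up for illustration, and "opener" is assumed to behave like Python's
dbm.gnu.open(path, "n").  I have not confirmed that plain inserts in this
order are enough to reproduce the bloat; growing records in place may also
be needed.

```python
import os


def make_records(count=2000, big_size=300 * 1024):
    """One large record plus many 512- or 1024-byte ones."""
    records = {b"rec0": b"x" * big_size}
    for i in range(1, count):
        records[b"rec%d" % i] = b"x" * (512 if i % 2 else 1024)
    return records


def write_in_order(path, records, longest_first, opener):
    """Write records to a fresh DB at path; return the file size in bytes.

    opener is an assumption: any callable used as opener(path, "n") that
    returns a mapping-like handle with a close() method.
    """
    keys = sorted(records, key=lambda k: len(records[k]),
                  reverse=longest_first)
    db = opener(path, "n")
    try:
        for key in keys:
            db[key] = records[key]
    finally:
        db.close()
    return os.path.getsize(path)
```

With a Python build that includes GDBM support, one could pass dbm.gnu.open
as the opener, call write_in_order once with longest_first=True and once with
longest_first=False on two different paths, and compare the two sizes.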

One other thing about my application that may apply to this problem is
that my records always start at 512 bytes and grow in 512-byte chunks.  So,
the large records grew to that size over many updates.  In the work-around
this growth happens in memory, not in the DB, which could also be part of
why the work-around helps.

My environment is HP-UX 10.20 and 11.11 running GDBM through perl.  I tried
both 1.7.3 and 1.8.

As I've said, this is not a problem for me now.  I'm only hoping this might
be useful to you should others be having problems with bloated files and
you are trying to debug this.

thanks,

tom

--
  ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^ ^-^
     /\            /\    Tom Asprey   Mail Stop: 32   Location: 6UG12
  /\/  \   ^  /\  /  \/\  _           3404 East Harmony Road
        \/\__/  \/ ^  \ \/ \          Fort Collins, CO 80528
        /   /    \  ^   /   \___ Tel: 970-898-4926  Fax: 970-898-3961
