help-gift

Re: [help-GIFT] Fwd: Big database on GIFT. gift-generate-inverted-file e


From: Wolfgang Müller
Subject: Re: [help-GIFT] Fwd: Big database on GIFT. gift-generate-inverted-file error.
Date: Wed, 25 Feb 2004 09:01:23 +0100
User-agent: KMail/1.5

Forwarded replies by myself, to make discussion public:

Reply #0:

> Dear all,
>
> I'm using gift-0.1.9 and have an image database with 678929 images
> (about 20G in size).
Am I allowed to answer the question on the list?
Wolfgang

Reply #1 (short teaser for #2 ;-) ):

> __FILE__:__LINE__: lToBeSorted false, after seekg(0)
> __FILE__:__LINE__: lToBeSorted false, after seekg(-2147483648)
> __FILE__:__LINE__: lToBeSorted false, after seekg(0)
> __FILE__:__LINE__: lToBeSorted false, after seekg(-2147483648)
OK so what you need is just to know the type of the parameter of seekg. And it 
would be interesting to know the type of streampos on your system.
Cheers,
Wolfgang
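
The value -2147483648 in the log is what an offset of 2^31 bytes (2 GiB) looks like once it is squeezed into a 32-bit signed integer. The sketch below only illustrates that wrap-around; `truncate_to_32bit` is a hypothetical helper, not GIFT code, and it is an assumption that this is what happens inside gift-generate-inverted-file.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: shows how a 64-bit byte offset would appear if the
// offset type passed to seekg were only 32 bits wide.
std::int32_t truncate_to_32bit(std::int64_t offset) {
    // Keep only the low 32 bits (unsigned conversion is well-defined modulo 2^32).
    std::uint32_t low = static_cast<std::uint32_t>(offset);
    // Reinterpret those bits as a signed two's-complement value.
    std::int32_t result;
    std::memcpy(&result, &low, sizeof result);
    return result;
}
```

With this helper, `truncate_to_32bit(2147483648LL)` yields the most negative 32-bit value, which matches the argument printed after `seekg` in the quoted log.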

Reply #2:

On Tuesday 24 February 2004 17:31, Henning Müller wrote:
> I think that there might be a limit with respect to the number of
> extracted and to-be-sorted features. As the pointer is negative, this
> might be the problem.
Yes, I think this is the problem.

> Wolfgang should know better, in any case.

Thanks. I do think that this is fixed in more recent GIFTs (maybe surf over 
to http://savannah.gnu.org/cvs/?group=gift and follow the instructions for 
"anonymous CVS". CVS should come with a complete cygwin or Linux install and 
should be easy to get on Mac, too). Please do test the size of the streampos 
type on your machine. If it's ==4 then you're doomed: you need another 
operating system, or do what Henning suggests. If not, the fix is relatively 
easy and probably known.
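
A minimal sketch of such a size test, assuming a standard C++ compiler. The function names are made up for illustration. On current standard libraries `std::streampos` is a class that may carry extra state, so the width of `std::streamoff` (the offset type taken by `seekg(off, dir)`) is usually the more telling number; both are reported.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Width in bytes of the offset type used for relative seeks.
constexpr std::size_t offset_width() { return sizeof(std::streamoff); }

// True if the offset type is wider than 32 bits, i.e. it can address
// positions beyond 2 GiB.
constexpr bool supports_large_offsets() { return offset_width() > 4; }

// Human-readable report of both sizes, suitable for pasting into a mail.
std::string describe_stream_types() {
    std::ostringstream out;
    out << "sizeof(std::streampos) = " << sizeof(std::streampos) << "\n"
        << "sizeof(std::streamoff) = " << sizeof(std::streamoff) << "\n";
    return out.str();
}
```

Printing `describe_stream_types()` from a small `main` on the machine in question gives the numbers asked for above.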

> I do not think that anybody ever used such a large number of images with
> gift. 

Record so far is a couple of 100k, if I remember correctly.

> How far away are you from the maximum filesize of your Linux
> system? And how much space is on your hard disk?

Lack of HD space I regard as an extremely improbable cause. But it does not hurt asking...

> If you would like to parallelize the inverted files anyways, why don't
> you just generate the five smaller inverted files and have a sort of
> meta-search engine that starts five queries on five different machines.
>
> Then you could merge the results.
>
> This will surely lead to different results than one big inverted file

Not necessarily. If you split a 5-NN query into a couple of 5-NN queries (each 
running on one partial index) [I meant to say: an index containing all features 
for a partial collection] and then take the top 5 of the merged result, 
you're safe. Google works this way.
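
The merge step can be sketched as follows. This is an illustration, not GIFT code: `Hit` and `merge_top_k` are hypothetical names, and the sketch assumes that scores returned by different partial indexes are directly comparable (if the weighting depends on per-collection statistics, that assumption needs checking).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result record: one retrieved image and its relevance score.
struct Hit {
    std::string image;  // image identifier (illustrative)
    double score;       // higher means more similar to the query
};

// Merge the k-NN lists returned by each partial index and keep the global
// top k. Every true global top-k hit is in the top k of its own partition,
// so nothing is lost as long as each partition returned its k best hits.
std::vector<Hit> merge_top_k(const std::vector<std::vector<Hit>>& partial,
                             std::size_t k) {
    std::vector<Hit> all;
    for (const auto& hits : partial) {
        all.insert(all.end(), hits.begin(), hits.end());
    }
    const std::size_t n = std::min(k, all.size());
    const auto middle = all.begin() + static_cast<std::ptrdiff_t>(n);
    // Sort only the best n hits to the front, by descending score.
    std::partial_sort(all.begin(), middle, all.end(),
                      [](const Hit& a, const Hit& b) { return a.score > b.score; });
    all.erase(middle, all.end());
    return all;
}
```

For the case discussed here, each of the five machines would answer the 5-NN query on its own index, and `merge_top_k(results, 5)` would produce the final list.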

> but will surely be computed more efficiently.

Yup.

Can this go to the list to show that we're alive?
Cheers,
Wolfgang







