Re: [help-GIFT] patch-fu

From: risc
Subject: Re: [help-GIFT] patch-fu
Date: Thu, 17 Aug 2006 09:42:18 -0500
User-agent: Mutt/1.4.1i

Agreed. How about, in my next set of patches, I add MAX_WIDTH and
MAX_HEIGHT defines that we use for spots that "need" a hard-coded value
at compile time, and continue to use width and height for things that
don't require hard-coding?
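
To make the proposal concrete, here is a rough sketch of what I have in mind. It is not the actual patch: the 256 values, the scratch buffer and the get_scratch() helper are all hypothetical names and numbers, chosen only to show a compile-time-sized buffer being cleared and reused in place of a calloc() on every call, while callers keep passing the real width and height.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed limits: 256 is the largest size the current extractor handles. */
#define MAX_WIDTH  256
#define MAX_HEIGHT 256

/* One buffer sized at compile time, reused across calls. */
static int scratch[MAX_WIDTH * MAX_HEIGHT];

/* Hypothetical helper: returns a zeroed work area of width * height ints,
   or NULL if the image does not fit within the compile-time limits.
   The memset() stands in for the zeroing that calloc() used to do. */
int *get_scratch(int width, int height)
{
    if (width <= 0 || height <= 0 || width > MAX_WIDTH || height > MAX_HEIGHT)
        return NULL;

    size_t n = (size_t)width * (size_t)height;
    assert(n <= sizeof scratch / sizeof scratch[0]);

    /* Clear only the part this image will actually use. */
    memset(scratch, 0, n * sizeof scratch[0]);
    return scratch;
}
```

Code that only walks the image would keep using the runtime width and height; only the buffer declaration depends on the defines.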

As for the "research software" nature of GIFT, I have a HARD application
that requires indexing ~160K images every day, and performing ~300 queries.

I need to do this in UNDER an hour.

I've already touched up the 'gift-add-features' Perl script to allow for
multiple machines doing feature recognition at once, but it's still a
very unwieldy (unrealistic) task.
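
For a sense of scale, a back-of-the-envelope calculation, assuming the whole hour goes to feature extraction, that the work splits perfectly across machines, and ignoring the ~300 queries (per_image_budget_ms() is just an illustrative helper, not part of GIFT):

```c
/* Time budget per image, in milliseconds, when `images` images must be
   processed within `seconds` seconds on `machines` machines in parallel.
   Assumes perfect scaling; query time is not accounted for. */
double per_image_budget_ms(double images, double seconds, int machines)
{
    return seconds * 1000.0 * machines / images;
}
```

With 160000 images and 3600 seconds, that is about 22.5 ms per image on one machine, or roughly 44 images per second. Four machines would stretch the budget to about 90 ms per image.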

Expect to correct me further in the future, as I'm trying to take the
'research' out of gift. ;)

Julia Longtin <address@hidden>

On Thu, Aug 17, 2006 at 02:45:03PM +0100, David Squire wrote:
> address@hidden wrote:
> >Jonas:
> >Yes, not doing a calloc() every time through improves performance by 
> >around 5%.
> >
> >David,
> >
> >It is my intention to expand the image size handling of the feature 
> >extractor as soon as I'm done with performance hacks. However, the current 
> >code won't properly process an image larger than 256x256. Therefore, I 
> >think allocating the array automatically, to MAX_WIDTH and MAX_HEIGHT, 
> >is appropriate.
> >  
> OK.... but I am concerned about a direction of modification that seeks a 
> few percentage points of speed improvement here and there at the expense 
> of good design (i.e. low coupling and extensibility) - not that I am 
> suggesting that the feature extraction code as it was is an example of 
> good design. We should be aiming always for good design.
> The above approach tightly couples parts of the code to others via an 
> essentially arbitrary choice of image size, that *happens* to be 
> constant in the current version, but need not be. I would be *much* 
> happier seeing things dynamic and parameterized, even if that makes 
> things a little slower.
> Remember the "rules" of optimization:
> 1. Don't optimize by hand.
> 2. If you think you need to optimize by hand, think again.
> 3. If you still really think you need to optimize by hand, optimize late.
> This is particularly true of research software (such as the GIFT 
> essentially is), where the requirements are moving targets. For example, 
> I have my own versions of the feature extraction code that also look for 
> and index text files associated with the images. The feature extraction 
> code is intended to be as separate as possible from the indexing and 
> query engine, so that others can write and use their own feature 
> extraction code, using whatever features they like.
> Also, the typical IR scenario is that you extract features and index 
> once, and query many times. Consequently, optimizing query performance 
> is much more important than optimizing feature extraction. Users are 
> often quite happy for indexing to take hours or days.
> I am not saying that optimization of the feature code is not 
> appreciated, just that the trade-offs need to be kept closely in mind. 
> IMHO, extending to arbitrary image sizes and shapes first would make 
> more sense.
> Regards,
> David
> -- 
> Dr David McG. Squire, Senior Lecturer, on sabbatical in 2006
> Caulfield School of Information Technology, Monash University, Australia
> CRICOS Provider No. 00008C
> _______________________________________________
> help-GIFT mailing list
> address@hidden
