Re: [help-GIFT] patch-fu


From: risc
Subject: Re: [help-GIFT] patch-fu
Date: Thu, 17 Aug 2006 08:25:55 -0500
User-agent: Mutt/1.4.1i

On Thu, Aug 17, 2006 at 07:55:33AM +0200, Jonas Lindqvist wrote:
> Hi!
> 
> I feel an urge to ask some questions... (Perhaps silly, but anyway...  
> I know I could probably find the answers by digging in the code a bit  
> deeper, but I admit I'm lazy...)
> 
> * The function gabor_filter, in gabor.c, now uses a fixed array of  
> 65536 doubles, instead of callocing the size indicated by the width  
> and height that are passed as parameters to gabor_filter(...). Very  
> well...
> Are the width and height always 256, or can they be 128*512 or  
> 2*32768 or whatever?
> 

The Perl script that imports our images resizes everything to 256x256,
so width*height is always 65536.
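
Since the fixed 65536-element array is only safe while that size
guarantee holds, a small guard is one way to make the assumption
explicit. This is only a sketch: MAX_PIXELS and check_image_fits are
hypothetical names for illustration and are not taken from gabor.c.

```c
#include <stddef.h>

/* Hypothetical constant: capacity of the fixed buffer (256 * 256). */
#define MAX_PIXELS 65536

/* Hypothetical helper: returns 0 if a width x height image fits in the
   fixed buffer, -1 otherwise. */
int check_image_fits(int width, int height)
{
    if (width <= 0 || height <= 0)
        return -1;
    /* Multiply in size_t so a large width * height cannot overflow int. */
    if ((size_t)width * (size_t)height > MAX_PIXELS)
        return -1;
    return 0;
}
```

With a check like this, shapes such as 128x512 or 2x32768 would still
pass, because only the pixel count matters to the buffer.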

> * I guess that most modern CPUs have some kind of SSE2-ish features  
> that gcc could use, but what would the effect of the patch be for an  
> architecture that lacks it? (Something seriously old, pre MMX, or  
> something else that perhaps one would not use for this application  
> anyway...)

GCC's autovectorization support is also written to take advantage of
AltiVec on the newer PowerPC Mac hardware. And splitting the loop up like
this still makes things run faster on earlier processors, since the inner
loop no longer evaluates a conditional on every iteration.

> 
> * Wouldn't memset be faster than looping and setting to zero?:
>       for (i = 0; i < width*height; i++)
>         {
>           conv[i] = 0;   /* needs to be zeroed */
>         }
> and isn't width*height always 65536?

Possibly. I'll look into using memset. ;)
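
For reference, a memset version of that loop might look like the sketch
below. CONV_LEN and clear_conv are hypothetical names, and the sketch
assumes a platform where an all-zero-bits double is 0.0 (true for
IEEE 754, though not strictly guaranteed by the C standard).

```c
#include <string.h>

/* Hypothetical constant: 256 * 256, per the fixed image size above. */
#define CONV_LEN 65536

/* Hypothetical helper: zero the whole conv buffer in one call.
   Assumes all-zero bits represent 0.0, as on IEEE 754 platforms. */
void clear_conv(double *conv)
{
    memset(conv, 0, CONV_LEN * sizeof conv[0]);
}
```

Whether this is measurably faster than the loop would need timing, since
a compiler may already turn the simple zeroing loop into the same thing.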

Julia Longtin <address@hidden>

> 
> /Jonas
> 
> 
> 
> 
> 
> >Yet another patch..
> >
> >This one is a case of "add code to gain speed".
> >
> >By adding special-case handling for "kernel being tested
> >is within the borders of the image", you gain speed by not
> >performing a conditional in the loop, which allows
> >gcc 4.1's autovec engine to change your loop to a series
> >of SSE2 instructions.
> >
> >Get all that? :)
> >
> >Anyways, this brings the speed down to ~1.3 on my machine.
> >
> >The 70-* and 80-* patches have equivalents for convolutions
> >two and four; I'm just not getting them quite right yet...
> >
> >I'm also attaching my testing program, so that others can
> >check my work.
> >
> >Julia Longtin <address@hidden>
> ><80-ChangeLog>
> ><80-FeatureExtraction_gabor.c_move_conditional_and_parallelize.patch>
> >_______________________________________________
> >help-GIFT mailing list
> >address@hidden
> >http://lists.gnu.org/mailman/listinfo/help-gift
> 





