Re: [GNUnet-developers] Approximate Searches
From: Christian Grothoff
Subject: Re: [GNUnet-developers] Approximate Searches
Date: Wed, 24 Jun 2009 12:15:41 -0600
User-agent: KMail/1.11.4 (Linux/2.6.29-2-686; KDE/4.2.4; i686; ; )

I like this idea (at least as an option, which should probably be the
default) and have added it to the list of things to change for 0.9.x. What I
wonder is whether sorting the consonants should be omitted or not. Some
statistics on bad collisions with and without sorting would be nice to have...
Christian
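
The collision question raised above could be explored with a small script.
What follows is only a sketch under assumptions, not anything taken from the
GNUnet code base: the names consonant_key and collision_stats are
hypothetical, vowels are taken to be A, E, I, O and U (so Y counts as a
consonant), and the word list is a toy sample rather than real data. It counts
how many distinct words end up under the same key with and without sorting.

```python
from collections import defaultdict

VOWELS = set("AEIOU")  # assumption: Y is treated as a consonant


def consonant_key(word, sort_consonants):
    """Uppercase the word, drop vowels, optionally sort what is left."""
    letters = [c for c in word.upper() if c.isalpha() and c not in VOWELS]
    if sort_consonants:
        letters.sort()
    return "".join(letters)


def collision_stats(words, sort_consonants):
    """Group words by key and report how badly they collide."""
    buckets = defaultdict(set)
    for word in words:
        buckets[consonant_key(word, sort_consonants)].add(word.lower())
    sizes = [len(bucket) for bucket in buckets.values()]
    return {
        "keys": len(buckets),
        "colliding_keys": sum(1 for size in sizes if size > 1),
        "largest_bucket": max(sizes, default=0),
    }


# Toy sample; a real measurement would use a large dictionary or file list.
sample = ["alice", "lace", "cool", "local", "shoe", "hash"]
unsorted_stats = collision_stats(sample, sort_consonants=False)
sorted_stats = collision_stats(sample, sort_consonants=True)
```

On this toy sample, sorting merges "cool" into the same bucket as "alice" and
"lace", which is the kind of extra collision the statistics would need to
quantify on realistic input.
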
On Tuesday 23 June 2009 07:27:17 leo stone wrote:
> I believe the biggest factor in how we judge a system for future usability
> is how many results we get when we are looking for "something" like
> "something".
> Imagine a shoe shop with only two pairs of shoes in it, and one with a few
> hundred.
>
> The result in the end might be the same: you leave both shops without
> finding what you want. But most people will consider the shop with a
> hundred pairs more promising and worth spending time in the next time
> they try to find some shoes.
>
> So making sure people get results for their searches is probably one of
> the more important issues, after my doubts about how the routing is
> handled.
>
> Even though it might mean some significant overhead, I would consider doing
> something like normalizing keywords, per language if it must be, but in
> the beginning English should be enough.
>
> So if I wanted to share the following file, and I wanted it to be public so
> people can find it, why not store it like this:
>
> "Woh_the.fuck_is ALICe(2008).divx.avi.WMV" => { HW , HT , CFK , S , CL ,
> 2008 , DVX , V , MVW }
>
> Put the file under the hashes of those nine "keywords".
>
> When I now search for "fuck alice" => { CFK , CL }
>
> search h(CFK) AND h(CL) will return a lot of wrong but similar results,
> but then one can filter them locally in a more elaborate way.
>
> It might even be more selective than search h(video/x-msvideo)
>
> At least it returns results, whereas hardly anyone would think to search
> for "Woh_the.fuck_is ALICe(2008).divx.avi.WMV" as a keyword, so the file
> would never be found and never be spread ....., except by chance of course.
>
> regards leo
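
The scheme in the quoted message can be sketched in a few lines. This is an
illustration under assumptions, not GNUnet's implementation: the function
names (normalize, keywords, publish, candidates, search) are hypothetical, a
plain dictionary stands in for the network, SHA-256 stands in for h(), Y is
treated as a consonant, and "lace_tutorial.avi" is an invented second file.
The rules are inferred from the example in the message: split on
non-alphanumerics, uppercase, drop vowels, sort the remaining consonants, and
keep all-digit tokens unchanged.

```python
import hashlib
import re

VOWELS = set("AEIOU")  # assumption: Y is treated as a consonant
TOKEN = re.compile(r"[A-Za-z]+|[0-9]+")


def normalize(token, sort_consonants=True):
    """Uppercase, drop vowels, optionally sort; digits pass through."""
    if token.isdigit():
        return token
    letters = [c for c in token.upper() if c not in VOWELS]
    if sort_consonants:
        letters.sort()
    return "".join(letters)


def keywords(text):
    """Set of normalized keywords for a file name or a query."""
    return {key for key in (normalize(t) for t in TOKEN.findall(text)) if key}


def h(keyword):
    """Stand-in hash; the real system would use its own hash function."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def publish(index, filename):
    """Store the file under the hash of each of its keywords."""
    for key in keywords(filename):
        index.setdefault(h(key), set()).add(filename)


def candidates(index, query):
    """h(k1) AND h(k2) AND ...: files listed under every query keyword."""
    keys = keywords(query)
    if not keys:
        return set()
    return set.intersection(*(index.get(h(k), set()) for k in keys))


def search(index, query):
    """Local filter: keep candidates that contain each query word literally."""
    words = [w.lower() for w in TOKEN.findall(query)]
    return sorted(f for f in candidates(index, query)
                  if all(w in f.lower() for w in words))


index = {}
source_file = "Woh_the.fuck_is ALICe(2008).divx.avi.WMV"
publish(index, source_file)
publish(index, "lace_tutorial.avi")  # invented file that also maps to CL
```

Here candidates(index, "alice") returns both files, since "alice" and "lace"
share the key CL, while search(index, "alice") keeps only the first one. The
substring test is just the simplest possible local filter; a more elaborate
one, as the message suggests, could rank candidates by edit distance instead.
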