guix-devel

Re: Improving ‘guix search’ scoring


From: zimoun
Subject: Re: Improving ‘guix search’ scoring
Date: Thu, 18 Jul 2019 13:11:32 +0200

Hi,

On Wed, 17 Jul 2019 at 23:27, Ludovic Courtès <address@hidden> wrote:

> I guess computing the TF-IDF could perhaps improve the results compared
> to the current scoring mechanism.  It would be worth trying to implement
> it.
>
> The bottom line though, as you wrote, is that this all depends on the
> quality of synopses and descriptions, and there’s only so much we can
> draw from 5-line descriptions.

In my opinion, because a description is, say, 5 lines plus the
synopsis, one first needs to analyse the "quality" of the available
information (words + dependencies) before implementing anything. I
mean doing some "data science" (buzz buzz! :-)) with R or Python.
I do not know the state of the art of recommender systems, either in
general or as applied to package retrieval. I have never read
anything about this for other distributions (Debian, Gentoo, etc.).
Does anyone know of something? Any pointers?

For example, the current scoring looks like a poor man's version of the
Boolean model of Information Retrieval [1]. What about the Okapi BM25
model [2]? etc.
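
To make the comparison concrete, here is a rough sketch of BM25 ranking
in Python. It is only an illustration under assumptions: the toy
PACKAGES corpus, the tokenize helper and the k1/b defaults are made up
for the example, and nothing here reflects how ‘guix search’ is
actually implemented.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: package name -> synopsis plus description.
PACKAGES = {
    "hello": "Hello, GNU world: an example GNU package",
    "grep": "Print lines matching a pattern in text files",
    "ripgrep": "Line-oriented search tool that searches text files for a pattern",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return a dict mapping each document name to its BM25 score.

    k1 controls term-frequency saturation and b controls length
    normalisation; the defaults are common textbook values, not tuned.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(toks) for toks in tokens.values()) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores


if __name__ == "__main__":
    ranking = bm25_scores("text pattern", PACKAGES)
    for name, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

Unlike a purely Boolean match, rare query terms weigh more than common
ones, and a long description does not win just by being long.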

Well, if a student is reading this thread and is looking for a project. ;-)

And I will try to take a look after my summer holidays.
Please share your opinion or experience.


All the best,
simon


[1] https://en.wikipedia.org/wiki/Boolean_model_of_information_retrieval
[2] https://en.wikipedia.org/wiki/Okapi_BM25


