bug#50686: Show number of downloads on packages on GNU ELPA/NonGNU ELPA


From: Stefan Monnier
Subject: bug#50686: Show number of downloads on packages on GNU ELPA/NonGNU ELPA
Date: Mon, 11 Mar 2024 18:13:28 -0400
User-agent: Gnus/5.13 (Gnus v5.13)

>>>> I had the logs only for two weeks or so (plus some old logs from
>>>> many years ago, actually), indeed.
>>> I see.  Are the rest of the logs still available on the ELPA server, or is
>>> that all we have for historical data?
>> That's all we have.
> Ok.  Going forward, will the logs we have now be preserved, or do they get
> rotated away?

They get rotated away.  We do keep the weekly counts that we accumulate
in our `wsl-stats.eld` file.

>>>>> a list of downloads per version, etc.
>>>> Currently I count the "interest" in the package, so I don't distinguish
>>>> the version of the package, nor whether the access is for the tarball or
>>>> the package's web page, or the package's readme.txt, or the package's 
>>>> badge.
>>> That seems like a very different kind of data than the number of times
>>> a package has been downloaded (i.e. by an Emacs instance).  IME a small
>>> fraction of hits to a package's GitHub repo seem to result in installations;
>>> "interest" tends to be far more than "interested enough to install."
>> Just because the "interest" tends to be far more than "interested enough
>> to install" doesn't mean that the two aren't strongly correlated.
>> Also my impression is that package web pages in `elpa.gnu.org` are not
>> visited nearly as often as a GitHub project page.
>> But it'd definitely be worth checking how the two measures compare.
>> Patches welcome.
> Ok, meaning that you'd accept a patch that does...what, exactly, to the
> database?  :)

I guess keep separate counts for tarballs and other files, so we can compare?
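
As a rough illustration of keeping separate counts, here is a minimal Python sketch that splits requests into "tarball" and "other" per package.  It is not the code that runs on the server: the URL layout it assumes (`/packages/NAME-VERSION.tar`, `/packages/NAME.html`, `/packages/NAME.svg`, `/packages/NAME-readme.txt`) and the names `classify` and `count_requests` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) URL layout:
#   /packages/NAME-VERSION.tar or .tar.lz  -> tarball download
#   /packages/NAME.html, NAME.svg, NAME-readme.txt -> web page, badge, readme
TARBALL_RE = re.compile(
    r"^/packages/(?P<name>[^/]+)-(?P<version>\d[^/]*?)\.tar(?:\.lz)?$"
)
OTHER_RE = re.compile(
    r"^/packages/(?P<name>[^/]+?)(?:-readme\.txt|\.html|\.svg)$"
)


def classify(path):
    """Return (package, kind) for a request path, or None if unrecognized."""
    m = TARBALL_RE.match(path)
    if m:
        return m.group("name"), "tarball"
    m = OTHER_RE.match(path)
    if m:
        return m.group("name"), "other"
    return None


def count_requests(paths):
    """Accumulate separate tarball/other counts per package."""
    counts = defaultdict(lambda: {"tarball": 0, "other": 0})
    for path in paths:
        hit = classify(path)
        if hit is not None:
            name, kind = hit
            counts[name][kind] += 1
    return dict(counts)
```

With both counts stored side by side, comparing the two measures is a matter of ranking packages once by each count and looking at where the rankings disagree.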

>>>> I'd like to keep the stats database reasonably small (it's currently
>>>> around 150kB, and I expect it'll take a year before it reaches 1MB), so
>>>> I'd rather not segregate per version.
>>> Is there a way that I could change your mind about that?  Having the actual
>>> download counts per version would be very useful.
>> Maybe if you argue about what kind of use would make it useful?
>
> For example, if a package at version V has N downloads after 6 months, and
> then the package is updated to version V+1, how many downloads that version
> has after 6 months would give some indication of whether the package is
> growing in popularity, whether initial users are still using it and
> upgrading it, or whether it's falling out of favor.  And, over time, that
> might help determine whether an obsolete package should be removed
> from ELPA.

Ah, so as to factor out the fact that frequently updated packages will
naturally see more downloads?  I guess that would make sense.

Not completely sure how to write the code, tho: I can see how to go and
dig in the numbers to answer "is the new version less/more popular than
the old one", but not how to use that insight to adjust the percentile
ranking of the package.
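
One possible way to fold release frequency into the ranking, sketched in Python under loud assumptions: divide each package's total downloads by its number of releases in the period, then compute the percentile from that per-release score.  This is only one candidate normalization, not something the current code does; `per_release_scores`, `percentile_ranks`, and the input dictionaries are hypothetical.

```python
from bisect import bisect_left


def per_release_scores(downloads, releases):
    """Downloads per release; a package with no recorded release counts as 1.

    Both arguments are hypothetical inputs: dicts mapping package name to
    total downloads and to number of releases in the same period.
    """
    return {
        name: total / max(1, releases.get(name, 1))
        for name, total in downloads.items()
    }


def percentile_ranks(scores):
    """Map each package to the percentage of packages with a strictly lower score."""
    ordered = sorted(scores.values())
    n = len(ordered)
    return {
        name: 100.0 * bisect_left(ordered, score) / n
        for name, score in scores.items()
    }
```

For instance, a package with 600 downloads spread over 6 releases would score 100 per release and rank below a package with 300 downloads from a single release, which is the kind of adjustment discussed above.  Whether that is the right trade-off is an open question.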

>> My goal was mostly to show relative popularity, so when you search for
>> packages providing a given feature and you find 4 different options, the
>> rank percentile can give you an idea of which one is more popular.
>
> That's definitely a worthy goal.
>
> Another goal that's relevant to me, as a package author, is to determine
> whether a package of mine is still in use at all.  For example, my package
> org-ql is intended to subsume my older package, org-rifle, but I hear now
> and then about people who still use org-rifle.  Eventually I'd like to see
> that the downloads of org-rifle fall off to the point that I could declare
> it an archived, obsoleted package, but I don't want to do that prematurely.
> (Those packages are on MELPA, but the principle applies regardless.)

Right.  I guess it would be hard to do because of the mirroring-style
downloads: even the least popular package still gets downloads.

It's not super high on my todo list for now, but if you're interested in
improving this, I'll be happy to take your patches, install them and let
you play with it to see what comes up.

Currently the `wsl-stats.eld` "database" is not exposed on the web
site, partly because it contains some "irrelevant" entries
(accesses to non-existing files, some of them very much on purpose
because their names look like "<RANDOM>_nonexisting") which may contain
information I'd rather not expose.  We should try and sanitize it first
to only keep things which do correspond to existing packages/files
(which will also improve the quality of the rankings).
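
A minimal Python sketch of such a sanitizing pass, to make the idea concrete.  It assumes the stats can be viewed as a mapping from requested file name to hit count, and that a file belongs to a package when its name is the package name followed by one of a few known suffixes; the real layout of the database, the suffix list, and the names `owning_package` and `sanitize` are all assumptions for illustration.

```python
import re

# Hypothetical suffixes that may follow a package name in a requested file.
SUFFIX_RE = re.compile(
    r"^(?:-\d[^/]*\.tar(?:\.lz)?|\.html|\.svg|-readme\.txt)$"
)


def owning_package(filename, known_packages):
    """Return the known package that FILENAME belongs to, or None.

    Longer names are tried first so that "org-ql-0.8.tar" is attributed
    to "org-ql" rather than to "org".
    """
    for name in sorted(known_packages, key=len, reverse=True):
        if filename.startswith(name) and SUFFIX_RE.match(filename[len(name):]):
            return name
    return None


def sanitize(stats, known_packages):
    """Keep only the entries that correspond to a known package's files."""
    return {
        filename: count
        for filename, count in stats.items()
        if owning_package(filename, known_packages) is not None
    }
```

Entries such as "<RANDOM>_nonexisting" match no known package and are dropped, so they neither leak into a published file nor skew the rankings.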


        Stefan





