guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Outreachy: Timeline tasks


From: Christopher Baines
Subject: Re: Outreachy: Timeline tasks
Date: Wed, 28 Apr 2021 19:17:51 +0100
User-agent: mu4e 1.4.15; emacs 27.1

Luciana Lima Brito <lubrito@posteo.net> writes:

> I was thinking about the timeline of tasks.
>
> The main tasks are:
>
> 1. Add instrumentation to identify the slow parts of processing
>   new revisions
>
> 2. Improve the performance of these slow parts
>
> I'm writing some ideas I have to divide the tasks in small steps, see
> what you think about it.
>
> About the first task I understand that the whole thing starts with
> identifying how the data for new revisions arrives on Guix Data
> Service: the relevant queries and their processing on the code. Based
> on it I would propose start with mapping these queries and their uses,
> so I could run them locally and get their statistics.
>
> Once I get this information I could identify which are the possible
> problematic ones and work on them. If the process is slow but the query
> is not, maybe the problem would be hidden in the code.

So, there's already some code for timing different parts of the data
loading process, if you look in the job output and search for ", took "
you should see timings printed out.

These timings being printed out does help, but having the information in
the log doesn't make it easy to figure out which part is the slowest for
example.

I'd also not consider this a "one off" thing, the data loading code will
continue to change as Guix changes and it's performance will probably
change too.

I've been wondering about visualisations, I remember systemd had a
feature to plot the systems boot as a image which made seeing which
parts are slow much easier (here's an example [1]).

1: https://lizards.opensuse.org/wp-content/uploads/2012/07/plot001.gif

> About the improvements on the performance of slow parts, it is a little
> bit abstract for me to see now how to break it in smaller tasks. I do
> believe that it would require to reformulate some parts of the queries,
> and as their result may change a bit, tweaks could be required on
> the code too. My point is, how would I propose an improvement approach
> if I don't even know what exactly is to be improved? But I imagine that
> work on this second task is more demanding than the first and will take
> most of the time of the internship.

As I said before, this part is dependent on deciding where the areas for
improvement are. Maybe have a look through one of the job logs on
data.guix.gnu.org and see if you can spot some slow parts?

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]