bug#39885: Bioconductor tarballs are not archived
From: Simon Tournier
Subject: bug#39885: Bioconductor tarballs are not archived
Date: Mon, 19 Feb 2024 17:50:59 +0100
Hi,
On Fri, 16 Feb 2024 at 10:14, Timothy Sample <samplet@ngyro.com> wrote:
>> Can we consider that this report is now done? Because:
>>
>> 1. SWH supports ExtID and nar hash lookup.
>>
>> 2. Missing origins are currently ingested by SWH.
>> (via specific sources.json)
>
> I think that would be jumping the gun a little bit.
>
> In some sense, the report is only *done* when “stored” hits 100% (or
> close to it, with the remainder being stuff we are pretty sure no longer
> exists). This won’t happen just because of your second point there.
Just to be sure: we are speaking about Bioconductor only, right?
> When the historical “sources.json” is loaded, things will be much, much,
> better, sure. Sources will still be missing, though.
Yeah, sources will still be missing, but I expect that the Bioconductor
ones will not be. The only issue is about “annotation” and maybe
“experiment”. However, here we are hitting the boundary between code and
data: annotation and experiment packages might be very large and
potentially skipped by SWH, and they contain little if any code, mostly
plain data.
We can still discuss what to do here, in this already long thread. :-)
Or we can open another thread for this specific case about Bioconductor
annotation and experiment.
> To me, this is an
> invitation to more subtle analysis, like weighing sources by their
> “importance” in the package graph. Then there’s still shortcomings with
> Disarchive that have to be resolved (which is work best guided by
> numbers in the report).
Yeah. But that seems a larger scope than the Bioconductor case, no?
> Also, it will always be a good idea to verify that things are working.
> Ideally this could be simpler (leveraging ExtID lookup) and continuous.
Indeed, checking that all Bioconductor sources can be extracted from
SWH+Disarchive seems the path forward to closing this report. :-)
Cheers,
simon