[bug #59904] Large aperture can easily fill memory in sort-based match
From: Mohammad Akhlaghi
Subject: [bug #59904] Large aperture can easily fill memory in sort-based match
Date: Mon, 18 Jan 2021 08:08:13 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0
URL:
<https://savannah.gnu.org/bugs/?59904>
Summary: Large aperture can easily fill memory in sort-based match
Project: GNU Astronomy Utilities
Submitted by: makhlaghi
Submitted on: Mon 18 Jan 2021 01:08:10 PM UTC
Category: Match
Severity: 3 - Normal
Item Group: Crash
Status: Confirmed
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
The sort-based match algorithm that the Match program currently uses builds,
for each point, a linked list of the nearby points (within the given aperture)
in the other catalog, and uses these lists to find the best match. This is
necessary to make sure that the order of the input catalogs doesn't affect the
final result (see the comments in the 'match_coordinates_second_in_first'
function for more).
However, this has a bad side effect: when both catalogs have many points (for
example, hundreds of thousands) and the aperture is large (usually by
mistake!), the lists created for each point in each catalog can easily fill
the whole system RAM, causing Match to crash!
For example, the two commands below download the ID, RA and Dec of the same
(randomly selected) region from Gaia DR2 and eDR3 (each file is 72 MB, with
more than 2 million rows):
astquery gaia --dataset=dr2 -csource_id,ra,dec \
         --center=281.6553922,11.4038964 --radius=2 -odr2.fits
astquery gaia --dataset=edr3 -csource_id,ra,dec \
         --center=281.6553922,11.4038964 --radius=2 -oedr3.fits
When we later try to match these two catalogs by RA and Dec with the command
below, RAM consumption exceeds 10 GB, causing a crash on many systems!
astmatch dr2.fits edr3.fits --ccol1=ra,dec --ccol2=ra,dec --aperture=1
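As a rough sense of scale for the command above, the sketch below estimates how many candidate pairs fall within the aperture. It assumes the sources are spread uniformly over the queried region, uses a flat-sky approximation and ignores edge effects, so it is only an order-of-magnitude guide. The function names and the 24-byte node size are hypothetical and are not taken from Gnuastro's source.

```c
#include <stddef.h>

/* Rough estimate of the number of (first, second) catalog pairs that lie
   within 'aperture' of each other. Assumes a uniform source density over
   a circular region of radius 'region_radius' (same units as 'aperture'),
   a flat sky and no edge effects. */
double
estimate_candidate_pairs(double n1, double n2, double region_radius,
                         double aperture)
{
  /* Fraction of the region's area covered by one aperture. */
  double frac = (aperture/region_radius) * (aperture/region_radius);
  if(frac>1.0) frac=1.0;
  return n1 * n2 * frac;
}

/* Memory needed if every candidate pair is kept as one list node.
   'node_bytes' is an assumed size for a single node. */
double
estimate_list_bytes(double pairs, size_t node_bytes)
{
  return pairs * (double)node_bytes;
}
```

With 2 million rows per catalog, a 2 degree region and a 1 degree aperture, this gives on the order of 10^12 candidate pairs, so any per-pair storage of a few tens of bytes is far beyond the RAM of a typical system.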
To fix this problem, we should avoid keeping the more distant elements in the
list and only keep the top N nearest elements (for example N=10). We just have
to use a structure like Gnuastro's "Ordered list of size_t" structure:
'gal_list_osizet_t'.
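The idea of keeping only the N nearest candidates can be sketched as below. This is a minimal, standalone illustration and not Gnuastro's implementation: it uses a small fixed-size sorted array instead of 'gal_list_osizet_t', and the names 'topn_t', 'topn_init', 'topn_add' and 'TOPN_MAX' are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOPN_MAX 10   /* Hypothetical cap: the N=10 suggested above. */

/* Bounded list of the nearest candidates for one point. */
typedef struct
{
  size_t num;               /* Number of candidates currently kept.  */
  size_t index[TOPN_MAX];   /* Row of each candidate in the other    */
                            /* catalog.                              */
  double dist[TOPN_MAX];    /* Distances, kept in ascending order.   */
} topn_t;

void
topn_init(topn_t *t)
{
  t->num=0;
}

/* Offer a candidate. Returns 1 if it was kept, 0 if it was discarded
   because the list is full and the candidate is not nearer than the
   farthest element already kept. */
int
topn_add(topn_t *t, size_t index, double dist)
{
  size_t i;
  assert(t->num<=TOPN_MAX);

  /* Full list and the new candidate is too distant: nothing to store. */
  if(t->num==TOPN_MAX && dist>=t->dist[TOPN_MAX-1]) return 0;

  /* Start from a new slot at the end while there is room; otherwise
     start from the last slot, dropping the current farthest element. */
  i = (t->num<TOPN_MAX) ? t->num++ : TOPN_MAX-1;

  /* Shift the more distant candidates one slot to the right. */
  while(i>0 && t->dist[i-1]>dist)
    {
      t->index[i]=t->index[i-1];
      t->dist[i]=t->dist[i-1];
      --i;
    }
  t->index[i]=index;
  t->dist[i]=dist;
  return 1;
}
```

With a cap like this, the memory used per point is bounded by N regardless of the aperture, at the cost of an insertion that is linear in N (negligible for a small N like 10).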
Until this problem is fixed, you can avoid it by decreasing the aperture size
to a physically meaningful value (in the case of Gaia, something like 0.5
arcsec: '--aperture=0.5/3600').
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?59904>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/