From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master bb1580f 3/3: sort-by-night: dramatic improvement in speed with new features
Date: Fri, 19 Feb 2021 23:07:07 -0500 (EST)

branch: master
commit bb1580fd82dd09560a255b60ac7c6ad9c9905631
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    sort-by-night: dramatic improvement in speed with new features
    
    Until now, astscript-sort-by-night primarily used the shell's 'awk' or
    'sort' programs for almost all of its operations. As a result, it was not
    very efficient.
    
    With this commit, the new features that have been recently added to the
    Fits and Table programs are now used for the computational components of
    the job: with the new '--keyvalue' option of the Fits program, the keyword
    values are read from all the inputs in one command and that is piped to
    Table which does all the calculations using the new 'date-to-sec' and
    'set-' operators. Finally, Table is used again to sort all the inputs
    once, at the start, and to select the files of each night.
    
    This resulted in a major improvement in the running speed of this script
    (as mentioned in the NEWS file, from 19 seconds to 0.42 seconds for
    about 650 FITS files used in the test!).
---
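Note on the night arithmetic: the rule that the Table pipeline in this
patch implements (and that the AWK comment in sort-by-night.in spells out)
can be tried on its own. The following is only a minimal sketch, not the
script's code: the 'night_of' function name and the example timestamps are
made up for illustration, and plain AWK is used instead of Table.

```shell
#!/bin/sh
# Illustrative sketch only (hypothetical helper, not part of Gnuastro).
# night_of SECONDS HOUR
#   SECONDS: Unix epoch time of the exposure.
#   HOUR:    hour (UTC) at which a "night" ends, same meaning as --hour.
# Prints the night number: days since 1970-01-01, minus one when the
# time of day is earlier than HOUR.
night_of() {
    awk -v s="$1" -v h="$2" 'BEGIN {
        d = int(s / 86400)                     # whole days since the epoch
        if (int(s) % 86400 < h * 3600) n = d - 1; else n = d
        print n
    }'
}

# 2021-02-19 10:30:00 UTC is 1613730600 seconds since the epoch.
night_of 1613730600 11    # prints 18676 (before 11:00: previous night)
night_of 1613734200 11    # prints 18677 (11:30: a new night has started)
night_of 1613730600 9     # prints 18677 (old default: 10:30 is a new night)
```

The last two calls show the effect of the new default: an exposure taken at
10:30 used to start a new night with --hour=9, but with --hour=11 it is
grouped with the previous night.
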
 NEWS                          |  9 +++++
 bin/script/sort-by-night.in   | 80 ++++++++++++++++++++++++++-----------------
 doc/gnuastro.texi             |  8 ++---
 tests/script/list-by-night.sh |  6 ++++
 4 files changed, 67 insertions(+), 36 deletions(-)

diff --git a/NEWS b/NEWS
index 4e387b3..266231f 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,15 @@ See the end of the file for license conditions.
 
 ** Changed features
 
+  astscript-sort-by-night:
+   - Thanks to the new features in the Fits and Table programs (described
+     above), the efficiency of this script has improved dramatically (from
+     19 seconds to 0.42 seconds for about 650 FITS files used in the
+     test!).
+   - The default end to a "night" is set to 11:00a.m. Until now it was
+     9:00a.m. But in some cases, calibration images may be taken after
+     that. So to be safer in general it was incremented by 2 hours.
+
   Library:
    - gal_fits_key_write_wcsstr: also takes WCS structure as argument.
    - gal_fits_key_read_from_ptr: providing a numerical datatype for the
diff --git a/bin/script/sort-by-night.in b/bin/script/sort-by-night.in
index 70e43b7..5e8cade 100644
--- a/bin/script/sort-by-night.in
+++ b/bin/script/sort-by-night.in
@@ -32,9 +32,9 @@ set -e
 # Default option values (can be changed with options on the
 # command-line).
 hdu=1
-hour=9
 copy=0
 link=0
+hour=11
 quiet=0
 key=DATE
 prefix=./
@@ -242,12 +242,30 @@ fi
 #
 # To do this, we'll convert the date into Unix epoch time (seconds
 # since 1970-01-01,00:00:00) and keep that with the filename.
-list=$(for f in $inputs; do
-           astfits $f --datetosec=$key --hdu=$hdu -q \
-               | awk '{h='$hour'; d=int($1/86400); \
-                       if(int($1)%86400<(h*3600)) n=d-1; else n=d; \
-                       print "'$f'", $1, n }'
-       done)
+#
+# A simple AWK expression for what we are doing here inside of Table
+# (where the first input column is the filename and the second is the
+# date in seconds):
+#
+# awk '{h='$hour'; d=int($2/86400); \
+#       if(int($2)%86400<(h*3600)) n=d-1; else n=d; \
+#       print $1, $2, n }'
+#
+# Finally, we are sorting all the images with the unix-second column
+# to make sure that they are ordered in observation order later.
+list=$(astfits --keyvalue=$key --hdu=$hdu $inputs --colinfoinstdout \
+               | asttable -cFILENAME \
+                          -c'arith '$key' date-to-sec' \
+                          -c'arith '$key' date-to-sec set-sec \
+                                   sec 86400 / int32 set-day \
+                                   day \
+                                     sec int32 86400 % '$hour' 3600  x lt \
+                                     day 1 - \
+                                   where' \
+                          --colmetadata=ARITH_8,NIGHT,counter,"Observing night." \
+                          --colinfoinstdout \
+               | asttable --sort=UNIXSEC --colinfoinstdout)
+
 
 
 
@@ -263,44 +281,42 @@ list=$(for f in $inputs; do
 
 
 
-# Get the uniqe nights from the previous step.
-unique=$(echo "$list" | awk '{print $3}' | sort | uniq | cat -n)
-
+# Get the uniqe nights from the previous step and give each night a
+# counter starting from 1. The output of this step will be two
+# columns: a counter, and the night number.
+#
+# We are using 'asttable' here to avoid issues with spaces in
+# directory names in the first line.
+unique=$(echo "$list" | asttable -cNIGHT | sort | uniq | cat -n)
+#echo "check: $unique"; exit 1
 
 
 
 
 # Find the FITS files of every unique day and sort them by observing
 # time within that day. We'll also initialize the night-counter to 1.
-counter=1
 echo "$unique" | while read l; do
 
     # Find all input files (and their Unix epoch time).
-    daynum_to=$(echo $l | awk '{print $1}')
-    daynum_from=$(echo $l | awk '{print $2}')
-    in_this_day=$(echo "$list" \
-                       | awk '$3=='$daynum_from' {print $1, $2}' \
-                       | sort -nk2 \
-                       | cat -n \
-                       | awk '{print $2,'$counter',$1}')
-
-    # Now that we know this night's files, we can take the proper action.
-    echo "$in_this_day" | while read L; do
-
-        # Set the necessary numbers.
-        infile=$(echo $L | awk '{print $1}')
-        night_num=$(echo $L | awk '{print $2}')
-        exposure_num=$(echo $L | awk '{print $3}')
+    night_to=$(echo $l | awk '{print $1}')
+    night_from=$(echo $l | awk '{print $2}')
+    in_this_night=$(echo "$list" \
+                         | asttable -cFILENAME --equal=NIGHT,$night_from)
+
+    # Now that we know this night's files, sorted by time, we can take
+    # the proper action (simply list, or copy or make links).
+    exposure_num=1
+    echo "$in_this_night" | while read infile; do
 
         # Make the outputs
-        outfile=$prefix"n"$night_num-$exposure_num.fits
-        if   [ $copy = 1 ]; then   cp $infile $outfile
+        outfile=$prefix"n"$night_to-$exposure_num.fits
+        if   [ $copy = 1 ]; then   cp     $infile $outfile
         elif [ $link = 1 ]; then   ln -fs $infile $outfile
-        else                       echo "$infile $night_num $exposure_num"
+        else                       echo "$infile $night_to $exposure_num"
         fi
 
-    done
+        # Increment the exposure number
+        exposure_num=$((exposure_num+1))
 
-    # Increment the night-counter.
-    counter=$((counter+1))
+    done
 done
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index 02bf075..82f5866 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -9434,7 +9434,7 @@ $ astscript-sort-by-night --link --prefix=img- /path/to/data/*.fits
 @end example
 
 This script will look into a HDU/extension (@option{--hdu}) for a keyword (@option{--key}) in the given FITS files and interpret the value as a date.
-The inputs will be separated by "night"s (9:00a.m to next day's 8:59:59a.m, spanning two calendar days, exact hour can be set with @option{--hour}).
+The inputs will be separated by "night"s (11:00a.m to next day's 10:59:59a.m, spanning two calendar days, exact hour can be set with @option{--hour}).
 
 The default output is a list of all the input files along with the following two columns: night number and file number in that night (sorted by time).
 With @option{--link} a symbolic link will be made (one for each input) that contains the night number, and number of file in that night (sorted by time), see the description of @option{--link} for more.
@@ -9467,7 +9467,7 @@ The keyword name that contains the FITS date format to classify/sort by.
 @item -H FLT
 @itemx --hour=FLT
 The hour that defines the next ``night''.
-By default, all times before 9:00a.m are considered to belong to the previous calendar night.
+By default, all times before 11:00a.m are considered to belong to the previous calendar night.
 If a sub-hour value is necessary, it should be given in units of hours, for example @option{--hour=9.5} corresponds to 9:30a.m.
 
 @cartouche
@@ -9482,8 +9482,8 @@ It is possible to take this into account by setting the @option{--hour} option t
 
 For example, consider a set of images taken in Auckland (New Zealand, UTC+12) during different nights.
 If you want to classify these images by night, you have to know at which time (in UTC time) the Sun rises (or any other separator/definition of a different night).
-In this particular example, you can use @option{--hour=21}.
-Because in Auckland, a night finishes (roughly) at the local time of 9:00, which corresponds to 21:00 UTC.
+For example if your observing night finishes before 9:00a.m in Auckland, you can use @option{--hour=21}.
+Because in Auckland the local time of 9:00 corresponds to 21:00 UTC.
 @end cartouche
 
 @item -l
diff --git a/tests/script/list-by-night.sh b/tests/script/list-by-night.sh
index e32ddf8..4871eeb 100755
--- a/tests/script/list-by-night.sh
+++ b/tests/script/list-by-night.sh
@@ -30,7 +30,9 @@
 # tested on a larger image.
 prog=sort-by-night
 dep1=fits
+dep2=table
 dep1name=../bin/$dep1/ast$dep1
+dep2name=../bin/$dep2/ast$dep2
 execname=../bin/script/astscript-$prog
 
 
@@ -47,6 +49,7 @@ execname=../bin/script/astscript-$prog
 #   - The programs it use weren't made.
 if [ ! -f $execname ]; then echo "$execname doesn't exist."; exit 77; fi
 if [ ! -f $dep1name ]; then echo "$dep1name doesn't exist."; exit 77; fi
+if [ ! -f $dep2name ]; then echo "$dep2name doesn't exist."; exit 77; fi
 
 
 
@@ -55,6 +58,9 @@ if [ ! -f $dep1name ]; then echo "$dep1name doesn't exist."; exit 77; fi
 # Put a link of Gnuastro program(s) used into current directory. Note that
 # other script tests may have already brought it.
 ln -sf $dep1name ast$dep1
+ln -sf $dep2name ast$dep2
+
+
 
 
 


