bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

gnulib-tool.py: Optimize module set lookups


From: Bruno Haible
Subject: gnulib-tool.py: Optimize module set lookups
Date: Thu, 11 Apr 2024 21:51:05 +0200

The func_transitive_closure function in the shell implementation can take
a lot of time. So I wondered whether in the Python implementation, there
is room for speedup at this place as well. And indeed, there is.

Profiling the test-create-testdir-4.sh unit test, I saw this
profiler output:

+         43843917 function calls (43842963 primitive calls) in 201.018 seconds
+
+   Ordered by: internal time
+
+   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
+       11  181.248   16.477  181.248   16.477 {built-in method posix.waitpid}
+ 25009065    3.666    0.000    3.666    0.000 GLModuleSystem.py:225(__eq__)
+    97005    2.783    0.000    6.664    0.000 GLModuleSystem.py:184(__init__)
+     7891    1.504    0.000    4.472    0.001 GLModuleSystem.py:915(<listcomp>)
+   107597    1.056    0.000    1.056    0.000 {built-in method io.open}
+   327094    1.043    0.000    1.509    0.000 posixpath.py:338(normpath)
+   316696    0.857    0.000    1.108    0.000 posixpath.py:71(join)
+   194049    0.610    0.000    0.610    0.000 {method 'read' of 
'_io.BufferedReader' objects}
+   246396    0.567    0.000    0.567    0.000 {built-in method posix.stat}
+    97013    0.454    0.000    0.587    0.000 codecs.py:681(__init__)
+   316505    0.428    0.000    3.048    0.000 constants.py:269(joinpath)
+    97013    0.416    0.000    2.005    0.000 codecs.py:871(open)
+     1583    0.354    0.000   16.804    0.011 
GLModuleSystem.py:814(transitive_closure)
+        1    0.299    0.299    0.903    0.903 GLModuleSystem.py:954(<listcomp>)
+    97007    0.284    0.000    0.986    0.000 codecs.py:451(read)
+    97005    0.240    0.000   10.376    0.000 GLModuleSystem.py:99(find)
+    44061    0.228    0.000   10.806    0.000 
GLModuleSystem.py:537(getDependenciesWithConditions)
+  2786508    0.212    0.000    0.212    0.000 {method 'append' of 'list' 
objects}
+   102296    0.193    0.000    1.353    0.000 GLFileSystem.py:77(lookup)
+   489239    0.190    0.000    0.271    0.000 GLModuleSystem.py:257(__hash__)
+        1    0.178    0.178  200.996  200.996 main.py:120(main)

So, I changed a list to a set in two places. As expected, this nearly
annihilates the 25 million GLModule.__eq__ invocations:

+         19010970 function calls (19010016 primitive calls) in 195.304 seconds
+
+   Ordered by: internal time
+
+   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
+       11  181.039   16.458  181.039   16.458 {built-in method posix.waitpid}
+    97005    2.770    0.000    6.676    0.000 GLModuleSystem.py:184(__init__)
+   107597    1.064    0.000    1.064    0.000 {built-in method io.open}
+   327094    1.016    0.000    1.486    0.000 posixpath.py:338(normpath)
+   316696    0.857    0.000    1.111    0.000 posixpath.py:71(join)
+   194049    0.585    0.000    0.585    0.000 {method 'read' of 
'_io.BufferedReader' objects}
+   246396    0.556    0.000    0.556    0.000 {built-in method posix.stat}
+    97013    0.454    0.000    0.597    0.000 codecs.py:681(__init__)
+   316505    0.424    0.000    3.026    0.000 constants.py:269(joinpath)
+    97013    0.414    0.000    2.023    0.000 codecs.py:871(open)
+     1583    0.331    0.000   12.204    0.008 
GLModuleSystem.py:814(transitive_closure)
+    97007    0.284    0.000    0.962    0.000 codecs.py:451(read)
+    97005    0.243    0.000   10.375    0.000 GLModuleSystem.py:99(find)
+    44061    0.224    0.000   10.795    0.000 
GLModuleSystem.py:537(getDependenciesWithConditions)
+  2786514    0.213    0.000    0.213    0.000 {method 'append' of 'list' 
objects}
+   504489    0.197    0.000    0.277    0.000 GLModuleSystem.py:257(__hash__)
+   102296    0.190    0.000    1.343    0.000 GLFileSystem.py:77(lookup)
+        1    0.187    0.187  195.283  195.283 main.py:120(main)

The speed improvement is also measurable without a profiler:

Before:
$ GNULIB_TOOL_IMPL=py time ./test-create-testdir-4.sh
real 192.48s
user 131.35s
sys  51.53s

After:
$ GNULIB_TOOL_IMPL=py time ./test-create-testdir-4.sh
real 189.47s
user 128.31s
sys  51.64s


2024-04-11  Bruno Haible  <bruno@clisp.org>

        gnulib-tool.py: Optimize module set lookups.
        * gnulib-tool.py (profiler_args): New variable.
        * pygnulib/GLModuleSystem.py (GLModuleTable.transitive_closure): Turn
        handledmodules into a set.
        (GLModuleTable.transitive_closure_separately): For the 'in' test, use
        a set variable main_modules_set.

diff --git a/gnulib-tool.py b/gnulib-tool.py
index bb2837a3d5..cdcd316909 100755
--- a/gnulib-tool.py
+++ b/gnulib-tool.py
@@ -144,4 +144,7 @@ else
   func_fatal_error "python3 not found; try setting GNULIB_TOOL_IMPL=sh"
 fi
 
-exec python3 "$gnulib_dir/.gnulib-tool.py" "$@"
+profiler_args=
+# For profiling, cf. <https://docs.python.org/3/library/profile.html>.
+#profiler_args="-m cProfile -s tottime"
+exec python3 $profiler_args "$gnulib_dir/.gnulib-tool.py" "$@"
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 01378259d9..d258fd903f 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -829,7 +829,7 @@ class GLModuleTable:
         # module on the input list has been processed, it is added to the
         # "handled list", so we can avoid to process it again.
         inc_all_tests = self.inc_all_direct_tests
-        handledmodules = []
+        handledmodules = set()
         inmodules = modules
         outmodules = []
         if self.config['conddeps']:
@@ -910,11 +910,11 @@ class GLModuleTable:
                                         self.addConditional(module, depmodule, 
True)
                                     else:  # if not conditional
                                         self.addUnconditional(depmodule)
-            handledmodules = sorted(set(handledmodules + inmodules_this_round))
+            handledmodules = handledmodules.union(inmodules_this_round)
             # Remove handledmodules from inmodules.
-            inmodules = [module
-                         for module in inmodules
-                         if module not in handledmodules]
+            inmodules = [ module
+                          for module in inmodules
+                          if module not in handledmodules ]
             inmodules = sorted(set(inmodules))
             inc_all_tests = self.inc_all_indirect_tests
         modules = sorted(set(outmodules))
@@ -950,10 +950,11 @@ class GLModuleTable:
         main_modules = self.transitive_closure(basemodules)
         self.config.setInclTestCategory(TESTS['tests'], saved_inctests)
         # Determine tests-related module list.
+        main_modules_set = set(main_modules)
         tests_modules = \
             [ m
               for m in finalmodules
-              if not (m in main_modules and m.getApplicability() == 'main') ]
+              if not (m in main_modules_set and m.getApplicability() == 
'main') ]
         # Note: Since main_modules is (hopefully) a subset of finalmodules, 
this
         # ought to be the same as
         #   [ m






reply via email to

[Prev in Thread] Current Thread [Next in Thread]