help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Octave suddenly slow


From: John W. Eaton
Subject: Re: Octave suddenly slow
Date: Mon, 17 Nov 2008 22:42:53 -0500

On 18-Nov-2008, David Bateman wrote:

| Rob Mahurin wrote:
| > Adding 10^4 empty files to a directory makes octave open a couple 
| > seconds slower; adding 10^5 empty files makes it take eight minutes.  I 
| > have heard of bugs in filesystems that cause this problem, but that 
| > doesn't seem to be the case since "ls | wc" is fast in the shell.
| > 
| > The filesystem bug I remember (or maybe an ls bug? or both?) was an 
| > inefficient sort in the code that lists the names of files in a 
| > directory.  Poking through the code, I don't see anything like that.
| > 
| > Hmmmm ... dir_entry::read() seems to guess that it will live in a small 
| > directory.  For the case I've outlined above, the attached patch should 
| > reduce the number of calls to Array::resize() from ~1000 to ~10.  Not 
| > tested, though.
| 
| 
| The conclusive manner to check would be with gprof. But that requires a 
|   static build with
| 
| CFLAGS="-g -pg -O2"
| CXXFLAGS="-g -pg -O2"
| FFLAGS="-g -pg -O2"
| LDFLAGS="-g -pg -O2"
| ./configure --enable-static --disable-dl --disable-shared
| I previously noticed that file_ops::tilde_expand consumed a lot of time 
| in gprof at startup.
| 
| http://www.nabble.com/forum/ViewPost.jtp?post=12531612&framed=y
| 
| So perhaps the attached change to file_ops::tilde_expand might also 
| help. For me with this change I see with the following with the command
| 
| time octave --eval quit
| 
| the wall times are
| 
|                  Normal   1e4 files    1.1e5 files
| With Patch       0.760    1.059         8.414
| Without Patch    1.116    2.801        20.786
| 
| So for me the speeds are significantly faster with my patch (note this 
| is without any packages installed). I don't think your patch is quite 
| right as you start with a zero sized array with len not equal to zero. I 
| prefer a patch like
| 
| --- a/liboctave/dir-ops.cc
| +++ b/liboctave/dir-ops.cc
| @@ -87,7 +87,7 @@
|           {
|             if (count >= len)
|               {
| -               len += grow_size;
| +               len = (len == 0 ? grow_size : len * 2);
|                 dirlist.resize (len);
|               }
| 
| Adding this and my previous patch gives a startup time of
| 
|                  Normal   1e4 files    1.1e5 files
| With Patches     0.831    1.025         3.516
| Without Patches  1.116    2.801        20.786
| 
| So the two patches together give a significant improvement. John added a 
| slighly different change to dir_entry::read just now so I only committed 
| my change to tilde_expand..

The big performance problem was that Octave would stat each file in
each directory in the load path that had a relative name, and this
would happen every time load_path::update was called (i.e., at each
prompt, call to cd, etc).  The stat calls were happening because
load_path::dir_info::update was always calling
load_path::dir_info::initialize for directories with relative names.
That's the conservative thing to do if you don't have any other
information, because a cd could have happened since the last check,
and if it has, the information needs to be updated.  The change below
should avoid the unnecessary calls to load_path::dir_info::initialize.

Thanks,

jwe


# HG changeset patch
# User John W. Eaton <address@hidden>
# Date 1226979039 18000
# Node ID c91b59532f32e9223bbb3e34a7a6f0b3219ae137
# Parent  c2d126754a49eb79ae679b22fe183946a2c11743
load-path.cc (load_path::dir_info::update): smarter handling of relative names

diff --git a/src/ChangeLog b/src/ChangeLog
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,18 @@
+2008-11-17  John W. Eaton  <address@hidden>
+
+       * load-path.h (load_path::dir_info::abs_dir_name): New data member.
+       (load_path::dir_info::dir_info, load_path::dir_info::operator =):
+       Copy abs_dir_name.
+       (load_path::abs_dir_cache_type, load_path::dir_cache_iterator,
+       load_path::const_dir_cache_iterator): New typedefs.
+       (load_path::abs_dir_cache): New static data member.
+       (load_path::dir_info:dir_info): New constructor.
+       * load-path.cc (load_path::abs_dir_cache): Define new static data
+       member.
+       (load_path::dir_info::update): Look in abs_dir_cache for relative
+       directory names.
+       (load_path::dir_info::initialize): Update abs_dir_cache here.
+
 2008-11-13  John W. Eaton  <address@hidden>
 
        * ov-int8.h, ov-int16.h, ov-int32.h, ov-int64.h, ov-uint8.h,
diff --git a/src/load-path.cc b/src/load-path.cc
--- a/src/load-path.cc
+++ b/src/load-path.cc
@@ -47,26 +47,59 @@
 load_path::hook_fcn_ptr load_path::remove_hook = execute_pkg_del;
 std::string load_path::command_line_path;
 std::string load_path::sys_path;
+load_path::abs_dir_cache_type load_path::abs_dir_cache;
 
 void
 load_path::dir_info::update (void)
 {
   if (is_relative)
-    initialize ();
+    {
+      std::string abs_name
+       = octave_env::make_absolute (dir_name, octave_env::getcwd ());
+
+      abs_dir_cache_iterator p = abs_dir_cache.find (abs_name);
+
+      if (p != abs_dir_cache.end ())
+       {
+         // The directory is in the cache of all directories we have
+         // visited (indexed by its absolute name).  If it is out of
+         // date, initialize it.  Otherwise, copy the info from the
+         // cache.  By doing that, we avoid unnecessary calls to stat
+         // that can slow things down tremendously for large
+         // directories.
+
+         const dir_info& di = p->second;
+
+         file_stat fs (dir_name);
+
+         if (fs)
+           {
+             if (fs.mtime () != di.dir_mtime)
+               initialize ();
+             else
+               *this = di;
+
+             return;
+           }
+         else
+           {
+             std::string msg = fs.error ();
+             warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
+           }
+       }
+    }
+
+  file_stat fs (dir_name);
+
+  if (fs)
+    {
+      if (fs.mtime () != dir_mtime)
+       initialize ();
+    }
   else
     {
-      file_stat fs (dir_name);
-
-      if (fs)
-       {
-         if (fs.mtime () != dir_mtime)
-           initialize ();
-       }
-      else
-       {
-         std::string msg = fs.error ();
-         warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
-       }
+      std::string msg = fs.error ();
+      warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
     }
 }
 
@@ -82,6 +115,11 @@
       dir_mtime = fs.mtime ();
 
       get_file_list (dir_name);
+
+      std::string abs_name
+       = octave_env::make_absolute (dir_name, octave_env::getcwd ());
+
+      abs_dir_cache[abs_name] = *this;      
     }
   else
     {
diff --git a/src/load-path.h b/src/load-path.h
--- a/src/load-path.h
+++ b/src/load-path.h
@@ -251,10 +251,16 @@
     typedef method_file_map_type::const_iterator 
const_method_file_map_iterator;
     typedef method_file_map_type::iterator method_file_map_iterator;
 
+    // This default constructor is only provided so we can create a
+    // std::map of dir_info objects.  You should not use this
+    // constructor for any other purpose.
+    dir_info (void) { }
+
     dir_info (const std::string& d) : dir_name (d) { initialize (); }
 
     dir_info (const dir_info& di)
-      : dir_name (di.dir_name), is_relative (di.is_relative),
+      : dir_name (di.dir_name), abs_dir_name (di.abs_dir_name),
+       is_relative (di.is_relative),
        dir_mtime (di.dir_mtime), all_files (di.all_files),
        fcn_files (di.fcn_files),
        private_file_map (di.private_file_map),
@@ -267,6 +273,7 @@
       if (&di != this)
        {
          dir_name = di.dir_name;
+         abs_dir_name = di.abs_dir_name;
          is_relative = di.is_relative;
          dir_mtime = di.dir_mtime;
          all_files = di.all_files;
@@ -281,6 +288,7 @@
     void update (void);
 
     std::string dir_name;
+    std::string abs_dir_name;
     bool is_relative;
     octave_time dir_mtime;
     string_vector all_files;
@@ -346,6 +354,11 @@
   typedef dir_info_list_type::const_iterator const_dir_info_list_iterator;
   typedef dir_info_list_type::iterator dir_info_list_iterator;
 
+  typedef std::map<std::string, dir_info> abs_dir_cache_type;
+
+  typedef abs_dir_cache_type::const_iterator const_abs_dir_cache_iterator;
+  typedef abs_dir_cache_type::iterator abs_dir_cache_iterator;
+
   typedef std::list<file_info> file_info_list_type;
 
   typedef file_info_list_type::const_iterator const_file_info_list_iterator;
@@ -386,6 +399,8 @@
   static std::string command_line_path;
 
   static std::string sys_path;
+
+  static abs_dir_cache_type abs_dir_cache;
 
   static bool instance_ok (void);
 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]