[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Octave suddenly slow
From: |
John W. Eaton |
Subject: |
Re: Octave suddenly slow |
Date: |
Mon, 17 Nov 2008 22:42:53 -0500 |
On 18-Nov-2008, David Bateman wrote:
| Rob Mahurin wrote:
| > Adding 10^4 empty files to a directory makes octave open a couple
| > seconds slower; adding 10^5 empty files makes it take eight minutes. I
| > have heard of bugs in filesystems that cause this problem, but that
| > doesn't seem to be the case since "ls | wc" is fast in the shell.
| >
| > The filesystem bug I remember (or maybe an ls bug? or both?) was an
| > inefficient sort in the code that lists the names of files in a
| > directory. Poking through the code, I don't see anything like that.
| >
| > Hmmmm ... dir_entry::read() seems to guess that it will live in a small
| > directory. For the case I've outlined above, the attached patch should
| > reduce the number of calls to Array::resize() from ~1000 to ~10. Not
| > tested, though.
|
|
| The conclusive manner to check would be with gprof. But that requires a
| static build with
|
| CFLAGS="-g -pg -O2"
| CXXFLAGS="-g -pg -O2"
| FFLAGS="-g -pg -O2"
| LDFLAGS="-g -pg -O2"
| ./configure --enable-static --disable-dl --disable-shared
| I previously noticed that file_ops::tilde_expand consumed a lot of time
| in gprof at startup.
|
| http://www.nabble.com/forum/ViewPost.jtp?post=12531612&framed=y
|
| So perhaps the attached change to file_ops::tilde_expand might also
| help. For me with this change I see with the following with the command
|
| time octave --eval quit
|
| the wall times are
|
| Normal 1e4 files 1.1e5 files
| With Patch 0.760 1.059 8.414
| Without Patch 1.116 2.801 20.786
|
| So for me the speeds are significantly faster with my patch (note this
| is without any packages installed). I don't think your patch is quite
| right as you start with a zero sized array with len not equal to zero. I
| prefer a patch like
|
| --- a/liboctave/dir-ops.cc
| +++ b/liboctave/dir-ops.cc
| @@ -87,7 +87,7 @@
| {
| if (count >= len)
| {
| - len += grow_size;
| + len = (len == 0 ? grow_size : len * 2);
| dirlist.resize (len);
| }
|
| Adding this and my previous patch gives a startup time of
|
| Normal 1e4 files 1.1e5 files
| With Patches 0.831 1.025 3.516
| Without Patches 1.116 2.801 20.786
|
| So the two patches together give a significant improvement. John added a
| slighly different change to dir_entry::read just now so I only committed
| my change to tilde_expand..
The big performance problem was that Octave would stat each file in
each directory in the load path that had a relative name, and this
would happen every time load_path::update was called (i.e., at each
prompt, call to cd, etc). The stat calls were happening because
load_path::dir_info::update was always calling
load_path::dir_info::initialize for directories with relative names.
That's the conservative thing to do if you don't have any other
information, because a cd could have happened since the last check,
and if it has, the information needs to be updated. The change below
should avoid the unnecessary calls to load_path::dir_info::initialize.
Thanks,
jwe
# HG changeset patch
# User John W. Eaton <address@hidden>
# Date 1226979039 18000
# Node ID c91b59532f32e9223bbb3e34a7a6f0b3219ae137
# Parent c2d126754a49eb79ae679b22fe183946a2c11743
load-path.cc (load_path::dir_info::update): smarter handling of relative names
diff --git a/src/ChangeLog b/src/ChangeLog
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,18 @@
+2008-11-17 John W. Eaton <address@hidden>
+
+ * load-path.h (load_path::dir_info::abs_dir_name): New data member.
+ (load_path::dir_info::dir_info, load_path::dir_info::operator =):
+ Copy abs_dir_name.
+ (load_path::abs_dir_cache_type, load_path::dir_cache_iterator,
+ load_path::const_dir_cache_iterator): New typedefs.
+ (load_path::abs_dir_cache): New static data member.
+ (load_path::dir_info:dir_info): New constructor.
+ * load-path.cc (load_path::abs_dir_cache): Define new static data
+ member.
+ (load_path::dir_info::update): Look in abs_dir_cache for relative
+ directory names.
+ (load_path::dir_info::initialize): Update abs_dir_cache here.
+
2008-11-13 John W. Eaton <address@hidden>
* ov-int8.h, ov-int16.h, ov-int32.h, ov-int64.h, ov-uint8.h,
diff --git a/src/load-path.cc b/src/load-path.cc
--- a/src/load-path.cc
+++ b/src/load-path.cc
@@ -47,26 +47,59 @@
load_path::hook_fcn_ptr load_path::remove_hook = execute_pkg_del;
std::string load_path::command_line_path;
std::string load_path::sys_path;
+load_path::abs_dir_cache_type load_path::abs_dir_cache;
void
load_path::dir_info::update (void)
{
if (is_relative)
- initialize ();
+ {
+ std::string abs_name
+ = octave_env::make_absolute (dir_name, octave_env::getcwd ());
+
+ abs_dir_cache_iterator p = abs_dir_cache.find (abs_name);
+
+ if (p != abs_dir_cache.end ())
+ {
+ // The directory is in the cache of all directories we have
+ // visited (indexed by its absolute name). If it is out of
+ // date, initialize it. Otherwise, copy the info from the
+ // cache. By doing that, we avoid unnecessary calls to stat
+ // that can slow things down tremendously for large
+ // directories.
+
+ const dir_info& di = p->second;
+
+ file_stat fs (dir_name);
+
+ if (fs)
+ {
+ if (fs.mtime () != di.dir_mtime)
+ initialize ();
+ else
+ *this = di;
+
+ return;
+ }
+ else
+ {
+ std::string msg = fs.error ();
+ warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
+ }
+ }
+ }
+
+ file_stat fs (dir_name);
+
+ if (fs)
+ {
+ if (fs.mtime () != dir_mtime)
+ initialize ();
+ }
else
{
- file_stat fs (dir_name);
-
- if (fs)
- {
- if (fs.mtime () != dir_mtime)
- initialize ();
- }
- else
- {
- std::string msg = fs.error ();
- warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
- }
+ std::string msg = fs.error ();
+ warning ("load_path: %s: %s", dir_name.c_str (), msg.c_str ());
}
}
@@ -82,6 +115,11 @@
dir_mtime = fs.mtime ();
get_file_list (dir_name);
+
+ std::string abs_name
+ = octave_env::make_absolute (dir_name, octave_env::getcwd ());
+
+ abs_dir_cache[abs_name] = *this;
}
else
{
diff --git a/src/load-path.h b/src/load-path.h
--- a/src/load-path.h
+++ b/src/load-path.h
@@ -251,10 +251,16 @@
typedef method_file_map_type::const_iterator
const_method_file_map_iterator;
typedef method_file_map_type::iterator method_file_map_iterator;
+ // This default constructor is only provided so we can create a
+ // std::map of dir_info objects. You should not use this
+ // constructor for any other purpose.
+ dir_info (void) { }
+
dir_info (const std::string& d) : dir_name (d) { initialize (); }
dir_info (const dir_info& di)
- : dir_name (di.dir_name), is_relative (di.is_relative),
+ : dir_name (di.dir_name), abs_dir_name (di.abs_dir_name),
+ is_relative (di.is_relative),
dir_mtime (di.dir_mtime), all_files (di.all_files),
fcn_files (di.fcn_files),
private_file_map (di.private_file_map),
@@ -267,6 +273,7 @@
if (&di != this)
{
dir_name = di.dir_name;
+ abs_dir_name = di.abs_dir_name;
is_relative = di.is_relative;
dir_mtime = di.dir_mtime;
all_files = di.all_files;
@@ -281,6 +288,7 @@
void update (void);
std::string dir_name;
+ std::string abs_dir_name;
bool is_relative;
octave_time dir_mtime;
string_vector all_files;
@@ -346,6 +354,11 @@
typedef dir_info_list_type::const_iterator const_dir_info_list_iterator;
typedef dir_info_list_type::iterator dir_info_list_iterator;
+ typedef std::map<std::string, dir_info> abs_dir_cache_type;
+
+ typedef abs_dir_cache_type::const_iterator const_abs_dir_cache_iterator;
+ typedef abs_dir_cache_type::iterator abs_dir_cache_iterator;
+
typedef std::list<file_info> file_info_list_type;
typedef file_info_list_type::const_iterator const_file_info_list_iterator;
@@ -386,6 +399,8 @@
static std::string command_line_path;
static std::string sys_path;
+
+ static abs_dir_cache_type abs_dir_cache;
static bool instance_ok (void);