Re: Octave suddenly slow
From: John W. Eaton
Subject: Re: Octave suddenly slow
Date: Mon, 17 Nov 2008 20:30:41 -0500
On 17-Nov-2008, Rob Mahurin wrote:
| On Nov 17, 2008, at 3:32 PM, Scott A. McDermott wrote:
| > Michael Goffioul, kudos sir, you came the closest:
| >> Sometimes, this kind of problem is triggered by having a large
| >> number of m-files in the current directory. But it does not look
| >> to be the case for you.
| >
| > Ends up that it's a large number of /any/ file in the current
| > directory induces this problem. Pretty amazing --- even an
| > operation that has absolutely nothing to do with files (like
| > "for x=1:100,x,end") sits there for second after second before
| > starting up, if there are a large number of files in the current
| > directory. (My accidental "test case" had just over 8000 data
| > files.)
|
| I can reproduce this bug using octave 3.0.2 on Mac OSX and Linux.
|
| >> $ time echo "printf('hi\n')" | octave -q
| >> hi
| >> real 0m4.468s
| >>
| >> $ time seq 1e4 | xargs touch # make ten thousand files
| >> real 0m3.579s
| >>
| >> $ time echo "printf('hi\n')" | octave -q
| >> hi
| >> real 0m6.692s
| >>
| >> $ time seq 1e5 | xargs touch # make 100,000 files
| >> real 0m41.446s
| >>
| >> $ time echo "printf('hi\n')" | octave -q
| >> hi
| >> real 8m34.034s
|
|
| Adding 10^4 empty files to a directory makes octave open a couple
| seconds slower; adding 10^5 empty files makes it take eight minutes.
| I have heard of bugs in filesystems that cause this problem, but that
| doesn't seem to be the case since "ls | wc" is fast in the shell.
|
| The filesystem bug I remember (or maybe an ls bug? or both?) was an
| inefficient sort in the code that lists the names of files in a
| directory. Poking through the code, I don't see anything like that.
|
| Hmmmm ... dir_entry::read() seems to guess that it will live in a
| small directory. For the case I've outlined above, the attached
| patch should reduce the number of calls to Array::resize() from ~1000
| to ~10. Not tested, though.
I checked in the following change instead. It seems to improve
things considerably for me. On my system with 3.0.1 your test above
takes about 40 seconds when run in a directory with about 54,000
files. With the current sources and the patch, it takes less than 6
seconds. There is still a penalty any time a directory with a large
number of files is added to the load path because Octave caches the
contents of all directories that are added to the load path. I don't
see a way around that.
jwe
# HG changeset patch
# User John W. Eaton <address@hidden>
# Date 1226970263 18000
# Node ID 545b9f62adcfc8abc2216b75afc3b1e585f6fc2e
# Parent b93ac0586e4bca87c524fceda96600f4df29986c
dir-ops.cc (dir_entry::read): use std::list<std::string> to cache names before
converting to string_vector
diff --git a/liboctave/ChangeLog b/liboctave/ChangeLog
--- a/liboctave/ChangeLog
+++ b/liboctave/ChangeLog
@@ -1,3 +1,8 @@
+2008-11-17 John W. Eaton <address@hidden>
+
+ * dir-ops.cc (dir_entry::read): Use std::list<std::string> to
+ cache names before converting to string_vector.
+
2008-11-14 David Bateman <address@hidden>
* Array2.h (Array2<T> Array2<T>::index): Correct use of
diff --git a/liboctave/dir-ops.cc b/liboctave/dir-ops.cc
--- a/liboctave/dir-ops.cc
+++ b/liboctave/dir-ops.cc
@@ -27,6 +27,9 @@
#include <cerrno>
#include <cstdlib>
#include <cstring>
+
+#include <list>
+#include <string>
#include "sysdir.h"
@@ -69,40 +72,26 @@
string_vector
dir_entry::read (void)
{
- static octave_idx_type grow_size = 100;
-
- octave_idx_type len = 0;
-
- string_vector dirlist;
+ string_vector retval;
if (ok ())
{
- int count = 0;
+ std::list<std::string> dirlist;
struct dirent *dir_ent;
while ((dir_ent = readdir (static_cast<DIR *> (dir))))
{
if (dir_ent)
- {
- if (count >= len)
- {
- len += grow_size;
- dirlist.resize (len);
- }
-
- dirlist[count] = dir_ent->d_name;
-
- count++;
- }
+ dirlist.push_back (dir_ent->d_name);
else
break;
}
- dirlist.resize (count);
+ retval = string_vector (dirlist);
}
- return dirlist;
+ return retval;
}
void
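
The idea behind the change above can be sketched outside of Octave. This is a minimal illustration, not the liboctave code: `copies_with_fixed_step` and `collect_then_convert` are hypothetical helpers, `std::vector<std::string>` stands in for `string_vector`, and the copy count assumes that every resize copies all existing elements. Under that assumption, growing by 100 entries at a time over 100,000 names means 1000 resizes and about 5e7 element copies, whereas collecting into a `std::list` first needs only one final copy of each name. It is not a claim that this alone accounts for the timings reported in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: count the element copies made when storage for n
// names grows by a fixed step, assuming each resize copies every element
// that is already stored (a worst-case model of the old loop).
std::size_t
copies_with_fixed_step (std::size_t n, std::size_t step)
{
  assert (step > 0);

  std::size_t capacity = 0;
  std::size_t copies = 0;

  for (std::size_t count = 0; count < n; count++)
    {
      if (count >= capacity)
        {
          copies += count;    // existing elements move to the new storage
          capacity += step;
        }
    }

  return copies;
}

// Hypothetical helper: append names to a std::list (push_back never
// reallocates the elements already stored), then copy them once into
// contiguous storage.  std::vector stands in for string_vector here.
std::vector<std::string>
collect_then_convert (const std::vector<const char *>& names)
{
  std::list<std::string> tmp;

  for (const char *name : names)
    tmp.push_back (name);

  return std::vector<std::string> (tmp.begin (), tmp.end ());
}
```

Calling `copies_with_fixed_step (100000, 100)` models the 100,000-file test case with the old `grow_size` of 100. The list-based version trades that repeated copying for one node allocation per name, which is usually a good exchange when the final size is unknown in advance.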