help-octave

Re: [octave] file i/o performance


From: Mark Robinson
Subject: Re: [octave] file i/o performance
Date: Tue, 24 May 2005 15:13:48 +1000


John, you misunderstand me; it's not a complaint about performance. I just wanted to know what others have experienced. I also got some helpful advice on how to do simple profiling of code. Some string functions (such as strrep) are particularly slow, and one should probably code things differently to avoid them where possible.

For example, here is a summary of my calls:

(MATLAB 7.0.1)
113061 calls to fgetl (total: 3.184000e+01, avg: 2.816179e-04, perc: 0.29)
56530 calls to strrep (total: 2.721000e+01, avg: 4.813373e-04, perc: 0.25)
56530 calls to strtok (total: 4.780000e+01, avg: 8.455687e-04, perc: 0.43)
56530 calls to append (total: 3.680000e+00, avg: 6.509818e-05, perc: 0.03)
Total Time: 1.105300e+02

(Octave 2.1.62)
113061 calls to fgetl (total: 1.072700e+02, avg: 9.487799e-04, perc: 0.05)
56530 calls to strrep (total: 9.456400e+02, avg: 1.672811e-02, perc: 0.48)
56530 calls to strtok (total: 8.430500e+02, avg: 1.491332e-02, perc: 0.43)
56530 calls to append (total: 5.856000e+01, avg: 1.035910e-03, perc: 0.03)
Total Time: 1.954520e+03

Rough estimates are that fgetl, strrep, and strtok are about 3x, 35x, and 17x slower per call in Octave, respectively.
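
The per-call slowdowns can be recomputed from the "avg" columns of the two summaries. The snippet below is only a small sketch of that arithmetic; the variable names are illustrative and not part of readFasta.

```octave
% Average seconds per call, copied from the two summaries above.
names      = {'fgetl', 'strrep', 'strtok', 'append'};
avg_matlab = [2.816179e-04, 4.813373e-04, 8.455687e-04, 6.509818e-05];
avg_octave = [9.487799e-04, 1.672811e-02, 1.491332e-02, 1.035910e-03];

% Slowdown of Octave relative to MATLAB for each timed call.
ratio = avg_octave ./ avg_matlab;

for i = 1:numel(names)
  fprintf('%-6s %5.1fx\n', names{i}, ratio(i));
end
```

Because the call counts are identical in both runs, dividing the "total" columns gives the same ratios.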

So, I guess my question is (and again, it's *not* a complaint), for an average guy like myself, is there any way I can improve upon this? Any suggestions would be most welcome.


Thank you.
Mark


The code itself is below:
--------------------------------------
function [Seq,total] = readFasta(fname)
%
% function [Seq,total] = readFasta(fname)
%
%  reads a FASTA file, returning a structure Seq that contains all
%  of the sequences in the file, and the total CPU time (in seconds)
%  spent in the timed calls.
%
%  Capitalizes all sequences

fid = fopen(fname, 'r');
nseq = 0;
len = 0;
count=zeros(5,1); total=zeros(5,1);
isFirst = 1;
seq = '';
flg = 0;

functions = {'fgetl', 'strrep', 'strtok', 'append'};

while 1

  t1 = cputime();
  tline = fgetl(fid);
  total(1) = total(1) + (cputime()-t1);
  count(1) = count(1)+1;

  if ~ischar(tline)
    % end of file: store the last sequence and stop
    t2 = cputime();
    % char(13) is a carriage return; '\r' in single quotes is a
    % literal backslash followed by r in MATLAB
    Seq.data{nseq} = strrep(seq, char(13), '');
    total(2) = total(2) + (cputime()-t2);
    count(2) = count(2)+1;
    break;
  end
  if ~isempty(tline)
    if tline(1) == '>'
      % header line: store the previous sequence, start a new one
      if ~isFirst
        t2 = cputime();
        Seq.data{nseq} = strrep(seq, char(13), '');
        total(2) = total(2) + (cputime()-t2);
        count(2) = count(2)+1;
      end
      isFirst = 0;
      nseq = nseq+1;
      t3 = cputime();
      Seq.rowlabels{nseq} = strtok(tline);
      total(3) = total(3) + (cputime()-t3);
      count(3) = count(3)+1;
      seq = '';
    else
      % sequence line: append it, upper-cased, to the current sequence
      t4 = cputime();
      seq = [seq upper(tline)];
      total(4) = total(4) + (cputime()-t4);
      count(4) = count(4)+1;
    end
  end
end
fclose(fid);
fprintf('\n');

% print one summary line per timed call
for i=1:4
  fprintf('%d calls to %s (total: %e, avg: %e, perc: %3.2f)\n', ...
          count(i), functions{i}, total(i), total(i)/count(i), total(i)/sum(total));
end
fprintf('Total Time: %e\n',sum(total));
total=sum(total);
--------------------------------------
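
One possible way to avoid the per-line fgetl and strrep calls is to read the whole file in a single call and split it afterwards. The sketch below is hypothetical and untimed: the name readFastaWhole is made up for illustration, it assumes a strsplit function is available (which may not be the case in Octave 2.1.62), and it omits the timing counters.

```octave
function Seq = readFastaWhole(fname)
  % Hypothetical variant of readFasta: one read, then in-memory splitting.
  fid = fopen(fname, 'r');
  if fid < 0
    error('readFastaWhole: cannot open %s', fname);
  end
  txt = char(fread(fid, Inf, 'uchar')');  % whole file as one row of chars
  fclose(fid);

  txt(txt == char(13)) = [];              % drop all carriage returns at once

  lines = strsplit(txt, char(10));        % split on newlines
  lines = lines(~cellfun('isempty', lines));

  % Header lines start with '>'; everything up to the next header
  % belongs to that sequence.
  isHdr  = cellfun(@(s) s(1) == '>', lines);
  hdrIdx = find(isHdr);
  nseq   = numel(hdrIdx);
  bounds = [hdrIdx, numel(lines) + 1];

  Seq.rowlabels = cell(1, nseq);
  Seq.data      = cell(1, nseq);
  for k = 1:nseq
    Seq.rowlabels{k} = strtok(lines{hdrIdx(k)});
    body = lines(hdrIdx(k) + 1 : bounds(k + 1) - 1);
    Seq.data{k} = upper(['' body{:}]);    % '' keeps the result a string
  end
end
```

This still loops once per sequence, but no longer once per line, and the carriage returns are removed by logical indexing rather than by strrep. Whether it is actually faster in a given Octave version would need to be measured.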




On 21/05/2005, at 12:31 AM, John W. Eaton wrote:

On 20-May-2005, Mark Robinson wrote:

| Hi.  Newbie question.
|
| I'm a new user of octave (using 2.1.62 on Mac OS X).  I'm trying to
| port over some MATLAB code and am finding reading a text file
| considerably slower in octave than in MATLAB 7.0. Like, about 5 times
| slower, which seems to me like it is not an artifact of octave ... it's
| probably the code.
|
| Basically, the program makes a bunch of 'fgetl' calls and, depending on
| the content of the line (where it uses 'strrep' or 'strtok'), sticks it
| in one of two cell arrays.  Are any of these known to be slower in
| octave, or is there something to do with cell arrays that I should know
| about?
|
| It's actually a FASTA-formatted file for protein/DNA sequences.

If you think there is a bug in Octave, then please send a *complete*
bug report to the address@hidden list.  If you're not sure what to
include in the report so that someone might actually be able to help
you solve the problem, then please read
http://www.octave.org/bugs.html before sending your report.

It seems unlikely to me that anyone will be able to help you if all
you provide is a complaint about performance and a vague
description of your code.

jwe



-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------



Mark D. Robinson, Ph.D. Student
Terry Speed Lab, Genetics and Bioinformatics
Walter and Eliza Hall Institute of Medical Research (WEHI)
1G Royal Parade, Parkville Victoria 3050 Australia
+61 3 9345 2324 (voice)
+61 3 9347 0852 (fax)


