Re: reading text data with textscan annoyingly slow
From: Ben Abbott
Subject: Re: reading text data with textscan annoyingly slow
Date: Wed, 02 Nov 2011 08:35:08 -0400
On Nov 2, 2011, at 6:40 AM, MarcelK wrote:
> http://octave.1599824.n4.nabble.com/file/n3972499/example1.ncf example1.ncf
>
> Hi,
>
> I'm using Octave 3.2.4. with Windows XP. (i686-pc-mingw32)
> I'm also using GUIOctave 1.5.3. as frontend.
>
> I'm facing some problems reading data from a .ncf text file.
> I've attached an example of such a file (I hope that worked).
> It takes about 60 seconds to read one single ncf file
> However, in Matlab it takes not even a second.
>
> Here's my code I use to read the data in:
>
>
> function [Date1,headlines,nummatrix] = ncfread (filename)
>
> fid=fopen(filename,'r');
>
> %# read data headers
> headerdata=fgets(fid);
> index=findstr(headerdata,'}');
> ncols=length(index);
> headlines={};
> headlines(1)=headerdata(1:index(1));
> for mm=2:ncols
> headlines(mm)=headerdata(index(mm-1)+1:index(mm));
> endfor
>
> textformat=['%s %s',repmat('%f',1,ncols-2)];
>
> datacell=textscan(fid,textformat);
>
> Date1=datacell{1,1}{1};
>
>
> timedata=datacell{2};
>
> fclose(fid);
>
> %# generate time vector (time in hours)
> t=zeros(size(datacell,1),1);
> timestring=char(timedata);
> for jj=1:size(timestring,1)
> tstruct=strptime(timestring(jj,:),'%R');
> t(jj)=tstruct.hour+tstruct.min/60;
> endfor
>
> %# conversion cell>matrix
> nummatrix=zeros(length(datacell{1}),size(datacell,2));
> nummatrix(:,2)=t;
>
> for ii=3:size(nummatrix,2)
> nummatrix(:,ii)=datacell{ii};
> endfor
>
> nummatrix(:,1)=[];
>
> endfunction
>
>
> My way of converting the "time string" (e.g. '10:00') to time in hours
> (e.g. 10.00) seems quite complicated to me, is there maybe a better way to
> achieve this?
>
> Thanks in advance,
>
> Marcel
Octave's textscan() is currently implemented as an m-file, while Matlab's is
written in C++, so I would expect large differences in speed. The developers are
planning to implement Octave's textscan() in C++ as well. I'm optimistic the
result will be very fast.
Even so, I am able to run your script in about 1 sec.
tic (); ncfread ('example1.ncf'); toc()
Elapsed time is 1 seconds.
I'm running the developer's sources on MacOS, so it is possible that Octave's
textread() has been improved or the slow performance is due to some problem
between Octave and Windows.
I don't have an older copy of Octave to try, nor do I have a Windows machine to
work with.
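On the time-string question quoted above: one loop-free possibility is to let
sscanf() parse all the "HH:MM" entries in a single call instead of calling
strptime() once per row. This is only a sketch. It assumes every entry has
exactly the form HH:MM, the sample cell array below is made up for
illustration (in ncfread it would be datacell{2}), and I have not run it
against example1.ncf.

```octave
% Hypothetical stand-in for datacell{2}; the real values come from textscan().
timedata = {'00:00'; '10:00'; '10:30'; '23:45'};

% Join all entries into one space-separated string, then read every
% hour/minute pair at once. sscanf() fills the 2-by-N result column by column.
hm = sscanf (sprintf ('%s ', timedata{:}), '%d:%d', [2, Inf]);

% Row 1 holds the hours, row 2 the minutes; convert to fractional hours
% and return a column vector like the original t.
t = (hm(1,:) + hm(2,:) / 60).';
```

If some rows can be malformed or empty, the strptime() loop is the safer
choice, since sscanf() simply stops at the first entry it cannot parse.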
Anyone else?
Ben