help-octave

Re: How to load data files with mixed str & numerical data with headers


From: Philip Nienhuis
Subject: Re: How to load data files with mixed str & numerical data with headers "*.txt.gz" or "*.xml.tgz"
Date: Mon, 20 Aug 2012 12:47:53 -0700 (PDT)

Shoumei wrote
> 
> I am a new learner of Octave. I am trying to process large data files. I
> used to unzip them and open them with Excel, then count the columns & rows
> and load them with "textread". I have to define the format with "%s" or "%f"
> for every single column. It becomes slower and harder to do it this way with
> a file of e.g. 49455x280. I tried to load them directly into Octave but I got
> the error "inconsistent number of columns near line 2". Suggestions are
> anxiously needed.
> I have Octave 3.6.2 with pkg io, java and image installed.
> 

Sorry, I don't fully understand.

What format are your data files (after unzipping)?
- Excel .xls or .xlsx files?
- plain text files with numeric and text columns?

If they are Excel files, you can obviously read them directly. But I suppose
you have text files.
Once you have got them imported into Excel anyway, why don't you simply:
- save them from Excel into .xls and use xlsread or
xlsopen-xls2oct-parsecell-xlsclose, or:
- save them from Excel into .csv and use csv2cell (optionally followed by
parsecell to separate the numerical and text data).
Or are they too big? 50,000 X 300 is easily loaded into e.g., recent Excel
and LibreOffice versions (capacity 10^6 rows by 1024 columns).
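The .csv route can be sketched roughly as follows. This is only an illustration, not a tested recipe for your files: the sample file contents and the single header row are assumptions, and depending on the installed package versions csv2cell may live in the miscellaneous package rather than in io.

```octave
pkg load io   % assumption: csv2cell and parsecell are both provided by io

% Write a tiny stand-in for a .csv file saved from the spreadsheet
fname = [tempname() ".csv"];
fid = fopen (fname, "w");
fprintf (fid, "id,name,value\n1,abc,2.5\n2,def,3.5\n");
fclose (fid);

raw = csv2cell (fname);                  % cell array, header row included
hdr = raw(1, :);                         % assumes exactly one header row
[num, txt] = parsecell (raw(2:end, :));  % numeric array and text cell array
delete (fname);                          % remove the temporary sample file
```

With a real file you would drop the lines that write the sample and point fname at your own .csv.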

You can also try dlmread, if the file contents are simple and you don't care
about the text contents.
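A minimal sketch of the dlmread route, assuming a comma-separated file with one header line and purely numeric data below it (the sample file is made up for the illustration). The row and column offsets passed to dlmread are zero-based.

```octave
% Write a small sample file: one header line, then numeric rows
fname = tempname ();
fid = fopen (fname, "w");
fprintf (fid, "a,b,c\n1,2,3\n4,5,6\n");
fclose (fid);

% Skip the first row (the header) and start reading at the first column
data = dlmread (fname, ",", 1, 0);
delete (fname);                      % remove the temporary sample file
```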

Anyway, textread is a slow and cumbersome way to read simple data files with
many columns. It mainly comes in handy when you need to read complicated and
ugly text files.

BTW somewhere at work I must have a quick-and-dirty Matlab/Octave function
that "explores" text files of the kind you probably refer to: simple
with many columns. It returns a format string for use in
strread/textread/textscan (optionally after skipping some user-defined
number of header lines, or I might have finished automating that as well, I
don't remember). It was meant to further enhance textscan/strread/textread
and simplify their use, but that mission somehow got swamped.
If you really want I can try to dig it up, but it can take until next week
or later before I have an opportunity to search for it.
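To illustrate the idea of deriving a format string from the file itself, here is a rough sketch. It is not the function mentioned above: guess_format, its arguments and the sample file are all hypothetical. It looks only at the first data line, so it assumes every row has the same layout, and a field that is empty or literally "NaN" on that line would be classified as text.

```octave
1;  % marks this file as a script so the function can be defined inline

function fmt = guess_format (fname, nheader, delim)
  % Hypothetical helper: build a textscan/textread format from one data line
  fid = fopen (fname, "r");
  if (fid < 0)
    error ("guess_format: cannot open %s", fname);
  endif
  for k = 1:nheader
    fgetl (fid);                          % discard header lines
  endfor
  line = fgetl (fid);                     % first data line
  fclose (fid);
  if (! ischar (line))
    error ("guess_format: no data line after %d header lines", nheader);
  endif
  fields = strsplit (line, delim);        % one string per column
  isnum = ! isnan (str2double (fields));  % true where the field is a number
  parts = cell (size (fields));
  parts(isnum) = {"%f"};
  parts(! isnum) = {"%s"};
  fmt = strtrim (sprintf ("%s ", parts{:}));
endfunction

% Small made-up sample: one header line, one text and two numeric columns
fname = tempname ();
fid = fopen (fname, "w");
fprintf (fid, "name,x,y\nabc,1.5,2\ndef,2.5,3\n");
fclose (fid);

fmt = guess_format (fname, 1, ",");       % fmt is "%s %f %f"
delete (fname);                           % remove the temporary sample file
```

The resulting string could then be passed to textscan or textread together with the matching delimiter and header-line options.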

Philip



