help-octave

Re: Loading a large and unusually formatted dataset into an Octave matrix


From: Ben Abbott
Subject: Re: Loading a large and unusually formatted dataset into an Octave matrix
Date: Sat, 15 Jun 2013 11:35:32 +0800

On Jun 15, 2013, at 9:25 AM, Ben Abbott wrote:

> On Jun 15, 2013, at 2:04 AM, Elliot Gorokhovsky wrote:
> 
>> Hello! I am a new Octave user and I am trying to predict the price of 
>> bitcoins 15 minutes in advance via neural networks for use on the website 
>> btcoracle.com. I have about a gigabyte of data that looks like this:
>> 
>> <Screenshot - 06142013 - 12:08:17 PM.png>
>> 
>> I want to turn it into a matrix with the number of rows equal to the number 
>> of rows of data (i.e. the number of {...}s). I want there to be two columns, 
>> one for price and the other for amount. I don't care about the other stuff, I 
>> want to discard it. 
>> Is there a way to do this (hopefully efficiently)? If so please tell me.
>> 
>> Thank you very much for your time, 
>> Elliot
> 
> The data can be processed nicely using regexp().  I'd be happy to give it a 
> try, but I'll need a short text file (not a graphic) so that I can do some 
> tests.  Can you attach a short data file (less than 10k bytes)?
> 
> Ben

I manually copied the first three lines into a text file and wrote a script to 
(1) read the file, (2) convert to a structure, (3) convert the text for "price" 
and "amount" to double, (4) do a plot.

I didn't bother with regexp() since it looked easier to convert the original 
format into something that Octave could parse.

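# Read the whole file into one string.
data = fileread ("bitcoin.txt");
# Turn each key:value pair into key,value and each record boundary into a
# row separator, so the text becomes a valid cell-array literal.
data = strrep (data, ":", ",");
data = strrep (data, "},\n{", ";");
data = eval (data);
# Build a struct array, one element per record (counting down allocates s once).
for n = size(data,1):-1:1
  s(n) = struct (data{n,:});
endfor
# Convert the "price" and "amount" strings to doubles and plot them.
price = cellfun (@str2double, {s.price});
amount = cellfun (@str2double, {s.amount});
plot (price, amount, "-s")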
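
For comparison, the regexp() route mentioned earlier could look roughly like 
the sketch below.  The sample text is made up: the real data was only posted 
as a screenshot, so the quoted "price" and "amount" keys with string values 
are an assumption taken from how the script above uses them.

```octave
# Hypothetical three-record sample in the assumed format (values invented).
txt = ['{"date":"1371168000","price":"103.5","amount":"0.25"},', "\n", ...
       '{"date":"1371168060","price":"104.0","amount":"1.5"},',  "\n", ...
       '{"date":"1371168120","price":"103.9","amount":"0.1"}'];

# Capture the quoted value that follows each key; "tokens" returns one
# 1x1 cell per match.
p = regexp (txt, '"price"\s*:\s*"([^"]*)"', "tokens");
a = regexp (txt, '"amount"\s*:\s*"([^"]*)"', "tokens");

# Flatten the nested cells and convert the strings to doubles.
price  = str2double ([p{:}]);
amount = str2double ([a{:}]);

# One row per record, columns are price and amount.
M = [price(:), amount(:)];
```

This skips eval() and only touches the two fields of interest, so it does 
not depend on the other fields being free of ":" characters.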

Someone else may suggest a more efficient way.  In particular, I suspect the 
for-loop can be vectorized.
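
One possible way to drop the for-loop is cell2struct(), sketched here on a 
made-up three-record sample.  The record layout is an assumption inferred 
from the script above (every record has the same keys in the same order), 
not something confirmed against the real file.

```octave
# Hypothetical sample in the assumed format (values invented).
txt = ['{"date":"1371168000","price":"103.5","amount":"0.25"},', "\n", ...
       '{"date":"1371168060","price":"104.0","amount":"1.5"},',  "\n", ...
       '{"date":"1371168120","price":"103.9","amount":"0.1"}'];

# Same text-to-cell conversion as in the script above.
c = strrep (txt, ":", ",");
c = strrep (c, "},\n{", ";");
c = eval (c);                       # N-by-6 cell: key, value, key, value, ...

# Odd columns hold the keys (taken from the first row), even columns the values.
keys = c(1, 1:2:end);
vals = c(:, 2:2:end);

# Build the whole struct array in one call instead of looping over rows.
s = cell2struct (vals, keys, 2);    # N-by-1 struct array

# str2double accepts a cell array of strings directly, so cellfun is not needed.
price  = str2double ({s.price});
amount = str2double ({s.amount});
M = [price(:), amount(:)];          # N-by-2 matrix: price, amount
```

If only the two-column matrix is wanted, the struct can be skipped entirely 
by indexing vals with strcmp (keys, "price") and strcmp (keys, "amount").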

Both the data file and the m-file are attached.

Ben

Attachment: bitcoin.m
Description: Binary data

Attachment: bitcoin.txt
Description: Text document

