Re: Loading a large and unusually formatted dataset into an Octave matri
From: Ben Abbott
Subject: Re: Loading a large and unusually formatted dataset into an Octave matrix
Date: Sat, 15 Jun 2013 11:35:32 +0800
On Jun 15, 2013, at 9:25 AM, Ben Abbott wrote:
> On Jun 15, 2013, at 2:04 AM, Elliot Gorokhovsky wrote:
>
>> Hello! I am a new octave user and I am trying to predict the price of
>> bitcoins 15 minutes in advance via neural networks for use on the website
>> btcoracle.com. I have about a gigabyte of data that looks like this:
>>
>> <Screenshot - 06142013 - 12:08:17 PM.png>
>>
>> I want to turn it into a matrix with the number of rows equal to the number
>> of rows of data (i.e. the number of {...}s). I want there to be two columns,
>> one for price and the other for amount. I don't care about the other stuff, I
>> want to discard it.
>> Is there a way to do this (hopefully efficiently)? If so please tell me.
>>
>> Thank you very much for your time,
>> Elliot
>
> The data can be processed nicely using regexp(). I'd be happy to give it a
> try, but I'll need a short text file (not a graphic) so that I can do some
> tests. Can you attach a short data file (less than 10k bytes)?
>
> Ben
I manually copied the first three lines into a text file and wrote a script to
(1) read the file, (2) convert it to a structure, (3) convert the text for
"price" and "amount" to double, and (4) plot the result.
I didn't bother with regexp() since it looked easier to convert the original
format into something that Octave could parse directly.
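# Read the whole file into one string
data = fileread ("bitcoin.txt");
# Rewrite key:value pairs and record separators as Octave cell-array syntax
data = strrep (data, ":", ",");
data = strrep (data, "},\n{", ";");
# Evaluate the string as a cell-array literal (one row per record)
data = eval (data);
# Build the struct array back to front so s is allocated on the first pass
for n = size(data,1):-1:1
  s(n) = struct (data{n,:});
endfor
price = cellfun (@str2double, {s.price});
amount = cellfun (@str2double, {s.amount});
plot (price, amount, "-s")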
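To make the string rewriting easier to follow, here is a small sketch of what the two strrep() calls and eval() do to a two-record sample. The sample text is an assumption: the real data was only posted as a screenshot, so the "date" field and all the values below are made up for illustration. Only the "price" and "amount" keys and the "},\n{" record separator come from the script above.

```octave
# Hypothetical two-record sample; the real file layout is assumed, not known.
sample = ['{"date":"1371225600","price":"100.5","amount":"2.0"},' "\n" ...
          '{"date":"1371225660","price":"101.25","amount":"0.5"}'];

# Step 1: every key:value colon becomes a comma (a cell-array column separator)
step1 = strrep (sample, ":", ",");

# Step 2: the boundary between two records becomes a row separator
step2 = strrep (step1, "},\n{", ";");

# step2 is now a single cell-array literal of the form
#   {"date","...","price","...","amount","...";"date","...", ...}
c = eval (step2);

disp (size (c))   # one row per record, name/value pairs across the columns
disp (c{2,4})     # the price string of the second record
```

Note that the first strrep() rewrites every colon in the file, so this trick presumably only works as long as no value contains a ":" itself.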
Someone else may suggest a more efficient way. In particular, I suspect the
for-loop can be vectorized.
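One possible way to drop the loop is cell2struct(), sketched below. This is only a sketch under an assumption that is not confirmed by the thread: every record must carry the same field names in the same order, so that the names can be read from the first row alone. The small cell array stands in for the result of eval() and its values are made up.

```octave
# Hypothetical stand-in for the N-by-2K cell array returned by eval():
# each row alternates field name, value.
data = {"date", "1371225600", "price", "100.5",  "amount", "2.0";
        "date", "1371225660", "price", "101.25", "amount", "0.5"};

fields = data(1, 1:2:end);            # field names, taken from the first row
values = data(:, 2:2:end);            # N-by-K cell array of value strings
s = cell2struct (values, fields, 2);  # N-by-1 struct array, no loop needed

# str2double accepts a cell array of strings, so cellfun is not required
price  = str2double ({s.price});
amount = str2double ({s.amount});

# Two-column matrix as asked for in the original question: [price, amount]
M = [price(:), amount(:)];
```

If only the price and amount columns are wanted, the struct array could be skipped entirely by calling str2double() on the matching columns of the cell array, again assuming a fixed field order.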
Both the data file and the m-file are attached.
Ben
Attachment: bitcoin.m (Description: Binary data)
Attachment: bitcoin.txt (Description: Text document)