From: Przemek Klosowski
Subject: Re: memory exhausted when reading 129M file
Date: Tue, 14 Aug 2012 11:12:41 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1
On 08/14/2012 10:01 AM, Zheng, Xin (NIH) [C] wrote:
Thank you! Your idea works even without preallocating memory. The whole data would occupy ~100M (based on 4-byte int and 1-byte char). I have no idea about Octave internals. In Matlab, the cell data size would be 1G.
Glad it works. Actually, you are getting 8-byte double numbers, and your strings are around 9 chars, so the total size should indeed be around 7M * (8+9) bytes, or about 120MB. Funny how close that is to the size of the formatted file on disk. I am not sure how Matlab gets to 1GB: do they use 4-byte Unicode characters and 16-byte complex numbers? Even then, it would be only about 360MB.
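For what it's worth, the back-of-the-envelope estimate above can be checked in a few lines. This is only a sketch: the 7 million rows and the 9-character average string length are the rough figures from this thread, not measured values, and the function name is made up for illustration. It counts raw payload bytes only and ignores any per-cell overhead.

```python
# Rough payload estimate for the parsed data.
# ROWS and AVG_STRING_CHARS are approximations taken from this thread.
ROWS = 7_000_000          # approximate number of rows in the file
AVG_STRING_CHARS = 9      # approximate average string length per row


def estimate_mb(bytes_per_number, bytes_per_char,
                rows=ROWS, chars=AVG_STRING_CHARS):
    """Return payload size in MB (10**6 bytes): one number plus one string per row."""
    return rows * (bytes_per_number + bytes_per_char * chars) / 1e6


# 8-byte doubles and 1-byte chars
plain_mb = estimate_mb(8, 1)
# Hypothetical worst case: 16-byte complex doubles and 4-byte characters
worst_mb = estimate_mb(16, 4)

print(f"doubles + 1-byte chars: {plain_mb:.0f} MB")
print(f"complex + 4-byte chars: {worst_mb:.0f} MB")
```

With these inputs the two cases come out to 119 MB and 364 MB, so neither gets anywhere near 1GB.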
So it seems there is some room for improvement in Octave's 'textscan', and likewise in 'textread'. 'dlmread' in Octave, though, reads the same file quickly and well, except that it replaces all strings with 0.
As you say, it seems that textscan() may have a memory management problem with longer format strings, although I can't see why, since it just seems to call strread(). Ben Abbott apparently wrote textscan(), and Philip Nienhuis is the last one to have worked on strread(), so let's see if they can think of something.