From: | Daniel J Sebald |
Subject: | Re: Slowness in function 'open' |
Date: | Sat, 23 Jun 2007 00:28:51 -0500 |
User-agent: | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020 |
Finally, and in case it is not already obvious, I'd just like to ask everyone who sees bad performance and then thinks "hey, what was jwe smoking when he wrote that code? I'm sure I can do much better than that", to remember that it is not always as simple as it seems at first. I'll admit that the performance is bad in this case and that it could certainly be better (two passes looked like a good idea at the time, but all the work to look for comments and check sizes is duplicated, and that is definitely bad). But perhaps now you see that there are a few extra gotchas that were not immediately obvious and that can have an impact on performance. And we can't just throw out those requirements because then we'll see some other new user complaining that "Octave sucks because it can't read my file but Matlab can".
There might still be some alternatives that use the existing routine. Probably the time-consuming part of reading so much data is translating the formatted data. If the matrix data is always complete, i.e., the dimension of the first row tells one the maximum row length, two alternatives might be:

1) Open a temporary file in binary format and, on the first pass, place the data into the temp file. Reading the binary file on the second pass might be faster.

2) Open the existing file in binary format first and scan, in binary mode, for the number of new-line characters to get the M dimension. Then close/open the file in text mode and scan the first row for N. Rewind, and knowing M x N, off we go. Just run the existing algorithm. I would think the first binary pass would be very quick.

Dan
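
For illustration, alternative 2 could look roughly like the sketch below. It is written in Python only to show the idea and is not Octave's actual loader. The function names are hypothetical, and it assumes a whitespace-separated file with no comment lines, no blank lines, and every row the same length as the first.

```python
def count_rows_binary(path, chunk_size=1 << 20):
    """Count rows by counting newline bytes in binary mode (no number parsing)."""
    rows = 0
    last = b"\n"  # treat an empty file as having no partial final row
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            rows += chunk.count(b"\n")
            last = chunk[-1:]
    # A final row without a trailing newline still counts as a row.
    if last != b"\n":
        rows += 1
    return rows


def count_columns_first_row(path):
    """Get N by splitting only the first row on whitespace."""
    with open(path, "r") as f:
        return len(f.readline().split())


def read_matrix(path):
    """Preallocate M x N from the cheap scans, then parse the text once."""
    m = count_rows_binary(path)
    n = count_columns_first_row(path)
    data = [[0.0] * n for _ in range(m)]
    with open(path, "r") as f:
        for i, line in enumerate(f):
            values = [float(x) for x in line.split()]
            if len(values) != n:
                # Assumption violated: the matrix is not "complete".
                raise ValueError(f"row {i + 1} has {len(values)} columns, expected {n}")
            data[i] = values
    return m, n, data
```

The point of the sketch is that the expensive text-to-number conversion happens only once; the first pass only counts bytes. Handling comments or blank lines, which the existing routine has to support, would need extra checks in the counting pass.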