Re: imread on large tiff

help-octave

From:	Carnë Draug
Subject:	Re: imread on large tiff
Date:	Fri, 17 Jan 2014 12:02:50 +0000

On 16 January 2014 23:50, John Hayes <address@hidden> wrote:
> On 17 Jan 2014 at 00:01, Carnë Draug wrote:
>
>> Please always include the mailing list when replying so others can
>> read it in the future or chime in to give further help and advice.
> D’oh, I thought I hit reply-all, and I usually check that but clearly didn’t 
> in this case. Apologies all around...
>
>> On 16 January 2014 22:25, John Hayes <address@hidden> wrote:
>>> On 16 Jan 2014 at 19:17, Carnë Draug wrote:
>>>>
>>>> [...]
>>>>
>>>> You mention using imfinfo. I'm assuming you're doing this to get the
>>>> number of rows, columns and frames in your image. imfinfo will return
>>>> a struct array with a lot of fields for each frame in your file. Note
>>>> that it is possible for each page on your TIFF to have different info,
>>>> even a different size, so we can't just deduce all that from the first
>>>> page. And that is slow.
>>>
>>> Yes, I’ve decided to bypass it altogether and just use my known values for 
>>> the dimensions since imfinfo seems to read the whole file just like imread 
>>> in ./libinterp/dldfcn/__magick_read__.cc.  That is, the ‘read_file’ function 
>>> is unnecessarily called MANY times for me, but from the discussion it 
>>> doesn’t seem like ImageMagick/GraphicsMagick really supports a better 
>>> mechanism (and this approach sounds schlocky at best: 
>>> http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=13439). So, 
>>> your suggestion has been a great workaround.
>>
>> It's not a workaround, it's the documented usage (not in Matlab which
>> only supports this for gif files). You will have a problem if the
>> pages are not all equal in size and bit depth though.
>>
>> In defense of GraphicsMagick, there's not much that they can do.
>> It's just how the TIFF format works: you get the pointer to each
>> page at the end of the previous one. What Matlab did (according to
>> their documentation) is to read the whole image once with imfinfo
>> which returns an array with the start location of each page in the
>> file. This can be passed as an extra argument to imread so it knows
>> where to start.
>>
> OK, Matlab may not document this well. But the code I was using was from 
> someone that was using Matlab on Linux (I don’t know the version off-hand, 
> but a recent one). I had noticed that Matlab’s documentation was very 
> unspecific in my original usage for a TIF file (but that a Linux Matlab-user 
> devised) so something must be in flux over there (or Mathworks' online docs 
> are simply out of sync with the reality of their recent versions).
>
> The coding in Octave seems to be very defensive, which I agree is good, 
> especially for the variety of formats these functions are intended to support.
>
> But realistically, I wonder, who has a .tif or .gif with multiple depths and 
> frame dimensions within a single file and wants to use it with this function?

I don't think a GIF file allows that (probably why Matlab only allows
the "Frames" "all" option for GIFs).

> That just sounds very bizarre to me and a VERY weird special case that >99% 
> of users of imread would never have need of. It’s not a complaint on the 
> implementation, more of a complaint if Matlab actually can handle this 
> because it sounds crazy to me.
>
> I’m sorry, I don’t mean any disrespect to anyone because maybe some people 
> find this useful that I’m not aware of (but I would be interested in); it 
> just seems to me like if the dimensions change frame-to-frame one should 
> change the file it’s stored in. I’m racking my brain on this, but the only 
> example I can think of someone that would find this useful is if someone was 
> converting a presentation to .pdf, then converting a .pdf to .tif, where each 
> slide may be slightly different dimensions. But even that sounds like a bad 
> idea to me... Maybe that’s a flaw with the TIFF format though that I was 
> previously unaware of...

That's not a flaw, and it's not bizarre at all. The TIFF file format is
very flexible, and I have many TIFFs like that. Some microscope systems
save a file with the original microscope data interleaved with
thumbnails to offer a quick overview of the saved images. I read such
images with data = imread (filepath, "Index", 1:2:numel(imfinfo
(filepath))).  Also, Octave will read PDFs with imread, where pages
with different sizes are very common.

> Personally, I still think the fundamental problem is with GraphicsMagick++ as 
> this link seems to indicate they (or the original ImageMagick++) have the 
> facility for accessing individual or range of frames/pages: 
> http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=13439
> It seems they don’t have an easily accessible API function for it though 
> which busts the usability for Octave... And presumably that doesn’t extend to 
> other formats, which I further presume is the principal reason for using 
> GraphicsMagick++ to begin with for us!
>
> If others agree, I’ll hop on the GraphicsMagick mailing list and inquire 
> about the lack of access through the API problem (I think) we’re having. And 
> hopefully work towards it as best I can....

You're reading this wrong. GraphicsMagick (GM) does allow you to read
only page #8. The problem is that, given a file, GM does not know
where page #8 starts: that information is stored at the end of page
#7, and the start of page #7 is in turn stored at the end of page #6.
That's why the file needs to be read from the start each time. This is
not a limitation of GM, it's just how the TIFF format is, and there is
no way around it. I'd suggest you read the TIFF specifications [1] or
the comments in the tiff_tag_read function [2], which have a short
summary of them.

If we were using libtiff (which we won't because we don't want to deal
with one implementation for each format like Matlab does), we could
read the whole file once to find the start location of each page,
store the answer, and know where to start reading from in the
future. If GM implemented such a method (GM uses libtiff under the
hood), we'd surely make use of it. But in truth, I predict that in
the future we will have to replace GM with something else [3, 4].
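
To make the chain of page pointers concrete, here is a minimal sketch in
Python (it is not Octave, GM or libtiff code). It builds a toy
little-endian TIFF in memory with one tiny directory (IFD) per page and
no pixel data, walks the chain once to record where each page starts,
and then uses the stored offsets to jump straight to a page. The names
build_toy_tiff, index_pages and read_width are hypothetical, and the
reader only handles the one SHORT-typed tag that the toy file contains.

```python
import struct

IMAGE_WIDTH = 256  # TIFF tag number for ImageWidth
SHORT = 3          # TIFF field type: 16-bit unsigned integer


def build_toy_tiff(widths):
    """Build a toy little-endian TIFF: header plus one tiny IFD per page.

    Each IFD holds a single ImageWidth entry and no pixel data, so this
    is only good for illustrating the page chain.
    """
    ifd_size = 2 + 12 + 4  # entry count + one 12-byte entry + next-IFD offset
    data = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, first IFD
    for i, width in enumerate(widths):
        is_last = i == len(widths) - 1
        next_offset = 0 if is_last else 8 + (i + 1) * ifd_size
        data += struct.pack("<H", 1)  # number of entries in this IFD
        # tag, type, count, then the value left-justified in a 4-byte field
        data += struct.pack("<HHIHH", IMAGE_WIDTH, SHORT, 1, width, 0)
        data += struct.pack("<I", next_offset)  # 0 marks the last page
    return data


def index_pages(data):
    """Walk the IFD chain once and return the start offset of every page."""
    offsets = []
    (offset,) = struct.unpack_from("<I", data, 4)
    while offset != 0:
        offsets.append(offset)
        (n_entries,) = struct.unpack_from("<H", data, offset)
        # The pointer to the next page sits right after this page's entries.
        (offset,) = struct.unpack_from("<I", data, offset + 2 + 12 * n_entries)
    return offsets


def read_width(data, ifd_offset):
    """Read ImageWidth from the IFD at a known offset (SHORT values only)."""
    (n_entries,) = struct.unpack_from("<H", data, ifd_offset)
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i
        tag, _type, _count, value = struct.unpack_from("<HHIH", data, entry)
        if tag == IMAGE_WIDTH:
            return value
    raise KeyError("ImageWidth not found")


# Full-size pages interleaved with thumbnails, as in the microscope files.
data = build_toy_tiff([640, 64, 800, 80])
offsets = index_pages(data)  # the one full walk of the chain
full_size = [read_width(data, off) for off in offsets[0::2]]
```

With the offsets stored, reaching any page is a single lookup instead of
a new walk from the start of the file, which is the step we have no way
to ask GM for.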

> But, as I said, I mean no disrespect to the implementers of Octave, because 
> the imread function looks like a beast to implement for many file formats 
> fairly; I’m just curious about it and suspect that Mathworks does a lot of 
> shady stuff « just to make it work » in special cases (for their paying 
> customers I’m sure :)...

By the way, writing multipage TIFFs should work the same way:
simply give a 4D matrix to imwrite(). The for loop usage that is
documented in Matlab should also work but will be slower.

Carnë

[1] http://partners.adobe.com/asn/developer/PDFS/TN/TIFF6.pd
[2] https://sourceforge.net/p/octave/image/ci/default/tree/inst/tiff_tag_read.m
[3] http://carandraug.no-ip.org/blog/?p=28
[4] http://carandraug.no-ip.org/blog/?p=37
