lzip-bug
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Lzip-bug] Request for help with Windows version of Plzip.


From: wrotycz
Subject: Re: [Lzip-bug] Request for help with Windows version of Plzip.
Date: Sun, 12 Jan 2020 04:21:20 +0100
User-agent: GWP-Draft

I just (re)discovered this thread and few questions arose.


1)
> I guess they don't work, and you have just been lucky until now. But sooner or later the seek of one thread will be executed between the seek and the read of another thread, and corrupt data will result.


I tried before to test pread()/pwite() with lseek()+read() as described here (https://lists.gnu.org/archive/html/lzip-bug/2018-06/msg00010.html) and that's exactly what happened when I tried to use multiple threads. And it wasn't matter of (un)luck because it was happening all the time.


2)
That's why I'm bit surprised by discovery of those "hacked" executables from her with Hannes Domani's patch () and that they work. It made me to actually search a bit about thread safe pread() for windows and it turned out that Cygwin has such implemetation, since 2011, and a thread safe, like the one defined by posix. It definitely is not so trivial as Hannes' hack though (https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_disk_file.cc;h=8f579520691fac976ae70e667447626732c7403d;hb=181fe5d2edacfd1578fe881c857dda7eae9ec104#l1440). If it really is working as they say it would be reasonable to differentiate between hacked for mingw version of pread()/pwrite() and one shipped with Cygwin in this environment/platform. As far as I recollect there were people here who used plzip under cygwin.
So, what I mean is - instead of defining pread()/pwrite() for all windows environments:


#if defined(__MSVCRT__) && defined(WITH_MINGW)
  #include <windows.h>
  #warning "Parallel I/O is not guaranteed to work on Windows."
  ssize_t pread( ... ) { ... }
  ssize_t pwrite( ... ) { ... }
#endif  // __MSVCRT__


distinguish between them: '///' my comments


#if defined (_WIN32) || defined (_WIN64) ///__MSVCRT__ /// not sure if cygwin defines __MSVCRT__
  // windows section
  #if defined (__CYGWIN__) && !defined (WITH_MINGW_HACK)
    // use shipped pread()/pwite()
  #else
    #if defined (WITH_MINGW_HACK)
    ///#if defined (__MINGW__) && defined (WITH_MINGW_HACK) /// maybe there is no point in separating migw here any more if someone wants to use pread_win_hack() with cygwin, that's fine.
      #warning "Parallel I/O is not guaranteed to work on Windows."
      ssize_t pread_win_hack( ... ) { ... }
      ssize_t pwrite_win_hack( ... ) { ... }
      #define pread pread_win_hack
      #define pwrite pwrite_win_hack
    #else
      #error "no functional pread()/pwrite() on this platform"
    #endif // #if defined (__MINGW__)
  #endif // #if defined (__CYGWIN__)
#endif // #if defined (_WIN32)...


3) This is just loosely related but I won't make new thread.
Description of preadblock() in `$plzip/decompress.cc' states 
// Returns the number of bytes really read.
// If (returned value < size) and (errno == 0), means EOF was reached.


which is not accurate. Man for read()(https://linux.die.net/man/3/read) and pread()(https://linux.die.net/man/3/pread) is clear about: non-negative value returned indicates the number of bytes actually read, so it's not an error per se, you should read again or check errno if there was some interruption.
Zero indicates end of file. Only if read()/pread() returns 0 one know it's end of file.
Also -1 returned indicates error and errno is set appropriately. Negative value returned does not always indicate unrecoverable error. It is well "described" in git wrapper xread()(https://repo.or.cz/w/git.git/blob_plain/HEAD:/wrapper.c)




Dnia 22 czerwca 2018 17:04 Antonio Diaz Diaz <address@hidden> napisaƂ(a):

p.z.l wrote:
Now, I don't get why

ssize_t pread (int fd, void *buf, size_t count, __int64 offset)
{
 _lseeki64 (fd, offset, SEEK_SET);
 return read (fd, buf, count);
}

or

ssize_t pread(int fd, void *buf, size_t count, long long offset)
{
(...)
ret = ReadFile(fh, buf, (DWORD)count, &bytes, &o);
(...)
}

would work as they don't preserve file pointer, and are not thread safe either.

I guess they don't work, and you have just been lucky until now. But
sooner or later the seek of one thread will be executed between the seek
and the read of another thread, and corrupt data will result.

Plzip needs a thread safe pread function to guarantee safe operation,
like the one defined by posix:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/pread.html
"The pread() function shall be equivalent to read(), except that it
shall read from a given position in the file without changing the file
offset."


Best regards,
Antonio.

_______________________________________________
Lzip-bug mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/lzip-bug

reply via email to

[Prev in Thread] Current Thread [Next in Thread]