From: Tim Rühsen
Subject: [Wget-dev] wget2 | Decompressor: Handle (trailing) garbage gracefully (add tests) (#481)
Date: Sat, 21 Sep 2019 10:40:32 +0000


Tim Rühsen created an issue: https://gitlab.com/gnuwget/wget2/issues/481



From Antonio Diaz Diaz:
```
For lzip it is easy to check the whole "transaction" (formed by one or more 
successive calls to 'wget_decompress'), by calling 'LZ_decompress_finish' (+ 
read loop) in the 'lzip_exit' function, just as you have already done. (The 
read loop should not read any data, or else the data produced by 
'lzip_decompress' was not complete).

For gzip the change should be something similar; calling 'inflate' with 
Z_FINISH (+ read loop) in the 'gzip_exit' function.

Trailing garbage may cause problems when the decompression library is used to 
check the data. I don't know if trailing garbage happens in the kind of 
compressed data managed by Wget, but in the case of lzlib it can be easily 
ignored by ignoring the 'LZ_header_error' error code in 'lzip_exit'.

http://www.nongnu.org/lzip/manual/lzlib_manual.html#Error-codes
-- Constant: enum LZ_Errno LZ_header_error
    An invalid member header (one with the wrong magic bytes) was read. If this 
happens at the end of the data stream it may indicate trailing data.

> What would be helpful is a bunch of compressed input files (ok and with
> typical issues, e.g. wrong CRC) plus an expected result / result
> checksum. So developers can test their code against it (you can also use
> it for your test suite).

From the lzip documentation[1] it is trivial to create files with any kind of 
problem. I use some of them, but unzcrash can test thousands of corrupt files 
quickly.

[1] http://www.nongnu.org/lzip/manual/lzlib_manual.html#Data-format

One feature unique to the lzip format is that it provides 3 factor integrity 
checking and the decompressors report mismatches in each factor separately:

$ lzip -cd bad_fox.lz
The quick brown fox jumps over the lazy dog.
  bad_fox.lz: CRC mismatch; stored EB50CC4A, computed EB50CC6A
Data size mismatch; stored 44 (0x2C), computed 45 (0x2D)
Member size mismatch; stored 81 (0x51), computed 80 (0x50)

If you need specially crafted (corrupted or not) lzip files for your testing, 
just tell me and I'll make them for you.

```
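
As a rough illustration of the "finish + check" idea for gzip, here is a minimal sketch using Python's standard `zlib` module rather than the C `inflate` call with Z_FINISH mentioned above. It is not wget2 code; the function name `finish_check` and the status strings are made up for this example. The sketch feeds the chunks of one "transaction" to a decompressor and then classifies how the stream ended: complete, truncated, followed by trailing garbage, or corrupt (e.g. wrong CRC).

```python
import zlib


def finish_check(chunks):
    """Decompress gzip data given as chunks and report how the stream ended.

    Hypothetical helper for illustration only. Returns (data, status), where
    status is "ok", "truncated", "trailing garbage" or "corrupt".
    """
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS: gzip framing
    out = bytearray()
    leftover = bytearray()
    try:
        for chunk in chunks:
            if d.eof:
                # The gzip stream has already ended; later chunks are trailing data.
                leftover += chunk
            else:
                out += d.decompress(chunk)
    except zlib.error:
        # Raised e.g. for a CRC or length mismatch in the gzip trailer.
        return bytes(out), "corrupt"
    if not d.eof:
        # The end-of-stream marker was never reached: input is incomplete.
        return bytes(out), "truncated"
    if d.unused_data or leftover:
        # Bytes remain after the end of the compressed stream.
        return bytes(out), "trailing garbage"
    return bytes(out), "ok"
```

Whether "trailing garbage" should be treated as an error or silently ignored is a policy decision for the caller; the sketch only reports it.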

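The three integrity factors reported separately in the `bad_fox.lz` example above can be sketched as follows. This assumes the lzip member trailer is 20 bytes (CRC32 of the uncompressed data, uncompressed data size, member size, all little-endian) and that the CRC is the same CRC-32 computed by `zlib.crc32`; both should be checked against the lzip format documentation. `check_trailer` is a hypothetical helper, and it takes the already-decompressed data as an argument because Python's standard library has no lzip decoder.

```python
import struct
import zlib

# Assumed trailer layout: CRC32 (4 bytes), data size (8), member size (8),
# little-endian, with no padding.
TRAILER = struct.Struct("<IQQ")


def check_trailer(member, data):
    """Compare the stored trailer fields of one lzip member with computed ones.

    member: the complete compressed member (header + LZMA stream + trailer)
    data:   the bytes obtained by decompressing that member
    Returns a list of mismatch messages, one per failing factor.
    """
    stored_crc, stored_data_size, stored_member_size = TRAILER.unpack(
        member[-TRAILER.size:])
    problems = []
    crc = zlib.crc32(data) & 0xFFFFFFFF
    if stored_crc != crc:
        problems.append("CRC mismatch; stored %08X, computed %08X"
                        % (stored_crc, crc))
    if stored_data_size != len(data):
        problems.append("Data size mismatch; stored %d (0x%X), computed %d (0x%X)"
                        % (stored_data_size, stored_data_size,
                           len(data), len(data)))
    if stored_member_size != len(member):
        problems.append("Member size mismatch; stored %d (0x%X), computed %d (0x%X)"
                        % (stored_member_size, stored_member_size,
                           len(member), len(member)))
    return problems
```

Reporting each factor on its own, rather than stopping at the first mismatch, mirrors the separate messages shown in the lzip output.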