bug-global

Re: Binary recognition is to narrow [new suggestion]


From: Jason Hood
Subject: Re: Binary recognition is to narrow [new suggestion]
Date: Sat, 21 Nov 2009 16:33:53 +1000
User-agent: Thunderbird 2.0.0.23 (Windows/20090812)

Erik Jonsson wrote:
> Instead of counting characters over 127, the only test is that the first
> 511 bytes don't contain any of the control characters 0-8 or 14-31. No
> normal text file would contain these.

No normal source file, but if you want to generalise to
text files, 8 (backspace) and 27 (escape) could probably
occur (man files being a prime example).
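
To make the point concrete, here is a minimal Python sketch of the proposed test. It is only an illustration, not the patch under discussion: the names looks_binary, SUSPECT and the allowed parameter are made up for this example, and the sample inputs are hand-written.

```python
# Control bytes treated as "binary" markers in the quoted proposal: 0-8 and 14-31.
# Tab (9), LF (10), VT (11), FF (12) and CR (13) are deliberately left out.
SUSPECT = frozenset(range(0, 9)) | frozenset(range(14, 32))
SAMPLE_SIZE = 511  # number of leading bytes examined, as in the proposal


def looks_binary(data: bytes, allowed: frozenset = frozenset()) -> bool:
    """Return True if the leading bytes contain a suspect control byte.

    `allowed` is a hypothetical knob for this sketch: control bytes that
    should be tolerated, e.g. {8, 27} for backspace and escape.
    """
    banned = SUSPECT - allowed
    return any(b in banned for b in data[:SAMPLE_SIZE])


# nroff-style overstrike ("NAME" in bold), as found in formatted man pages
man_page = b"N\x08NA\x08AM\x08ME\x08E\n"
# text with ANSI escape sequences
ansi_text = b"\x1b[1mbold\x1b[0m\n"

print(looks_binary(man_page))                       # flagged by the strict test
print(looks_binary(man_page, frozenset({8, 27})))   # accepted once 8 and 27 are allowed
print(looks_binary(ansi_text, frozenset({8, 27})))
```

With the strict set, both samples are classified as binary; tolerating 8 and 27 lets them through while a NUL or other low control byte is still caught.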

> One of the benefits is that this will correctly tag files in Unicode as
> text as well, since those control characters never appear in Unicode
> either.

I guess you mean UTF-8, since UTF-16/32 would most likely have
a few zero bytes.
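
A quick way to see the difference is to encode the same line three ways and look for the suspect bytes. This is a rough Python sketch; suspect_bytes and the sample string are invented for the illustration.

```python
# Control bytes 0-8 and 14-31, the set used by the proposed test.
SUSPECT = frozenset(range(0, 9)) | frozenset(range(14, 32))


def suspect_bytes(data: bytes) -> set:
    """Return the suspect control bytes found in the first 511 bytes."""
    return {b for b in data[:511] if b in SUSPECT}


text = "int größe = 1;\n"  # arbitrary sample line with non-ASCII letters

utf8 = text.encode("utf-8")       # non-ASCII characters become bytes >= 0x80
utf16 = text.encode("utf-16-le")  # every ASCII character gains a 0x00 high byte
utf32 = text.encode("utf-32-le")  # every character gains 0x00 padding bytes

print(suspect_bytes(utf8))
print(suspect_bytes(utf16))
print(suspect_bytes(utf32))
```

In UTF-8 every byte of a multi-byte sequence is 0x80 or above, so the test passes; in UTF-16 and UTF-32 the padding byte 0 appears for any ASCII-range character, so such files would still be tagged as binary.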

Jason.




