bug-gnu-emacs

bug#13949: 24.4.1; `fill-paragraph' should not always put the buffer as


From: Óscar Fuentes
Subject: bug#13949: 24.4.1; `fill-paragraph' should not always put the buffer as modified
Date: Sun, 27 Mar 2016 23:05:42 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.92 (gnu/linux)

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 03/27/2016 06:53 PM, Lars Magne Ingebrigtsen wrote:
>
>> I look forward to seeing you visiting the git mailing list and starting
>> to agitate for not using sha-1 hashes as object identifiers in git,
>> because it might obviously lose data if you happen to get collisions.
>
> a) Why didn't they use md5, I wonder?

AFAIK, at first they intended to use the hash as a method for guarding
against malicious tampering with the VC contents, and MD5 was already
broken as a crypto hash algorithm. (Finding a collision by chance is
entirely different from *fabricating* one; MD5 is broken for the latter,
but still reliable against the former.)

I guess that the extra bits of entropy (160 vs 128) were a "fuzzy-warm"
factor too in choosing SHA-1 over MD5. Git must avoid collisions
among potentially hundreds of millions of objects (repos of that size
already exist or will exist in the near future.) Each and every hash
must be different from all the others, so Git is exposed to the Birthday
Problem. Anyway, 128-bit hashes would still be good enough for those
huge repos. fill-paragraph needs to discriminate only between 2 chunks
of data.
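
To put rough numbers on the Birthday Problem argument, here is a minimal
sketch using the standard approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1))
for n objects and a b-bit hash. It assumes the hash output is uniformly
distributed (i.e. nobody is fabricating collisions), and the function
name and the object count of 500 million are only illustrative.

```python
import math


def collision_probability(num_objects: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_objects share a hash.

    Uses the birthday-bound approximation 1 - exp(-pairs / space) and
    assumes hash values are uniformly distributed.
    """
    pairs = num_objects * (num_objects - 1) // 2  # number of distinct pairs
    space = 2 ** hash_bits                        # number of possible hashes
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    # A huge repository: every object must differ from every other one.
    print("500M objects, 128 bits:", collision_probability(500_000_000, 128))
    print("500M objects, 160 bits:", collision_probability(500_000_000, 160))
    # fill-paragraph: only two chunks (before and after) are compared.
    print("2 chunks,     128 bits:", collision_probability(2, 128))
```

With only two chunks there is a single pair to compare, so the estimate
collapses to 1 / 2^128; the repository case has to cover every pair of
objects, which is why it is the more demanding of the two.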

> b) Git has a global object index. It _can_ detect collisions, or at
> least that detection can be implemented.

And what to do when a collision is detected?

Back to the topic, your suggestion about comparing the pre- and post-
contents of the paragraph (and avoiding huge copies of the pre- contents
by restricting the copied area to the paragraph itself) does not work
when the file contains just one paragraph. Try visiting a big CSV dump
or log and pressing M-q. You can abort the operation with C-g, but if
Emacs starts to swap like crazy, or exceeds the process memory limit and
is killed... We can be confident that this would happen multiple times
out there, unlike getting the same MD5 for differing pre- and post-
results of fill-paragraph.
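
For illustration, the digest-based check could look roughly like the
sketch below. It is written in Python rather than Emacs Lisp, uses
textwrap.fill as a stand-in for fill-paragraph, and the function name is
made up; it is not the code under discussion. The point is only that the
"before" state is kept as a 16-byte digest instead of a second copy of
the paragraph.

```python
import hashlib
import textwrap
from typing import Tuple


def fill_and_report_change(paragraph: str, width: int = 70) -> Tuple[str, bool]:
    """Refill a paragraph and report whether its text actually changed.

    Only the MD5 digest of the original text is retained for the
    comparison, not a copy of the text itself.
    """
    before = hashlib.md5(paragraph.encode("utf-8")).digest()
    filled = textwrap.fill(paragraph, width=width)  # stand-in for fill-paragraph
    after = hashlib.md5(filled.encode("utf-8")).digest()
    # Differing digests mean the text changed; equal digests are treated
    # as "unchanged", accepting the negligible chance of a collision.
    return filled, before != after


if __name__ == "__main__":
    _, changed = fill_and_report_change("already short enough")
    print("modified:", changed)
    _, changed = fill_and_report_change("word " * 40)
    print("modified:", changed)
```

Hashing still has to read the whole paragraph once before and once after
filling, so it costs time on a huge one-paragraph file, but the extra
memory it needs does not grow with the size of the paragraph.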




