Re: sdiff Enhancement
From: Bruno Haible
Subject: Re: sdiff Enhancement
Date: Wed, 28 Nov 2007 16:32:09 +0100
User-agent: KMail/1.9.1
Hi,
Grant Stevens wrote:
> $ diff -W 43 -Y file1 file2   # (-Y = -y option, enhanced for tighter matching)
> line1                 line1
>                     > line2
> close but not exact | close but inexact
>                     > line3
>                     > line4
> another not exact   | another inexact
> last line             last line
Two remarks:
1) I think your "tighter matching" mode would also be useful for unified
diffs (diff -u). The problem with unified diffs is that when a 30-line
chunk is reindented or otherwise slightly changed, the resulting output
is no longer understandable. Your patch has the potential to fix this.
2) I don't know how you detect similarity between
"close but not exact" and "close but inexact" (your functions
insert_matchup, find_best_match1, figure_score, matching_chars lack
comments describing what they do), but a good metric for this kind
of question is the Levenshtein distance between strings [1]. A closely
related similarity measure is implemented in the fstrcmp() function
in gnulib [2].
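
To make the metric concrete, here is a minimal sketch of the textbook
dynamic-programming Levenshtein distance, plus a score normalized to the
range 0..1. It is only an illustration: the names levenshtein and
similarity are made up for this mail, and the code is neither what your
patch does in figure_score nor how gnulib implements fstrcmp().

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b (textbook DP, two rows)."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for equal strings, 0.0 when every
    character position has to be edited."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    left = "close but not exact"
    for right in ("close but inexact", "line2"):
        print(f"{right!r}: {similarity(left, right):.2f}")
```

With a score like this, a side-by-side or unified diff could pair a
changed line with its best-scoring counterpart whenever the score exceeds
some threshold. The threshold would have to be tuned; I am not proposing
a particular value.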
Bruno
[1] http://en.wikipedia.org/wiki/Levenshtein_distance
[2] http://www.gnu.org/software/gnulib/MODULES.html#module=fstrcmp