bug#11216: 23.4; parenthesis matching breaks on certain complex expressi
From: Nathan Trapuzzano
Subject: bug#11216: 23.4; parenthesis matching breaks on certain complex expressions
Date: Tue, 10 Apr 2012 20:18:47 -0400
Here's a complex regular expression that breaks parenthesis matching
(and yes, that's a real regular expression generated by a real Perl
program).
M[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*H[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*\=[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c])
(?<!S\d)(?<!\-\ [@"]\d\
[\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])(?:A\)[\/\\]|\*\)[\/\\]A[\=\/\\]?)[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*D[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c])
(?<!S\d)(?<!\-\ [@"]\d\
[\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])Q[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*A[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[\/\\]
Matching goes wrong at the open parenthesis immediately following
the first (?<!S\d). I suspect the cause is the opening double-quote
about 10 characters later, which is presumably treated as the start of
a string.
I noticed that the behavior of show-paren-mode depends on the major
mode. The behavior described above occurs in Fundamental mode, whereas
in Text mode quotation marks are ignored. However, Text mode also makes
paren matching ignore backslashes, so escaped parentheses and brackets
are treated as real ones. I think the best fix would be to make
show-paren-mode customizable so that the user can specify which
characters should be ignored when matching parentheses.
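
The mode-dependent behavior described above can be illustrated with a
small Python sketch. It is not how show-paren-mode is implemented; it is
a toy bracket matcher under the assumption that the difference between
the two modes comes down to which characters count as string quotes and
which as escapes. The function name find_matching_close, its parameters
string_quotes and escape_chars, and the shortened FRAGMENT are all
hypothetical and chosen only for illustration.

```python
def find_matching_close(text, open_pos, string_quotes='"', escape_chars="\\"):
    """Return the index of the bracket closing text[open_pos], or None.

    string_quotes: characters that open and close a string; brackets
                   inside a string are not matched.
    escape_chars:  characters that make the following character literal.
    """
    pairs = {"(": ")", "[": "]"}
    stack = []          # closing brackets we still expect, innermost last
    in_string = None    # the quote character of the string we are inside
    i = open_pos
    while i < len(text):
        ch = text[i]
        if ch in escape_chars:
            i += 2      # skip the escape and the character it protects
            continue
        if in_string is not None:
            if ch == in_string:
                in_string = None
        elif ch in string_quotes:
            in_string = ch
        elif ch in pairs:
            stack.append(pairs[ch])
        elif ch in pairs.values():
            if not stack or stack.pop() != ch:
                return None         # mismatched bracket
            if not stack:
                return i            # the opening bracket is now closed
        i += 1
    return None                     # ran out of text: unbalanced


# Shortened, hypothetical fragment modeled on the look-behind in the report.
FRAGMENT = r'(?<!\-\ [@"]\d)'

# Quotes open strings, backslash escapes: the quote hides the closing paren.
quotes_and_escapes = find_matching_close(FRAGMENT, 0)

# Neither quotes nor backslashes are special: this fragment matches, but an
# escaped paren is wrongly taken as the closer.
nothing_special = find_matching_close(FRAGMENT, 0, string_quotes="", escape_chars="")
escaped_paren = find_matching_close(r"(A\)B)", 0, string_quotes="", escape_chars="")

# The requested customization: ignore quotes but keep backslash escapes.
customized = find_matching_close(FRAGMENT, 0, string_quotes="", escape_chars="\\")
customized_escaped = find_matching_close(r"(A\)B)", 0, string_quotes="", escape_chars="\\")
```

With both sets left at their defaults the matcher fails on the fragment,
which mirrors the Fundamental mode symptom. Emptying both sets mirrors
the Text mode trade-off. Ignoring quotes while keeping the backslash as
an escape gives the result the report asks for in both test strings.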
I've also attached a file containing the regexp in question in case the long
line gets broken up in mail transmission.
regexp.txt
Description: Text document