[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised
From: |
Maxim Nikulin |
Subject: |
Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)] |
Date: |
Mon, 15 Mar 2021 18:54:18 +0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 |
On 13/03/2021 12:24, Ihor Radchenko wrote:
Here is the right one
I think, some unit tests would be great to avoid future regressions.
Since it is heuristics, some cases will be always broken, so tests could
clarify intentions.
I have not tested the regexp on real cases, so you have full rights to
be skeptical concerning my comments. I prefer to have explicit URLs in
my files.
+ (let ((non-space-bracket "[^][ \t\n()<>]+"))
I think, "+" is redundant here, it does not allow empty inner
parenthesis and leads to nested construction "(x+)+" that could be
simplified to "x+"
+ ;; Heiristics for an URL link. Source:
Typo: Heuristics
+ ;;
https://daringfireball.net/2010/07/improved_regex_for_matching_urls
+ (rx-to-string
+ `(seq (regexp "\\<")
Isn't it an equivalent to "word-start"?
+ (regexp ,types-re)
+ ":"
+ (1+ (or (regex ,non-space-bracket)
+ (seq "("
+ (* (or (regex ,non-space-bracket)
+ (seq "("
+ (regex ,non-space-bracket)
Maybe "0+" to allow "(())"?
+ ")")))
+ ")")))
+ (or (seq "("
+ (* (or (regex ,non-space-bracket)
+ (seq "("
+ (regex ,non-space-bracket)
+ ")")))
+ ")")
+ (regexp "\\([^[:punct:] \t\n]\\|/\\)")))))
Is the group useful for any purpose? The construction is already inside
an "or".
I have tried to compare current, your, and a little modified your
regexps on several synthetic examples:
#+begin_src elisp
(let*
((types-re (regexp-opt (org-link-types) t))
(org-link-plain-re-ir
(let ((non-space-bracket "[^][ \t\n()<>]+"))
;; Heiristics for an URL link. Source:
;; https://daringfireball.net/2010/07/improved_regex_for_matching_urls
(rx-to-string
`(seq (regexp "\\<")
(regexp ,types-re)
":"
(1+ (or (regex ,non-space-bracket)
(seq "("
(* (or (regex ,non-space-bracket)
(seq "("
(regex ,non-space-bracket)
")")))
")")))
(or (seq "("
(* (or (regex ,non-space-bracket)
(seq "("
(regex ,non-space-bracket)
")")))
")")
(regexp "\\([^[:punct:] \t\n]\\|/\\)"))))))
(org-link-plain-re-nm
(let* ((non-space-bracket "[^][ \t\n()<>]")
(parenthesis
`(seq "("
(0+ (or (regex ,non-space-bracket)
(seq "("
(0+ (regex ,non-space-bracket))
")")))
")")))
;; Heuristics for an URL link inspired by
;; https://daringfireball.net/2010/07/improved_regex_for_matching_urls
(rx-to-string
`(seq word-start
(regexp ,types-re)
":"
(1+ (or (regex ,non-space-bracket)
,parenthesis))
(or (regexp "[^[:punct:] \t\n]")
?/
,parenthesis))))))
(cons
(list "src" "orig" "ir" "nm")
(mapcar (lambda (s)
(cons
s
(mapcar (lambda (r)
(if (string-match r s) (match-string 0 s) ""))
(list org-link-plain-re org-link-plain-re-ir
org-link-plain-re-nm))))
(list
"The file:a link"
"The file:aa link"
"The file:a(b)c link"
"The file:a() link"
"The file:aa((a)) link"
"The file:aa(()) link"
"The file:///a link"
"The file:///a/, link"
"The http:// link"
"The (some file:ab) link"
"The file:aa) link"
"The file:aa( link"
))))
#+end_src
#+RESULTS:
| src | orig | ir | nm |
| The file:a link | | | |
| The file:aa link | file:aa | file:aa | file:aa |
| The file:a(b)c link | file:a(b) | file:a(b)c | file:a(b)c |
| The file:a() link | | file:a() | file:a() |
| The file:aa((a)) link | file:aa | file:aa((a)) | file:aa((a)) |
| The file:aa(()) link | file:aa | file:aa | file:aa(()) |
| The file:/a link | file:/a | file:/a | file:/a |
| The file:/a/, link | file:/a/ | file:/a/ | file:/a/ |
| The http:// link | http:// | http:// | http:// |
| The (some file:ab) link | file:ab | file:ab | file:ab |
| The file:aa) link | file:aa | file:aa | file:aa |
| The file:aa( link | file:aa | file:aa | file:aa |
- Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/12
- Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Kyle Meyer, 2021/03/12
- [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/13
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/13
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)],
Maxim Nikulin <=
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/16
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Maxim Nikulin, 2021/03/17
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/19
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Maxim Nikulin, 2021/03/24
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/24
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Maxim Nikulin, 2021/03/19
- Re: [PATCH] Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Ihor Radchenko, 2021/03/19
Re: Bug: Plain https links with brackets are not recognised [9.4.4 (release_9.4.4-625-g763c7a @ /home/yantar92/.emacs.d/straight/build/org/)], Nicolas Goaziou, 2021/03/24