bug-gnu-emacs
bug#65496: 30.0.50; Issue with the regexp used to auto-detect PBM image


From: David Ponce
Subject: bug#65496: 30.0.50; Issue with the regexp used to auto-detect PBM image data
Date: Mon, 4 Sep 2023 18:32:22 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.0

On 24/08/2023 12:55, David Ponce wrote:
Hello,

While experimenting with code to create an image from data, I encountered
an issue with the regexp in `image-type-header-regexps' used to
auto-detect the PBM image type from the first bytes of image data. That is:

"\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}"

Here is a simple recipe to illustrate the issue:

In *scratch* buffer eval:
-------------------------
;; Get content of a pbm file.
(setq test-data
       (with-current-buffer
           (find-file-noselect "[YourEmacsPath]/etc/images/splash.pbm")
         (prog1 (buffer-substring-no-properties (point-min) (point-max))
           (kill-buffer (current-buffer)))))

;; Checking the string data fails to detect the pbm image type!
(image-type-from-data test-data)
nil
;; With a temp buffer current, the same test works!
(with-temp-buffer
  (image-type-from-data test-data))
pbm
-------------------------

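After further digging, I found that the problem might be due to the use
of the [:space:] character class, whose meaning, according to the manual,
depends on the syntax table in effect in the current buffer. In *scratch*
(Lisp Interaction mode), for example, newline has comment-end syntax
rather than whitespace syntax, so [[:space:]] does not match it.
So, listing the whitespace characters explicitly in place of the
character class seems to solve the issue: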

(setcar (nth 1 image-type-header-regexps)
         "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[ \t\r\n]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}")

(image-type-from-data test-data)
pbm

I have attached a proposed patch.
I hope it helps.
Regards

Some additions.

Basic string matching recipe:

In *scratch* buffer eval:
-------------------------

(let ((re "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}")
      (text "P4
333 233"))
  (string-match-p re text))
nil

(with-syntax-table (standard-syntax-table)
  (let ((re "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}")
        (text "P4
333 233"))
    (string-match-p re text)))
0

I wonder if it is expected that matching a regular expression against a
string object depends on the syntax table set up in the current buffer?
Shouldn't (standard-syntax-table) be implied when matching a regexp
against a string object, that is, regardless of any buffer context?
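
In the meantime, one possible workaround is to force the standard syntax
table around the match, as in the second recipe above. The helper below
is only a sketch; its name, `my-string-match-standard-p', is hypothetical
and not part of Emacs:

```elisp
;; Hypothetical helper (not part of Emacs): match REGEXP against STRING
;; with the standard syntax table in effect, so that syntax-dependent
;; character classes such as [:space:] do not depend on the current buffer.
(defun my-string-match-standard-p (regexp string)
  "Return the index of the first match of REGEXP in STRING, or nil.
The match is done under `standard-syntax-table'; match data is untouched."
  (with-syntax-table (standard-syntax-table)
    (string-match-p regexp string)))

;; Same regexp and text as in the recipe above; per that recipe, matching
;; under the standard syntax table returns 0.
(my-string-match-standard-p
 "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}"
 "P4
333 233")
```

This only sidesteps the problem in calling code; it does not change how
`image-type-from-data' itself uses the regexp.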

Regards




