bug#64939: 30.0.50; The default auto-mode-interpreter-regexp does not ma
From: Kévin Le Gouguec
Subject: bug#64939: 30.0.50; The default auto-mode-interpreter-regexp does not match env with flags
Date: Sat, 10 Feb 2024 11:23:01 +0100
User-agent: Gnus/5.13 (Gnus v5.13)
Thanks for the CC, this report had completely slipped past my notice
when I worked on bug#66902, and so did Malcolm's follow-ups.
Boldly adding Wilhelm as well, since I am not 100% sure Debbugs sends a
copy of every message in a report to their OP.
Comments below.
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Malcolm Cook <malcolm.cook@gmail.com>
>> Date: Thu, 1 Feb 2024 12:52:39 -0600
>>
>> Regarding [1] allowing emacs to recognize shebang lines containing
>> calls to /bin/env with options (such as -S as allowed in new core
>> utils [2])...
>>
>> I prefer allowing the proposed "shy" regexp to match zero or more
>> times (using a '*' instead of '?').
>>
>> To wit, I have this now in my init.el:
>>
>> (setq auto-mode-interpreter-regexp
>>       ;; Support shbang line calling `/bin/env` with `-S` (and/or
>>       ;; other options).
>>       ;; c.f. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>>       (purecopy "#![ \t]?\\([^ \t\n]*\
>> /bin/env[ \t]\\)?\\(?:-\\{1,2\\}[a-zA-Z1-9=]+[ \t]+\\)*\\([^ \t\n]+\\)"))
IIUC this would be a more lax variant of what we installed for
bug#66902, can you confirm Malcolm? This is what the current regexp
looks like on the master branch:
    (purecopy
     (concat
      "#![ \t]*"
      ;; Optional group 1: env(1) invocation.
      "\\("
      "[^ \t\n]*/bin/env[ \t]*"
      "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
      "\\)?"
      ;; Group 2: interpreter.
      "\\([^ \t\n]+\\)"))
And the corresponding test cases:
    (ert-deftest files-tests-auto-mode-interpreter ()
      "Test that `set-auto-mode' deduces correct modes from shebangs."
      (files-tests--check-shebang "#!/bin/bash" 'sh-mode)
      (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
      (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
      (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
      (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
      (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
      (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))
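For anyone who wants to poke at the master regexp outside of the test suite, here is a minimal sketch. The names `my/master-interpreter-regexp` and `my/shebang-interpreter` are made up for illustration and are not part of files.el; the helper only mimics the matching step (take group 2, strip the directory), not the rest of what `set-auto-mode' does.

```elisp
;; Copy of the master regexp quoted above, kept under a hypothetical
;; name so that it can be evaluated without touching the real variable.
(defconst my/master-interpreter-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)")
  "Experimental copy of the interpreter regexp (illustration only).")

;; Hypothetical helper: return the bare interpreter name from a
;; shebang line, or nil when the line does not match at all.
(defun my/shebang-interpreter (line)
  "Return the interpreter named by shebang LINE, or nil."
  (when (string-match my/master-interpreter-regexp line)
    (file-name-nondirectory (match-string 2 line))))

(my/shebang-interpreter "#!/bin/bash")               ; "bash"
(my/shebang-interpreter "#!/usr/bin/env -S make -f") ; "make"
;; Anything other than a lone -S/--split-string is taken for the
;; interpreter, which is the limitation discussed in this report:
(my/shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux") ; "-vS"
```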
Is this Good Enough™ for your purposes (Malcolm, Wilhelm), or should we
refine the regexp further? FWIW, in no particular order:
(a) env(1) does seem to support mixing up arbitrary options with -S¹, so
in principle it would make sense to support that;
(b) Eli did not seem too fond of the regexp hammer², so I don't know
which direction we'd want to go between maximally correct (accept
all arguments, _as long as_ -S|--split-string is in there) or good
enough (just skip over --everything --that --looks --like -a
--switch).
(c) FWIW the "maximally correct" regexp might not be _that_ ugly, since
"-[v]S[OPTION]" must be the *first* token after env; in other words
no need to support --some-option --split-string --more-options.
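To make (c) concrete, here is a rough sketch of what such a regexp could look like. It is an illustration only, not a proposed patch: `my/lax-interpreter-regexp` and `my/lax-shebang-interpreter` are hypothetical names, and the sketch assumes that every option after -S is a single token starting with "-". An option that takes a separate argument (say "-u FOOBAR") or a VAR=value assignment would still be mistaken for the interpreter.

```elisp
;; Hypothetical variant: accept -S, -vS or --split-string as the first
;; token after env, then skip any further tokens that look like switches.
(defconst my/lax-interpreter-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:"
   ;; The split-string switch, optionally combined with -v.
   "\\(?:-v?S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)"
   ;; Zero or more single-token switches, e.g. -uFOOBAR.
   "\\(?:-[^ \t\n]+[ \t]+\\)*"
   "\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)")
  "Sketch of a regexp skipping env switches after -S (illustration only).")

;; Hypothetical helper, only here to try the regexp on strings.
(defun my/lax-shebang-interpreter (line)
  "Return the interpreter named by shebang LINE, or nil."
  (when (string-match my/lax-interpreter-regexp line)
    (file-name-nondirectory (match-string 2 line))))

(my/lax-shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux") ; "bash"
(my/lax-shebang-interpreter "#!/usr/bin/env -S make -f")             ; "make"
(my/lax-shebang-interpreter "#!/usr/bin/make -f")                    ; "make"
```

Without -S or --split-string in first position, the switch-skipping part never kicks in, so plain "env bash" shebangs behave as before.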
>> [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>> [2]
>> https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#env-invocation
>>
>> YMMV?
>
> Kevin, any comments about the proposals in this bug report?
Comments above; footnotes below. Again, thanks for the heads up.
¹ $ cat demo.sh
#!/usr/bin/env -vS -uFOOBAR bash -eux
echo hi
echo $FOOBAR
echo bye
$ FOOBAR=totally-set ./demo.sh
split -S: ‘ -uFOOBAR bash -eux’
into: ‘-uFOOBAR’
& ‘bash’
& ‘-eux’
unset: FOOBAR
executing: bash
arg[0]= ‘bash’
arg[1]= ‘-eux’
arg[2]= ‘./demo.sh’
+ echo hi
hi
./demo.sh: line 4: FOOBAR: unbound variable
² https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939#14