On 9/27/19 7:52 PM, Geoff Kuenning wrote:
Version:
GNU bash, version 4.4.23(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
Behavior:
If a pathname contains nonprinting characters, and is expanded from a
variable name, wildcard expansion can sometimes fail.

This is an interesting report. The $'\361' is a Unicode combining
character, which ends up making the entire sequence of characters an
invalid wide character string in a bunch of different locales.

Some file systems (Mac OS X APFS) don't allow you to create files with
invalid characters or character sequences in their names. Others (Linux)
don't have a problem with it.

The code to dequote filenames that's needed for "$x" tries to fall back to
single-byte character operations in the presence of invalid character or
byte sequences, but that means you can't use any of the standard wide
character functions to check for valid and invalid wide character strings.

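As a rough illustration of that fallback (a sketch only, not the code in
bash's glob library), the function below converts a pathname with
mbrtowc() and copies through, one byte at a time, anything mbrtowc()
rejects. The name mbs_to_wcs_lenient and its parameters are made up for
this example. Which bytes count as invalid depends on the current locale
and on the C library.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of bash.  Converts src to wide characters
   in dst (at most dstlen - 1 of them, plus a terminator).  Any byte that
   mbrtowc() reports as invalid or incomplete is stored as its own wide
   character, and *nfallback counts how often that happened. */
static size_t
mbs_to_wcs_lenient(const char *src, wchar_t *dst, size_t dstlen,
                   size_t *nfallback)
{
    mbstate_t st;
    size_t n, i = 0, out = 0;

    assert(src != NULL && dst != NULL && nfallback != NULL && dstlen > 0);
    n = strlen(src);
    memset(&st, 0, sizeof st);
    *nfallback = 0;

    while (i < n && out + 1 < dstlen) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, src + i, n - i, &st);

        if (r == (size_t)-1 || r == (size_t)-2) {
            /* Invalid or truncated sequence: take one raw byte and
               reset the conversion state before continuing. */
            wc = (wchar_t)(unsigned char)src[i];
            r = 1;
            memset(&st, 0, sizeof st);
            (*nfallback)++;
        } else if (r == 0) {
            r = 1;      /* defensive: never stall on a null character */
        }
        dst[out++] = wc;
        i += r;
    }
    dst[out] = L'\0';
    return out;
}
```

Once a raw byte has been stored this way, the result is no longer
guaranteed to be a valid wide character string, so the standard wide
character functions can't be trusted to tell the valid parts from the
invalid ones.
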
The change between bash-4.4 and bash-5.0 is that the globbing code doesn't
bother to try and convert to wide characters to do the dequoting if there
aren't any valid multibyte characters in the pathname, but uses the single
byte character code instead. That works for this case, but doesn't work for
pathnames that have both valid and invalid wide character sequences.

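To make the distinction concrete, here is a small sketch of that kind of
decision. It is not bash's code: the names (utf8_seq_len, scan_pathname,
choose_path) are invented for the example, and it assumes a UTF-8 locale
and checks the bytes by hand so the result doesn't depend on the C
library. The validity check is simplified and doesn't reject overlong
forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dequote_path { DEQUOTE_SINGLE_BYTE, DEQUOTE_WIDE };

struct mb_scan {
    size_t valid_multibyte;   /* well-formed sequences of 2+ bytes */
    size_t invalid_bytes;     /* bytes that start no well-formed sequence */
};

/* Length of the UTF-8 sequence starting at s (n bytes remain), or 0 if
   the bytes don't form one.  Simplified: lead byte and continuation
   bytes only. */
static size_t
utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;
    else if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;
    if (len > n)
        return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;
    return len;
}

/* Count valid multibyte characters and invalid bytes in a pathname. */
static struct mb_scan
scan_pathname(const char *path)
{
    struct mb_scan r = { 0, 0 };
    const unsigned char *s = (const unsigned char *)path;
    size_t n, i = 0;

    assert(path != NULL);
    n = strlen(path);
    while (i < n) {
        size_t len = utf8_seq_len(s + i, n - i);

        if (len == 0) {
            r.invalid_bytes++;
            i++;                /* skip one byte and resynchronize */
        } else {
            if (len > 1)
                r.valid_multibyte++;
            i += len;
        }
    }
    return r;
}

/* Mirrors the rule described above: no valid multibyte characters
   means the single-byte code is used. */
static enum dequote_path
choose_path(const char *path)
{
    return scan_pathname(path).valid_multibyte == 0
        ? DEQUOTE_SINGLE_BYTE : DEQUOTE_WIDE;
}
```

With this rule, "foo\361bar" has no valid multibyte characters and takes
the single-byte path, while "caf\303\251/\361x" has one valid multibyte
character and one invalid byte, so it still goes down the wide character
path. That mixed case is the one the bash-5.0 change doesn't cover.
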
A better fix is to write a symmetric function that will take the output of
xdupmbstowcs2 (bash's replacement for mbstowcs that handles zero-length
wide character strings that aren't null wide characters) and handle the
invalid wide character strings that may result from it. I'll make that fix
for the next release.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu
http://tiswww.cwru.edu/~chet/