bug-bash

Re: IFS delimiter field separation issues


From: Lawrence Velázquez
Subject: Re: IFS delimiter field separation issues
Date: Wed, 08 Jan 2025 18:04:19 -0500

On Wed, Jan 8, 2025, at 1:25 PM, Jeff Ketchum wrote:
> I ran into a strange bug using newer versions of bash, I haven't isolated
> it to a specific release.

It looks like 5.0 introduced the problem.

> In using unicode group separator character U+241D,
> https://www.compart.com/en/unicode/U+241D, 0x241D
> I set the IFS to this unicode, and have U+241E and U+241F characters in the
> data.
> When assigning to an array, and using for var in "${array[@]}"...
> it ends up splitting the data at unexpected locations.
>
> I don't get this behaviour when the array isn't quoted
>
> [...]
>
> I wrote a script that will easily reproduce this:

Here's a version that I think is more legible:

        $ cat /tmp/foo.bash
        LC_ALL=en_US.UTF-8

        gs=$'\u241D'
        rs=$'\u241E'
        us=$'\u241F'

        data="a${gs}b${rs}c${us}d"

        IFS=$gs

        # Original variable
        printf '"$data" - %q\n' "$data"
        printf ' $data  - %q\n' $data
        echo

        # Positional parameters
        set -- $data
        printf '"$@" - %q\n' "$@"
        printf ' $@  - %q\n' $@
        echo

        # Multi-element array
        arr1=($data)
        declare -p arr1
        printf '"${arr1[@]}" - %q\n' "${arr1[@]}"
        printf ' ${arr1[@]}  - %q\n' ${arr1[@]}
        echo

        # Single-element array
        arr2=("$data")
        declare -p arr2
        printf '"${arr2[@]}" - %q\n' "${arr2[@]}"
        printf ' ${arr2[@]}  - %q\n' ${arr2[@]}

        $ ~/build/bash-5.3-testing/bash /tmp/foo.bash
        "$data" - a␝b␞c␟d
         $data  - a
         $data  - b␞c␟d

        "$@" - a
        "$@" - b␞c␟d
         $@  - a
         $@  - b␞c␟d

        declare -a arr1=([0]="a" [1]="b␞c␟d")
        "${arr1[@]}" - a
        "${arr1[@]}" - $'b\342'
        "${arr1[@]}" - $'\236c\342'
        "${arr1[@]}" - $'\237d'
         ${arr1[@]}  - a
         ${arr1[@]}  - b␞c␟d

        declare -a arr2=([0]="a␝b␞c␟d")
        "${arr2[@]}" - $'a\342'
        "${arr2[@]}" - ''
        "${arr2[@]}" - $'b\342'
        "${arr2[@]}" - $'\236c\342'
        "${arr2[@]}" - $'\237d'
         ${arr2[@]}  - a
         ${arr2[@]}  - b␞c␟d

It's interesting that "$@" works fine, while "${arr[@]}" doesn't.
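
For reference, the stray \342, \236 and \237 fragments in the output are
pieces of the separators' UTF-8 encodings, and the middle byte \220 never
shows up in any field. That suggests (I have not confirmed it in the
source) that the quoted array expansion is splitting on individual bytes
of IFS rather than on whole characters. Here is a small sketch that dumps
the encodings; the helper name octal_bytes is my own, and the bytes are
written out explicitly so that nothing depends on the locale:

```shell
# UTF-8 encodings spelled out in octal, independent of the current locale.
gs=$(printf '\342\220\235')   # U+241D
rs=$(printf '\342\220\236')   # U+241E
us=$(printf '\342\220\237')   # U+241F

# Print the bytes of "$1" as space-separated octal values.
# The body is a subshell, so the IFS and "set --" changes stay local.
octal_bytes() (
    unset IFS   # default splitting on space, tab, and newline
    # Unquoted on purpose: word splitting strips od's padding.
    set -- $(printf '%s' "$1" | od -An -t o1)
    printf '%s\n' "$*"
)

printf 'gs = %s\n' "$(octal_bytes "$gs")"
printf 'rs = %s\n' "$(octal_bytes "$rs")"
printf 'us = %s\n' "$(octal_bytes "$us")"
```

This prints "gs = 342 220 235", "rs = 342 220 236" and "us = 342 220 237".
Compare that with the fields above: $'b\342' ends with the first byte of
U+241E and $'\236c\342' begins with its last byte.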

-- 
vq


