bug-sed

bug#42857: sed: handling utf8 non-breaking space 0xA0


From: Dennis Nezic
Subject: bug#42857: sed: handling utf8 non-breaking space 0xA0
Date: Thu, 13 Aug 2020 22:22:28 -0400

I'm not sure if this is a bug. It has to do with the odd UTF-8(?)
character with hex code 0xA0.

Given this three-line sample text, sed is able to delete the second line
(the one that contains the offending character) properly:

  echo $'hello\nte\xA0st\nworld' | sed 2d


But it can't do a proper substitution/regex with it, for example:

  echo $'hello\nte\xA0st\nworld' | sed '2s,^t.*,x,'

Here it seems to interpret 0xA0 as the end of the line.

In this case it seems to interpret it as the beginning of the line:

  echo $'hello\nte\xA0st\nworld' | sed '2s,.*st$,x,'
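
For comparison, here is a minimal sketch of two variations on the commands above. It rests on an assumption that the report itself does not confirm: that the commands were run in a UTF-8 locale, where a lone 0xA0 byte is not a valid character (the non-breaking space is encoded in UTF-8 as the two bytes 0xC2 0xA0), so "." may simply refuse to match it. The helper name "sample" is made up for illustration, and the bytes are written in octal (\240, \302\240) only so that any POSIX printf can emit them.

```shell
# Hypothetical helper: the same three-line sample, with the lone 0xA0 byte
# written in octal (\240).
sample() { printf 'hello\nte\240st\nworld\n'; }

# In the C locale every byte counts as one character, so '.' matches 0xA0
# and the whole second line is replaced.
sample | LC_ALL=C sed '2s,^t.*,x,'

# The non-breaking space as valid UTF-8 is 0xC2 0xA0 (\302\240). Under a
# UTF-8 locale this input should not trip up '.' the way the lone byte does.
printf 'hello\nte\302\240st\nworld\n' | sed '2s,^t.*,x,'
```

If the first command prints "x" on the second line while the original command does not, that would suggest the difference comes from how the current locale decodes the byte rather than from line handling in sed.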




