From: John Cowan
Subject: Re: sed error message reports byte position instead of char position when program contains UTF-8
Date: Thu, 16 May 2013 11:41:00 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

Eli Zaretskii scripsit:
> AFAIK, Sed uses bytes, not characters.
Definitely not. Look at the following:
$ echo $LANG
en_US.UTF-8
$ cat >foo
föö
(Ctrl-D)
$ wc -c foo
6 foo
(including the newline; therefore the file is UTF-8)
$ sed -n '/^...$/p' <foo
föö
$ sed -n '/^.....$/p' <foo
$
So each '.' in the regex matches one character, not one byte: the line is 3 characters long but 5 bytes long (excluding the newline).
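For contrast, here is a rough sketch of the same test with the locale forced to C, where I would expect each byte to count as one character. The temporary file and the variable names (tmp, bytes, five_c, three_c) are my own placeholders, and the file is written with printf octal escapes so the result does not depend on how the terminal encodes ö.

```shell
# Recreate the file from the transcript as explicit UTF-8 bytes: f, ö, ö, newline.
tmp=$(mktemp)
printf 'f\303\266\303\266\n' > "$tmp"

# Byte count including the newline (strip the padding some wc versions add).
bytes=$(wc -c < "$tmp" | tr -d ' ')

# Assumption: in the C locale sed treats every byte as one character,
# so five dots should match the line and three dots should not.
five_c=$(LC_ALL=C sed -n '/^.....$/p' < "$tmp")
three_c=$(LC_ALL=C sed -n '/^...$/p' < "$tmp")

rm -f "$tmp"
echo "bytes=$bytes five_dots_in_C=${five_c:+match} three_dots_in_C=${three_c:-no match}"
```

If that holds, the match on three dots under en_US.UTF-8 comes from the locale's character handling rather than from anything specific to this file.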
--
We call nothing profound
that is not wittily expressed. John Cowan
--Northrop Frye (improved)