Re: Gawk and non-ASCII characters
From: Eli Zaretskii
Subject: Re: Gawk and non-ASCII characters
Date: Sat, 16 Oct 2010 15:01:00 +0200

> Date: Sat, 16 Oct 2010 08:22:56 -0400
> From: Charles Kozierok <address@hidden>
>
> I am grabbing HTML code from a site that has some non-ASCII codes in
> it. Specifically, the code is "C2 A0". This shows up in ANSI as a
> capital "A" with a circumflex on top followed by a space. In ASCII it
> becomes a regular "A" followed by a space.
>
> I need to be able to properly identify these so I can get rid of them,

What exactly do you mean by "these"? Do you mean the two-byte sequence
"C2 A0" as a whole, or do you want to identify each of the two bytes
individually?

> but I can't figure out how to do it. The character doesn't seem to
> match any character codes within gawk, and I can't find any command
> line or option settings to either filter them out or have them be
> dealt with properly.

What is your locale? (If this is on GNU/Linux, the `locale' command
will show that.)
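
For reference, the bytes in question can be looked at outside of gawk.
The sketch below is Python, not gawk, and it assumes the page is
encoded in UTF-8, which has not been confirmed in this thread. Under
that assumption "C2 A0" is a single character, U+00A0 NO-BREAK SPACE,
and reading the same two bytes as Windows-1252 ("ANSI") gives the
capital A with circumflex followed by a no-break space that was
described. The helper name strip_nbsp is made up for illustration.

```python
# The two bytes from the original question.
raw = bytes([0xC2, 0xA0])

# Read as UTF-8 (an assumption about the page), the two bytes form
# one character: U+00A0 NO-BREAK SPACE.
as_utf8 = raw.decode("utf-8")

# Read as Windows-1252 ("ANSI"), each byte is a separate character:
# 0xC2 is A-circumflex (U+00C2) and 0xA0 is a no-break space (U+00A0).
as_cp1252 = raw.decode("cp1252")


def strip_nbsp(data: bytes) -> bytes:
    """Hypothetical helper: replace each UTF-8 no-break space with an ASCII space."""
    return data.replace(b"\xc2\xa0", b" ")


print(len(as_utf8), hex(ord(as_utf8)))
print([hex(ord(c)) for c in as_cp1252])
print(strip_nbsp(b"a\xc2\xa0b"))
```

Working on raw bytes, as strip_nbsp does, sidesteps the locale question
entirely. Whether a tool matches "C2 A0" as one character or as two
bytes depends on the locale it runs in, which is why the locale matters
here.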