Re: Gawk and non-ASCII characters

bug-gnu-utils

From:	Eli Zaretskii
Subject:	Re: Gawk and non-ASCII characters
Date:	Sat, 16 Oct 2010 15:01:00 +0200

> Date: Sat, 16 Oct 2010 08:22:56 -0400
> From: Charles Kozierok <address@hidden>
> 
> I am grabbing HTML code from a site that has some non-ASCII codes in
> it. Specifically, the code is "C2 A0". This shows up in ANSI as a
> capital "A" with a circumflex on top followed by a space. In ASCII it
> becomes a regular "A" followed by a space.
> 
> I need to be able to properly identify these so I can get rid of them,

What exactly do you mean by "these"?  Do you mean the two-byte sequence
"C2 A0" as a whole, or do you want to identify each of the two bytes
individually?

> but I can't figure out how to do it. The character doesn't seem to
> match any character codes within gawk, and I can't find any command
> line or option settings to either filter them out or have them be
> dealt with properly.

What is your locale?  (If this is on GNU/Linux, the `locale' command
will show that.)
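
For readers following along: the bytes C2 A0 are the UTF-8 encoding of
U+00A0 (NO-BREAK SPACE), which a Windows-1252 ("ANSI") viewer displays
as a capital A with a circumflex followed by a no-break space.  The
following is a minimal sketch, not an answer given in this thread.  It
assumes the input really is UTF-8 and that the goal is to replace the
whole two-byte sequence with an ordinary space.  The helper name
strip_nbsp is hypothetical.  Forcing the C locale makes awk treat the
input as single bytes, so the result should not depend on the user's
locale setting.

```shell
# Hypothetical helper: replace every UTF-8 no-break space
# (bytes C2 A0, written in octal as \302\240) with a plain space.
# LC_ALL=C forces byte-wise matching (assumption: input is UTF-8).
strip_nbsp() {
    LC_ALL=C awk '{ gsub(/\302\240/, " "); print }' "$@"
}

# Usage: feed a line containing the two bytes through the filter.
printf 'foo\302\240bar\n' | strip_nbsp    # prints: foo bar
```

Octal escapes are used here because they are portable across awk
implementations; gawk also accepts hex escapes as an extension.  To
delete the sequence instead of replacing it, pass "" as the second
argument to gsub.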
