Re: Question on using gsub() on UTF-8 strings
From: Eli Zaretskii
Subject: Re: Question on using gsub() on UTF-8 strings
Date: Mon, 24 Jun 2024 14:22:11 +0300
> From: <pjfarley3@earthlink.net>
> Cc: <help-gawk@gnu.org>
> Date: Sun, 23 Jun 2024 15:48:50 -0400
>
> > -----Original Message-----
> > From: help-gawk-bounces+pjfarley3=earthlink.net@gnu.org <help-gawk-
> > bounces+pjfarley3=earthlink.net@gnu.org> On Behalf Of Eli Zaretskii
> > Sent: Sunday, June 23, 2024 4:55 AM
> > To: Manuel Collado <mcollado2011@gmail.com>
> > Cc: pjfarley3@earthlink.net; help-gawk@gnu.org
> > Subject: Re: Question on using gsub() on UTF-8 strings
> >
> > > Date: Sun, 23 Jun 2024 09:58:57 +0200
> > > Cc: help-gawk@gnu.org
> > > From: Manuel Collado <mcollado2011@gmail.com>
> > >
> > > El 23/6/24 a las 8:39, pjfarley3@earthlink.net escribió:
> > > >> ...
> > > > >> So the only recommendation I have is to recode the text in some
> > > > >> single-byte codepage supported by Windows, preferably your system
> > > > >> codepage, and then Gawk should work.
> > > > ...
> > > > *Sigh* I was afraid that might be the answer...
> > > >
> > > > I do have a gawk process to detect and convert a lot of the UTF-8
> > > > characters to semi-reasonable single-byte substitutes, but it is a
> > > > little cumbersome and really needs a generalization rewrite.
> > >
> > > There are Windows ports of the 'iconv' (libiconv) utility.
> >
> > Yes. And also of 'recode'.
>
> Where did you see a Windows binary distribution of recode please? I saw one
> reference to a Gnuwin version of recode but the Gnuwin download page at
> sourceforge does not actually have one.
Not sure; I've had it for a long time.  I think it is indeed from that
GnuWin32 site, yes.  I don't know what happened to the distribution
itself, and for some reason I don't have the distribution, either...
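
For the recoding approach discussed earlier in the thread (converting
UTF-8 text to a single-byte codepage before handing it to Gawk), here
is a rough sketch in Python in case neither 'iconv' nor 'recode' is at
hand.  It is only an illustration under stated assumptions: the
function name to_single_byte is hypothetical, cp1252 stands in for
whatever your system codepage actually is, and "?" is an arbitrary
placeholder for characters that have no reasonable substitute.

```python
import unicodedata


def to_single_byte(text, codepage="cp1252", placeholder="?"):
    """Recode text into a single-byte codepage (hypothetical helper).

    Characters the codepage cannot represent are decomposed (NFKD) and
    stripped of combining marks; if the result still does not fit, the
    placeholder is used instead.  cp1252 is an assumed default.
    """
    out = []
    for ch in text:
        try:
            ch.encode(codepage)
            out.append(ch)  # representable as-is
            continue
        except UnicodeEncodeError:
            pass
        # Try a plain base character, e.g. "s" for "s with acute".
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            base.encode(codepage)
            out.append(base)
        except UnicodeEncodeError:
            out.append(placeholder)
    return "".join(out).encode(codepage)


if __name__ == "__main__":
    import sys

    # Read UTF-8 on stdin, write recoded bytes on stdout.
    data = sys.stdin.buffer.read().decode("utf-8")
    sys.stdout.buffer.write(to_single_byte(data))
```

Run as a filter, this reads UTF-8 from standard input and writes the
recoded bytes to standard output, so the result can be fed to Gawk.
The substitutions are lossy by design, so keep the original file.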