sed UTF-8 processing problem
From: Klaus Dechet
Subject: sed UTF-8 processing problem
Date: Mon, 14 Jun 2021 23:15:31 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.0
Hi GNU team,
I have the following problem:
I am running sed in the Windows 10 cmd terminal.
sed --version
GNU sed version 4.2.1
Copyright (C) 2009 Free Software Foundation, Inc.
In the cmd terminal I enter the following:
D:\Temp>chcp 65001
D:\Temp>echo aΣb
aΣb
D:\Temp>echo aΣb > utf82.txt
The file utf82.txt is UTF-8 encoded, with Σ (U+03A3) stored as 2 bytes (CE A3).
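The byte count can be double-checked outside of cmd. The following is a minimal sketch, assuming Python 3.8 or later is available (it was not part of the original session); it only encodes the test string and prints its bytes:

```python
# Encode the test string as UTF-8 and inspect the resulting bytes.
text = "aΣb"
encoded = text.encode("utf-8")
sigma_bytes = "Σ".encode("utf-8")

print(encoded.hex(" "))   # 61 ce a3 62
print(len(sigma_bytes))   # 2
```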
D:\Temp>echo aΣb | sed s/./X/g
XXXXX
This shows that sed is not processing UTF-8 properly: it appears to treat the two bytes of Σ as two separate characters.
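The count of five is consistent with byte-wise matching: a, the two bytes of Σ, b, and presumably the space that cmd's echo passes through before the pipe character. The Python sketch below illustrates the difference between byte-wise and character-wise substitution. It is an illustration under that assumption, not a description of how this sed build works internally.

```python
import re

# Assumption: cmd's echo keeps the space that precedes the pipe.
line = "aΣb "

# Byte-wise: "." matches each of the 5 UTF-8 bytes separately.
bytewise = re.sub(rb".", b"X", line.encode("utf-8"))

# Character-wise: "." matches each of the 4 characters.
charwise = re.sub(r".", "X", line)

print(bytewise.decode("ascii"))  # XXXXX
print(charwise)                  # XXXX
```

A UTF-8 aware sed would be expected to behave like the character-wise case.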
D:\Temp>echo aΣb | sed s/./X/g > sedoutput.txt
sedoutput.txt is encoded as ANSI (Windows-1252).
Question: How do I get sed to read and write UTF-8 encoded files by
default?
Additional background: I installed sed and its libraries from here:
http://gnuwin32.sourceforge.net/packages/sed.htm
Thank you.
Klaus