trans-coord-devel

Re: Non-ascii characters in the original articles


From: Kaloian Doganov
Subject: Re: Non-ascii characters in the original articles
Date: Thu, 07 Feb 2008 09:35:16 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.50 (gNewSense gnu/linux)

I've run another query that returns all files declaring an encoding
other than UTF-8:

find . | grep "[.]s\?html$" | grep -v "[.]\(..\|zh-..\)[.]s\?html$" \
       | grep -v "^[.]/www\(in\|es\)" | grep -v "^[.]/spanish" \
       | grep -v "^[.]/japan" | grep -v "^[.]/chinese" \
       | xargs grep -L '#include virtual="/server/header.html"' \
       | xargs grep -i -o 'charset=[-a-zA-Z0-9]\+' | grep -v -i utf-8

The output is:

./encyclopedia/announcement.html:charset=iso-8859-1
./software/chinese/sandbox/index.html:charset=iso8859-1
./software/chinese/index.html:charset=iso8859-1
./software/diction/diction.html:charset=iso-8859-1
./software/ncurses/ncurses.html:charset=iso-8859-1
./software/panorama/developers.html:charset=iso-8859-1
./software/panorama/procedural_language.html:charset=iso-8859-1
./software/panorama/development.html:charset=iso-8859-1
./software/panorama/download.html:charset=iso-8859-1
./software/panorama/gallery.html:charset=iso-8859-1
./software/panorama/panorama.html:charset=us-ascii
./software/rcs/rcs.html:charset=iso-8859-1

Of course, files that do not declare any encoding (neither by inclusion
of header.html nor by themselves) are not included in this list.
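
A rough sketch for listing those remaining files, assuming a POSIX find
and grep. The function name list_undeclared is made up for illustration,
and the sketch leaves out the translation and directory filters used in
the query above, so its output would still need the same pruning:

```shell
# Hypothetical helper: print HTML files under directory $1 that neither
# include /server/header.html nor contain any charset= declaration.
list_undeclared() {
    # grep -L prints only the files in which none of the patterns match.
    find "$1" -type f \( -name '*.html' -o -name '*.shtml' \) \
        -exec grep -L -i \
            -e '#include virtual="/server/header.html"' \
            -e 'charset=[-a-zA-Z0-9]' {} +
}
```

Run from the top of the checkout as `list_undeclared .`; with `-exec ... {} +`
nothing is executed when find matches no files.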



