From: Stephen Wells
Subject: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?
Date: Tue, 31 Mar 2015 18:11:58 +0100
Dear all - I am currently trying to use wget to obtain mp3 files from the
Google Translate TTS system. In principle this can be done using:
wget -U Mozilla -O "${string}.mp3" "http://translate.google.com/translate_tts?tl=TL&q=${string}"
where TL is a two-letter language code (en, fr, de and so on).
However, I run into a serious problem when I try to send Russian strings
(tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
Cygwin) and the file system displays the Cyrillic strings without any problem.
If I provide a URL like this:
http://translate.google.com/translate_tts?tl=ru&q=мазать
wget incorrectly processes the Cyrillic characters _before_ sending the
http request, so what it actually requests is:
http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C
This of course produces a string of gibberish in the resulting mp3 file!
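For reference, the escaped query string in the request above can be decoded
back by hand to check what bytes it carries. The following is only a minimal
sketch, assuming bash and a UTF-8 terminal; the helper name urldecode is
hypothetical and is not part of wget. It turns each %XX escape into the raw
byte it stands for, which shows that the escaped form is the percent-encoded
UTF-8 byte sequence of the original word.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a wget feature): decode %XX escapes into raw bytes.
urldecode() {
  local s=$1
  # Rewrite every "%" as "\x" so that printf '%b' expands "\xD0" etc. to bytes.
  printf '%b' "${s//%/\\x}"
}

# The escaped query value taken from the request shown above.
urldecode '%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C'
echo   # prints: мазать
```

Each Cyrillic letter occupies two bytes in UTF-8, which is why the six-letter
word becomes twelve %XX escapes.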
Is there any way to make wget actually send the string it is given, instead
of mangling it on the way out? This is really blocking me.
Cheers,
Stephen