lilypond-user

Re: problems with german umlauts


From: yota moteuchi
Subject: Re: problems with german umlauts
Date: Thu, 25 Jan 2007 16:37:30 -0500

Well, since charset issues are my hobby... I can write a short explanation (or try to; my limited English vocabulary could be a problem).

First came a 128-character table (codes 0-127, stored in 8 bits: one zero bit plus 7 bits) covering only the English letters and some signs, called the ASCII table. After that, many extended tables were defined, using the 128-255 range to store "regional" characters. There are about 15 of these extended tables in the ISO 8859 family, made to fit the needs of French, Danish or Greek, among others (but not all at the same time): http://en.wikipedia.org/wiki/Category:ISO_8859
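To make the "same byte, different table" point concrete, here is a minimal sketch. Python and its built-in codec names are my choice for illustration only; nothing here is specific to LilyPond. It decodes the single byte 252 with three of the ISO 8859 tables, and shows that plain ASCII has no character at all for it:

```python
# The same byte value stands for a different character in each 8-bit table.
raw = bytes([252])

def decode_with(table: str) -> str:
    """Decode the single byte 252 using the named 8-bit table."""
    return raw.decode(table)

latin1 = decode_with("iso-8859-1")    # Western European table: 'ü'
greek = decode_with("iso-8859-7")     # Greek table: some Greek letter
cyrillic = decode_with("iso-8859-5")  # Cyrillic table: some Cyrillic letter

for name, ch in [("iso-8859-1", latin1),
                 ("iso-8859-7", greek),
                 ("iso-8859-5", cyrillic)]:
    print(name, repr(ch), f"U+{ord(ch):04X}")

# ASCII stops at 127, so byte 252 cannot be decoded as ASCII at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("valid ASCII:", ascii_ok)
```

So "character 252" only means ü if both the writer and the reader agree that the table in use is ISO 8859-1 (or a close relative).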

The Japanese and the Chinese had their own peculiar ways of storing their 36 000 ideograms, and that is where it started to become a mess.

The Unicode Consortium defined a HUGE table aiming to hold every character (well, almost, but that's another problem); each entry has a number that fits within 32 bits.
But the devilish ASCII was still there, hidden in the dark. So the UTF-8 encoding was designed, which is simply an amazing trick for storing the Unicode table:
- all the characters of the former 0-127 range of the ASCII table are stored in 8 bits... so a pure ASCII file is also a genuine UTF-8 file ^^
- every other character is stored through an ingeniously designed system of drawers: a sequence of 2 to 4 bytes, each with a value from 128 to 255
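The two rules above can be checked with a short sketch (again Python, chosen only for illustration; the helper name `utf8_bytes` is made up for this example). It prints the UTF-8 bytes of an ASCII letter, of ü, and of the euro sign, in binary so the "drawers" are visible:

```python
from typing import List

def utf8_bytes(ch: str) -> List[str]:
    """Return the UTF-8 bytes of one character as 8-digit binary strings."""
    return [format(b, "08b") for b in ch.encode("utf-8")]

for ch in ["u", "ü", "€"]:
    print(repr(ch), f"U+{ord(ch):04X}", utf8_bytes(ch))

# A pure ASCII text has exactly the same bytes in ASCII and in UTF-8.
same_bytes = "hello".encode("ascii") == "hello".encode("utf-8")
print("ASCII text identical in UTF-8:", same_bytes)
```

The ASCII letter stays a single byte starting with a 0 bit. For the other two, the first byte starts with 110 or 1110 (announcing 2 or 3 bytes in total) and every following byte starts with 10, so all of them fall in the 128-255 range and can never be confused with an ASCII character.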

A Unicode encoding such as UTF-8 is what lets you write in both French AND Greek in the same text... and it is fully compatible with ASCII files...
Nice, isn't it?
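Here is a small sketch of that claim, under my own assumptions: the sample sentence, the choice of French plus Greek (two languages whose letters sit in different ISO 8859 tables), and the helper names `fits` and `is_valid_utf8` are all invented for the example.

```python
text = "Où est l'île ? Πού είναι το νησί;"  # French and Greek in one string

def fits(table: str) -> bool:
    """True if every character of `text` can be encoded in the given table."""
    try:
        text.encode(table)
        return True
    except UnicodeEncodeError:
        return False

def is_valid_utf8(data: bytes) -> bool:
    """True if `data` can be read back as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

for table in ["iso-8859-1", "iso-8859-7", "utf-8"]:
    print(table, fits(table))

# The compatibility only goes one way: ASCII bytes are valid UTF-8,
# but a Latin-1 'ü' stored as the lone byte 252 is not.
print(is_valid_utf8(b"plain ascii"))
print(is_valid_utf8(bytes([252])))
print(is_valid_utf8("ü".encode("utf-8")))
```

The last part may be related to the blank Jonathan got when he typed "ascii 252" into a file that was then read as UTF-8, although I have not checked what LilyPond itself does with such a byte.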

Yota

I hope that's clear... hips
I could explain this more easily in French, with a whiteboard and a cup of coffee.

On 1/25/07, Mats Bengtsson <address@hidden> wrote:
You are mistaken. ASCII only defines character codes up to 127, see for
example http://www.asciitable.com/.
What your table shows is probably Latin1 (ISO 8859-1).

   /Mats

Quoting Jonathan Henkelman <address@hidden>:

> Mats Bengtsson <mats.bengtsson <at> ee.kth.se> writes:
>
>
>
>> If you search the mailing list archives from the time before we introduced
>> unicode support, you will be surprised how many questions there are related
>> to Russian or Hebrew or Mandarin or ...
>>
>>    /Mats
>
> It wasn't intended to be a stupid question. I'm all over unicode for languages
> that use other character sets - cyrillic, hebrew, asian etc.  I was just
> surprised at how difficult it was to put an umlaut on a u for a german piece I
> was typesetting.
>
> Perhaps the problem lies in the documentation.  It suggests that if you want
> to use "non-ascii" characters you have to save the document as unicode - fair
> enough. (In fact it implies you can use any 8-bit ascii pg. 112, last
> paragraph, PDF version 2.10.0)  But I wanted to use ascii 252 (presumably
> similar to David in the original post) and I just inserted it into my
> document - and it compiled to a space.  Here I am trying to use an ascii
> character and hence expect not to have to do anything special, but would I
> still have to save it as unicode?  When I used \char, I had to find the tweak
> to get rid of the spaces before and after that character...
>
>> Because most accented European characters can not be accessed within ascii
>
> My ascii table shows all French, Norwegian, Danish characters as well as most
> spanish, and german (can't profess to be an expert there) see characters
> 191-255 (xBF - xff).  Are these accessible in a non-unicode document?
>
> Thanks,
> J
>
>
>
>
>
> _______________________________________________
> lilypond-user mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/lilypond-user
>




