[GNUnet-developers] meta-data and keyword encoding [Was: Music insertion]
From: Christian Grothoff
Subject: [GNUnet-developers] meta-data and keyword encoding [Was: Music insertion]
Date: Sun, 5 Dec 2004 15:09:47 -0500
User-agent: KMail/1.7.1
On Friday 03 December 2004 17:56, N. Durner wrote:
> Hi,
>
> > To find precisely the music files and albums, I use keywords like
> > <title:>foo or <encoding:>ogg.
>
> There's a request for a date field (Mantis #789), too.
> Perhaps we should put all the meta-data into an extensible format with
> certain fixed and well-known fields (the ones you mentioned) in GNUnet 0.7.
Actually, I was thinking of having a format with a variable set of fields, but
with well-known field types (using the list of libextractor, extended by some
more entries).
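
A minimal sketch of what such a format might look like: a variable-length list of (type, value) entries, where the type is drawn from a fixed, well-known set. The type names below are an illustrative subset and not the actual libextractor list; `FieldType`, `MetaData` and their methods are hypothetical names. The keyword form `<title:>foo` is the one quoted earlier in this thread.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class FieldType(Enum):
    # Illustrative subset only; the real list would come from
    # libextractor, extended by further entries.
    TITLE = "title"
    ARTIST = "artist"
    ALBUM = "album"
    DATE = "date"
    ENCODING = "encoding"


@dataclass
class MetaData:
    # Variable set of fields: any number of entries, any type may repeat.
    entries: List[Tuple[FieldType, str]] = field(default_factory=list)

    def add(self, ftype: FieldType, value: str) -> None:
        self.entries.append((ftype, value))

    def get(self, ftype: FieldType) -> List[str]:
        # All values stored under one well-known type (possibly none).
        return [v for t, v in self.entries if t is ftype]

    def keywords(self) -> List[str]:
        # Typed keywords in the "<type:>value" form used in this thread.
        return [f"<{t.value}:>{v}" for t, v in self.entries]
```

With this layout a date field (as requested in Mantis #789) is just one more entry in the type list, and a file without a date simply carries no such entry.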
> > Is this a problem or not ?
>
> Rather not.
>
> > Other question : are keywords in UTF-8 ?
>
> Not yet. It's planned for 0.7.
>
> > And which encoding does
> > libextractor use ?
>
> Plain ASCII using your locale.
Not quite. libextractor currently returns whatever was in the file and
totally ignores character sets. libextractor _should_ be changed to use
UTF-8 everywhere (convert if necessary, guess format if file format does not
specify, when writing to the console convert from UTF-8 to locale). That is
not currently the case, but I'd definitely like to have LE use UTF-8. So if
anyone wants to even start on this, please let me know.
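
To make the three steps concrete (convert if necessary, guess if the file format does not specify, convert to the locale only for console output), here is a rough sketch in Python rather than in libextractor's C. The function names `to_utf8` and `for_console` are hypothetical, and the guessing rule (try UTF-8, then fall back to ISO-8859-1) is only an assumed, crude heuristic, not anything libextractor does today.

```python
import locale
from typing import Optional


def to_utf8(raw: bytes, declared: Optional[str] = None) -> bytes:
    """Re-encode metadata bytes from a file as UTF-8.

    'declared' is the character set named by the file format, if any.
    Without it we guess: valid UTF-8 is kept as is, anything else is
    assumed to be ISO-8859-1 (an assumption, it may guess wrong).
    """
    candidates = [declared] if declared else []
    candidates += ["utf-8", "iso-8859-1"]
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown character set: try the next candidate.
            continue
        return text.encode("utf-8")
    # Not reached in practice: ISO-8859-1 can decode any byte string.
    raise ValueError("could not decode metadata")


def for_console(utf8: bytes, console_encoding: Optional[str] = None) -> bytes:
    """Convert UTF-8 metadata to the locale encoding for display only."""
    enc = console_encoding or locale.getpreferredencoding(False)
    # Characters the console cannot show are replaced, not dropped.
    return utf8.decode("utf-8").encode(enc, errors="replace")
```

The point of the split is that everything stored or passed around internally is UTF-8, and the locale only matters at the very edge, when printing.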
> > In other words : should I convert keywords before
> > inserting them ?
>
> It doesn't make too much sense at the moment.
It does. It will be the future default, so I would recommend that new code
convert to UTF-8 if possible.
> I have thought about a module for libExtractor that converts special
> national characters to an alternative representation. For example, the
> German umlauts ä, ö and ü can be written as ae, oe and ue. Is there a
> similar rule for other characters like "ç" (c cedilla)?
> This would be a solution to the problem that I usually don't know how to
> type these chars using a foreign keyboard layout.
Character input is a different (UI) issue. I would prefer to just handle
character sets correctly and nicely in GNUnet/LE, and that means UTF-8. As
for typing umlauts on a non-German keyboard, that does not feel like a
problem that we should even try to address (it gets far too complicated once
you add more and more languages -- and I have the impression that most people
have had to figure out how to type their native language on an English
keyboard; it's mostly us Germans that are lazy and use "ae" :-).
Christian