
[libextractor] Re: Keyword suggestions


From: Christian Grothoff
Subject: [libextractor] Re: Keyword suggestions
Date: Tue, 13 Sep 2005 04:58:52 -0700
User-agent: KMail/1.7.2

On Monday 12 September 2005 09:04, you wrote:
> Hello Christian,
>
> LE 0.x uses the keyword 'format' in a polysemous fashion (in my opinion).
> That is OK (IMO again) for PDF files, where it is commonly understood as
> 'file format'. But it is not OK for OLE2/Word files, where it really
> means 'style sheet' or 'template'.
>
> There is also another source of confusion with PDF files
> stemming from the Adobe definitions for 'Creator' and 'Producer'.
> I believe their definitions conflict with Dublin Core
> (Dublin Core non-expert speaking here) and with the most common usage.
>
> I think we should render the PDF 'Creator' as 'application', or plain
> 'software', and the PDF 'Producer' as (maybe) 'driver'?
>
> We need a keyword dictionary (or glossary?)...

I would be very, very happy if someone would write (or draft) a keyword 
glossary (would also help the translators!) and would revisit the extractors 
to improve the mapping of KeywordType to the formats.

A function "const char * EXTRACTOR_getKeywordTypeDescription(KeywordType)" 
would also be useful (e.g. for applications that let the user enter bindings 
manually and want to show the user a description of each type's semantics).
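
To make the idea concrete, here is a rough sketch of what such a lookup 
could look like. It is not existing libextractor code: the enum 
SketchKeywordType, its three entries, the function name 
sketch_getKeywordTypeDescription and the description strings are all 
hypothetical placeholders, and the wording of the strings only echoes the 
suggestions quoted above. A real version would cover the full KeywordType 
enum and take its texts from the glossary.

```c
#include <stddef.h>

/* Hypothetical stand-in for the library's keyword type enum.
   The real enum has many more entries and different names. */
typedef enum {
  SKETCH_FORMAT = 0,
  SKETCH_CREATOR,
  SKETCH_PRODUCER,
  SKETCH_TYPE_COUNT /* number of entries, not a keyword type itself */
} SketchKeywordType;

/* Placeholder glossary texts, indexed by keyword type.
   These strings are assumptions, not agreed definitions. */
static const char *const sketch_descriptions[SKETCH_TYPE_COUNT] = {
  [SKETCH_FORMAT]   = "file format of the document",
  [SKETCH_CREATOR]  = "application (software) used to write the document",
  [SKETCH_PRODUCER] = "driver or converter that generated the file"
};

/* Return a short description of the given keyword type,
   or NULL if the type is outside the known range. */
const char *
sketch_getKeywordTypeDescription(SketchKeywordType type)
{
  if ((int) type < 0 || (int) type >= (int) SKETCH_TYPE_COUNT)
    return NULL;
  return sketch_descriptions[type];
}
```

An application that lets the user enter bindings by hand could call this 
once per type and show the returned string next to the type name, skipping 
any type for which it gets NULL. Keeping the texts in a single table also 
gives the translators one place to look.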

Now, I'm neither a Dublin Core expert nor do I have the time at the moment -- 
but I would take a patch (even one that is just making some progress) any 
time (hint hint).

Happy hacking

Christian



