[silpa-discuss] Dictionary Index Generator
From: Vasudev Kamath
Subject: [silpa-discuss] Dictionary Index Generator
Date: Sun, 25 Apr 2010 17:00:06 +0530
User-agent: KMail/1.12.4 (Linux/2.6.33-2.slh.10-sidux-686; KDE/4.3.4; i686; ; )
Hi,
I have finally written a script that generates an index file for a given
dictionary.
Here are some assumptions:
1. If the file is an English dictionary, it is opened with the normal open(),
since the English dictionary's encoding is ISO-8859; otherwise files are opened
with UTF-8 encoding.
2. For English, lowercase and capital letters are treated differently, since
words starting with a and A begin at different locations in the dictionary.
Fixing this requires fixing the dictionary itself.
3. I used cPickle instead of saving the index as a normal file. cPickle with
protocol 2 is used for efficiency, so the index file won't be human readable.
The reason for using Python pickles as the file format is simply to reduce the
complexity of processing the index file. If desired, we can create the index as
a normal file instead. I need suggestions on this.
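
To make the three points above concrete, here is a rough sketch of how such an
indexer could look. It is not the attached indexer.py. It assumes a dictionary
file with one entry per line and an index that maps the first character of an
entry to the byte offset of the first line starting with that character. The
names build_index, save_index and load_index are made up for illustration, and
"iso-8859-1" is an assumed member of the ISO-8859 family. It is written for
Python 3, where the plain pickle module already uses the C implementation that
cPickle provided in Python 2.

```python
import pickle


def build_index(dict_path, is_english=False):
    """Map each entry's first character to the byte offset of its first line."""
    # Assumption: English dictionaries are ISO-8859-1, everything else UTF-8.
    encoding = "iso-8859-1" if is_english else "utf-8"
    index = {}
    offset = 0
    # Read in binary mode so that offsets are exact byte positions for seek().
    with open(dict_path, "rb") as f:
        for raw_line in f:
            text = raw_line.decode(encoding).strip()
            if text:
                # Case-sensitive on purpose: "a" and "A" get separate entries.
                index.setdefault(text[0], offset)
            offset += len(raw_line)
    return index


def save_index(index, index_path):
    """Write the index as a binary pickle (protocol 2, not human readable)."""
    with open(index_path, "wb") as f:
        pickle.dump(index, f, protocol=2)


def load_index(index_path):
    """Read an index back from a pickle written by save_index()."""
    with open(index_path, "rb") as f:
        return pickle.load(f)
```

A lookup would then open the dictionary in binary mode, seek() to
index[first_char], and scan forward from there. If a plain-text index is
preferred, only save_index and load_index would need to change.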
I'm attaching the dictionary indexing script. Please test it and let me know of
any changes that need to be done.
Thanks and Regards
Vasudev Kamath
Attachment: indexer.py (Description: Text Data)