|
Manual
Terms to know:
- The word "program" refers to DictzCE itself, the application
that is used to open dictionaries.
- The word "dictionary" refers to a dictionary that actually
contains the word list and the word definitions.
A dictionary consists of 2 files:
the data file (*.dz) and the index file (*.index).
DictzCE is a dictionary interface, which means it contains no words or
definitions of its own. Instead, it opens any dictionary in dictzip
format; such dictionaries are available for free on the internet.
Dictzip is a format in which a flat dictionary file is divided into
small chunks and compressed in a gzip-compatible format, and an index
file is created to map each word to its address in the compressed file.
Thus, a dictzip dictionary includes 2 files: the compressed data file
(*.dz) and the index file (*.index).
Installation:
How to use:
- Tap the menu File/Open dictionary and browse to the folder where the
dictionaries are. Open the data file (eng-fra.dz, for example); the
index file (eng-fra.index) is read automatically.
- When you load a dictionary for the first time, the program needs to
create a rough index file, which makes searching the index much faster
without taking too much memory. Be patient: some huge index files,
such as WordNet's, might take minutes (just imagine copying a 2MB file
from your PDA to your PC). The next time you load the dictionary, the
rough index file is read automatically.
- After loading a dictionary, tap the small edit box at the top, type
in the word you want to look for, and tap "Look up". A lookup might
take several seconds, depending on how big the index file is. Words
are case-sensitive.
- The only button on the toolbar is the List button. Tap it to turn the
Word List and the Navigating Buttons on or off. When you look for a
word, the list shows the words neighboring the one you entered.
- Use the L- button to go to the previous section and the L+ button to
go to the next section (sections are divided based on the rough index:
either the next letter or the next 15KB).
- Use the P- button to go back one list (a list holds 40 words) and the
P+ button to go forward one list.
- Go to File/Setting to change the font name or font size, or to set
the current dictionary as the default. The default dictionary is
loaded automatically when the program starts.
- If you download dictionaries from www.freedict.de, you need to change
the extension of the data file from *.dict.dz to *.dz. Also,
PocketPC2002 only allows you to browse for documents under the
My Documents folder, so remember to put the files there.
- To display languages other than English, you need to install
Unicode font(s) on your PDA. I have heard that Times New Roman and
Tahoma are the most complete fonts.
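The renaming step for freedict.de downloads can be done in bulk on a
desktop PC before the files are copied to the PDA. The following is
only a minimal sketch; the function name rename_freedict_files and the
idea of running it on the PC are assumptions, not part of DictzCE.

```python
from pathlib import Path


def rename_freedict_files(folder):
    """Rename every *.dict.dz file in `folder` to *.dz; return the new paths."""
    renamed = []
    for path in sorted(Path(folder).glob("*.dict.dz")):
        # Strip the ".dict.dz" suffix and put ".dz" in its place.
        target = path.with_name(path.name[: -len(".dict.dz")] + ".dz")
        if target.exists():
            continue  # do not overwrite an existing dictionary
        path.rename(target)
        renamed.append(target)
    return renamed
```

The index files (*.index) are left untouched, since only the data file
needs the new extension.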
Notes: some dictionaries, such as WordNet, carry copyright information.
Please read it before using the dictionary. For most dictionaries, the
copyright information is stored as the definitions of these words
(entries):
00-database-info 00-database-long 00-database-short 00-database-url
Misc. information:
Dictzip format:
- If you leave a dictionary as a flat database (text file), it is
unnecessarily large. But if you simply compress it, as with Winzip,
then every time you want to look something up you have to start from
the beginning of the compressed file in order to extract the needed
information.
- Dictzip divides the text file into 64KB chunks, then compresses them
one by one, appending each to a single file. As it processes the text
file, it also creates an index file containing the list of words and
their corresponding addresses in the compressed file.
- When you look for a word, the program finds that word in the index
file together with its address, then goes to the beginning of the
appropriate chunk to extract the word's definition. Since it only has
to go to the beginning of a chunk, not the beginning of the file,
lookups are much faster, with little loss of compression (dictzip
achieves about 94%-96% of the compression obtained by zipping the file
as a whole).
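The chunk-and-index idea described above can be sketched in a few lines
of Python. This is only an illustration, not the actual dictzip file
layout: it assumes that the word index stores positions in the
uncompressed text and that a separate chunk table maps each chunk to
its position in the compressed data. All names (compress_chunks,
read_range, build_demo, lookup) are hypothetical.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # chunk size taken from the description above


def compress_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress data chunk by chunk; return (blob, chunk_table).

    chunk_table[i] is (start, length) of compressed chunk i inside blob.
    """
    blob = bytearray()
    chunk_table = []
    for pos in range(0, len(data), chunk_size):
        packed = zlib.compress(data[pos:pos + chunk_size])
        chunk_table.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), chunk_table


def read_range(blob, chunk_table, offset, length, chunk_size=CHUNK_SIZE):
    """Return data[offset:offset+length], unpacking only the chunks it touches."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    parts = []
    for i in range(first, last + 1):
        start, size = chunk_table[i]
        parts.append(zlib.decompress(blob[start:start + size]))
    joined = b"".join(parts)
    skip = offset - first * chunk_size
    return joined[skip:skip + length]


def build_demo(entries, chunk_size=CHUNK_SIZE):
    """Build the compressed data plus a word index {word: (offset, length)}."""
    text = bytearray()
    index = {}
    for word, definition in entries:
        body = definition.encode("utf-8")
        index[word] = (len(text), len(body))
        text += body
    blob, table = compress_chunks(bytes(text), chunk_size)
    return blob, table, index


def lookup(word, blob, table, index, chunk_size=CHUNK_SIZE):
    """Find the word in the index, then unpack only the chunk(s) it needs."""
    offset, length = index[word]
    return read_range(blob, table, offset, length, chunk_size).decode("utf-8")
```

A lookup never unpacks more than the chunks that contain the
definition, which is why it stays fast no matter how large the whole
dictionary is.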
DictzCE strategy:
- Because the original idea was designed for desktop computers and web
servers, there was nothing wrong with loading the whole index file
into memory for fast searching. On a PDA, however, loading a 2MB index
file into memory would be a problem, since many devices have only a
small amount of RAM (the Casio BE-300, for example, has 16MB of RAM, of
which only about 5MB is available at run time). Therefore, DictzCE
builds a second, rough index file for the index file. This is a
trade-off between memory consumption and speed. It is also the reason
why you have to page through the word list instead of seeing the whole
word list at once.
- Since the dictionaries out there are user-created, they do not follow
a very strict standard, so DictzCE has to be flexible enough to adapt
to the variations. Usually, the index file lists words in the
alphabetical order of the language. DictzCE remembers this order when
it reads the index file for the first time, then uses it to divide the
word list into smaller sections. For example, the 'S' section in
English contains more words than many other sections, such as 'X',
'Y', and 'Z', combined, so it helps to divide 'S' into 'Sa', 'Sb',
'Sc', and so on.
- Many languages share the same characters, but the characters do not
mean the same thing in each language. This is a real problem, since
the Unicode table is not perfectly sorted. So if you run into a
problem in a specific situation, especially with a language other than
English, don't be surprised. Feel free to send me an email so that I
can fix it.
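The rough index idea described above can be illustrated with a small
Python sketch. It is not DictzCE's actual code and it makes several
assumptions: the index file has one tab-separated line per word, the
lines are sorted byte-wise, and a rough entry is kept every STEP lines
rather than per letter or per 15KB. The names build_rough_index and
find_entry are hypothetical.

```python
import bisect

STEP = 40  # hypothetical spacing: one rough entry every 40 index lines


def build_rough_index(index_file, step=STEP):
    """Scan the sorted index once; keep (word, byte_offset) every `step` lines."""
    rough = []
    index_file.seek(0)
    line_no = 0
    while True:
        pos = index_file.tell()
        line = index_file.readline()
        if not line:
            break
        if line_no % step == 0:
            rough.append((line.split(b"\t", 1)[0], pos))
        line_no += 1
    return rough


def find_entry(index_file, rough, word, step=STEP):
    """Return the fields of the index line for `word`, or None.

    Only the small rough list stays in memory; at most `step` lines of
    the full index file are read per lookup.
    """
    keys = [key for key, _ in rough]
    i = bisect.bisect_right(keys, word) - 1
    if i < 0:
        return None  # word sorts before the first entry
    index_file.seek(rough[i][1])
    for _ in range(step):
        line = index_file.readline()
        if not line:
            break
        fields = line.rstrip(b"\n").split(b"\t")
        if fields[0] == word:
            return fields
        if fields[0] > word:
            break  # passed the place where the word would be
    return None
```

The index file must be opened in binary mode so that the stored byte
offsets can be used with seek(). A smaller step means faster lookups
but a larger rough index in memory, which is the trade-off mentioned
above.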
|