com.eireneh.bible.book.raw
Class WordItemsMem

java.lang.Object
  extended by com.eireneh.bible.book.raw.Mem
      extended by com.eireneh.bible.book.raw.ItemsMem
          extended by com.eireneh.bible.book.raw.WordItemsMem
All Implemented Interfaces:
Items

public class WordItemsMem
extends ItemsMem

The WordItemsMem stores the words of a Bible in a dictionary. The single method that will be of use 99% of the time is the getWord(int) method. This method will be called once for each word every time we display a verse (assuming that we have not implemented any caches).

The getIndex(String) method is the reverse of this, and is used in creating the index in the first place.
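The relationship between the two methods can be sketched as a pair of lookups over the same data. This is a minimal illustration only, not the WordItemsMem implementation: the class name WordDictionarySketch and its fields are hypothetical, and the choice of a list plus a hash map is an assumption (the inherited fields array and hash hint at something similar, but that is a guess).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a two-way word dictionary. Not the real WordItemsMem.
final class WordDictionarySketch {
    // index -> word, used when displaying a verse
    private final List<String> words = new ArrayList<>();
    // word -> index, used when building the index in the first place
    private final Map<String, Integer> indexes = new HashMap<>();

    // Return the index of a word, adding the word if it is not yet known.
    int getIndex(String word) {
        Integer existing = indexes.get(word);
        if (existing != null) {
            return existing;
        }
        int index = words.size();
        words.add(word);
        indexes.put(word, index);
        return index;
    }

    // Return the word stored at the given index.
    String getWord(int index) {
        return words.get(index);
    }

    public static void main(String[] args) {
        WordDictionarySketch dict = new WordDictionarySketch();
        int first = dict.getIndex("in");
        int second = dict.getIndex("the");
        System.out.println(dict.getWord(first) + " " + dict.getWord(second));
    }
}
```

A verse would then be stored as a sequence of indexes and rendered by calling getWord once per index.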

The class has an underlying File; however, this is transparent to the user, since any changes made by calls to getIndex(String) are automatically written to disk, and the implementation of this class must be free to choose whatever caching scheme it needs.

The index file size will be roughly n*(a+v) where:

This would give an index file size of 150k. I need to check with the OLB and with Theophilos; however, I think this compares favorably. It would make the smallest download that contained Bible text (but no punctuation or case marks, etc.) under 200k before compression, maybe under 150k after. A full basic extensible OLB in under 200k would be an achievement and well under a 2 minute download.

Index File Structure

I expect that the general layout will be something like:
 0 -.    \
 1 -+.   !
 2 -++.  ! index area
 .  !!!  !
 .  !!!  /
 a <'!!  \
 a   !!  !
 r   !!  !
 o   !!  ! text area
 n   !!  !
 a <-'!  !
 b    !  !
 .    !  /
 
For this layout we can use the index of word (n+1) to calculate the length of word (n), so long as the words are in index order in the text area. This would make v=1 (for the index). We could even use upper case letters to mark new words. This would mean we could have an out of order text area, or no index area (i.e. v=0). However, having v=0 would force us to do in-memory caching.
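A minimal sketch of both ideas follows. The class PackedWordsSketch, its fields and its methods are hypothetical names chosen for illustration. It also assumes one extra trailing offset marking the end of the text area, so that the last word has a "next" entry to measure against; the notes above do not say how the final word is delimited.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the index area / text area layout. Not the real file format.
final class PackedWordsSketch {
    // Index area: offsets[n] is where word n starts in the text area.
    // Assumption: one trailing entry holds the total text length.
    private final int[] offsets;
    // Text area: all words concatenated in index order, with no separators.
    private final String text;

    PackedWordsSketch(String[] words) {
        offsets = new int[words.length + 1];
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            offsets[i] = sb.length();
            sb.append(words[i]);
        }
        offsets[words.length] = sb.length();
        text = sb.toString();
    }

    // The length of word n is offsets[n + 1] - offsets[n], so no length is stored.
    String getWord(int n) {
        return text.substring(offsets[n], offsets[n + 1]);
    }

    // The v=0 variant: no index area, an upper case letter marks each new word.
    // Finding word n needs a scan, which is why the result would be cached in memory.
    static List<String> scanMarked(String marked) {
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= marked.length(); i++) {
            if (i == marked.length() || Character.isUpperCase(marked.charAt(i))) {
                out.add(marked.substring(start, i).toLowerCase(Locale.ROOT));
                start = i;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PackedWordsSketch packed = new PackedWordsSketch(new String[] {"aaron", "abel", "abide"});
        System.out.println(packed.getWord(1));
        System.out.println(scanMarked("AaronAbelAbide"));
    }
}
```

The offset table gives constant-time access to any word at the cost of one stored value per word. The marked variant stores nothing extra but has to be scanned from the start.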

The OLB v8 seems to do some form of (offset,length) indexing to compress file sizes further (or is it simply to obfuscate the file format?). I'd rather use .zip technology for compression.

Consider whether and to what extent this class should be static and public. I think that it should be package scope - use of this class does not make sense outside of the RawBible package. There should only ever be one WordIndex for a given file, but if we can instantiate this class for several sets of files, it does not make sense to make it static.

How can we extend this class in the future?

Distribution Licence:
Project B is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2 as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
The License is available on the internet here, by writing to Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA, or locally at the Licence link below.
The copyright to this program is held by its authors.

Version:
D0.I0.T0

Field Summary
protected static com.eireneh.util.Logger log
          The log stream
 
Fields inherited from class com.eireneh.bible.book.raw.ItemsMem
array, count, hash
 
Fields inherited from class com.eireneh.bible.book.raw.Mem
create, leafname, raw
 
Constructor Summary
WordItemsMem(RawBible raw, boolean create)
          Create a WordMemResourceIndex from a File that contains the dictionary.
 
Method Summary
 int getIndex(java.lang.String data)
          This is a specialization of IndexedResource.getIndex(String) that ensures that the word is lower case before we insert it.
 int getMaxItems()
          How many items are there in this index?
 java.lang.String[] getStartsWith(java.lang.String word)
          Find a list of words that start with the given word
 void load(java.io.InputStream in)
          Load the Resource from a stream
 void save(java.io.OutputStream out)
          Ensure that all changes to the index of words are written to a stream
 
Methods inherited from class com.eireneh.bible.book.raw.ItemsMem
defaultLoad, defaultSave, getEnumeration, getIndex, getItem, init, size
 
Methods inherited from class com.eireneh.bible.book.raw.Mem
load, save
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.eireneh.bible.book.raw.Items
save
 

Field Detail

log

protected static com.eireneh.util.Logger log
The log stream

Constructor Detail

WordItemsMem

public WordItemsMem(RawBible raw,
                    boolean create)
             throws java.lang.Exception
Create a WordMemResourceIndex from a File that contains the dictionary.

Method Detail

getIndex

public int getIndex(java.lang.String data)
This is a specialization of IndexedResource.getIndex(String) that ensures that the word is lower case before we insert it.

Specified by:
getIndex in interface Items
Overrides:
getIndex in class ItemsMem
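
The lower-casing step can be pictured as an override that normalizes its argument and then defers to the parent class. The sketch below is illustrative only: BaseIndexSketch and LowerCaseIndexSketch are hypothetical stand-ins for ItemsMem and WordItemsMem, and the use of Locale.ROOT is an assumption, since the source does not say which locale rules apply.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the parent class: assigns indexes in insertion order.
class BaseIndexSketch {
    private final Map<String, Integer> hash = new HashMap<>();

    int getIndex(String data) {
        Integer found = hash.get(data);
        if (found == null) {
            found = hash.size();
            hash.put(data, found);
        }
        return found;
    }
}

// Hypothetical stand-in for the word index: lower-cases before looking up or inserting.
final class LowerCaseIndexSketch extends BaseIndexSketch {
    @Override
    int getIndex(String data) {
        return super.getIndex(data.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        LowerCaseIndexSketch index = new LowerCaseIndexSketch();
        System.out.println(index.getIndex("God") == index.getIndex("god"));
    }
}
```

With this arrangement "God" and "god" share one dictionary entry, which fits the earlier remark that case marks are kept separately from the word text.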

getMaxItems

public int getMaxItems()
How many items are there in this index?

Specified by:
getMaxItems in class ItemsMem

getStartsWith

public java.lang.String[] getStartsWith(java.lang.String word)
                                 throws com.eireneh.bible.book.BookException
Find a list of words that start with the given word
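
One simple way such a prefix lookup could work is a linear scan over the stored words. This is a sketch under assumptions, not the actual method body: the class StartsWithSketch is hypothetical, the words are passed in as a plain array, and no BookException is thrown because the sketch has no failure path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a prefix search over a word list.
final class StartsWithSketch {
    // Collect every word that begins with the given prefix, in stored order.
    static String[] getStartsWith(String[] words, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String word : words) {
            if (word.startsWith(prefix)) {
                matches.add(word);
            }
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] words = {"aaron", "abel", "abide", "baal"};
        for (String match : getStartsWith(words, "ab")) {
            System.out.println(match);
        }
    }
}
```

If the words were kept in sorted order, a binary search for the first match would avoid scanning the whole list, but that depends on a storage order this page does not specify.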


load

public void load(java.io.InputStream in)
          throws java.io.IOException,
                 java.lang.ClassNotFoundException
Load the Resource from a stream

Specified by:
load in class Mem

save

public void save(java.io.OutputStream out)
          throws java.io.IOException
Ensure that all changes to the index of words are written to a stream

Specified by:
save in class Mem