lucene-2.3.2-src » org.apache » lucene » index
org.apache.lucene.index
abstract public class: IndexReader
java.lang.Object
   org.apache.lucene.index.IndexReader

Direct Known Subclasses:
    ParallelReader, MultiSegmentReader, SegmentReader, MemoryIndexReader, TestReader, OneNormsReader, GCJSegmentReader, MultiReader, FilterIndexReader, DirectoryIndexReader

IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.

Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(String).

For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.

An IndexReader can be opened on a directory on which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.

NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to.

Nested Class Summary:
public static final class  IndexReader.FieldOption  Constants describing field properties, for example used for IndexReader.getFieldNames(FieldOption).
Field Summary
protected  boolean hasChanges     
Constructor:
 protected IndexReader() 
 protected IndexReader(Directory directory) 
    Legacy Constructor for backwards compatibility.

    This constructor should not be used. It exists for backwards compatibility only, to support legacy subclasses that did not "own" a specific directory but needed to specify something to be returned by the directory() method. Future subclasses should delegate to the no-arg constructor and implement the directory() method as appropriate.

    Parameters:
    directory - Directory to be returned by the directory() method
    Also see:
    directory()
Method from org.apache.lucene.index.IndexReader Summary:
acquireWriteLock,   close,   commit,   decRef,   deleteDocument,   deleteDocuments,   directory,   doClose,   doCommit,   doDelete,   doSetNorm,   doUndeleteAll,   docFreq,   document,   document,   ensureOpen,   flush,   getCurrentVersion,   getCurrentVersion,   getCurrentVersion,   getFieldNames,   getRefCount,   getTermFreqVector,   getTermFreqVector,   getTermFreqVector,   getTermFreqVectors,   getTermInfosIndexDivisor,   getVersion,   hasDeletions,   hasNorms,   incRef,   indexExists,   indexExists,   indexExists,   isCurrent,   isDeleted,   isLocked,   isLocked,   isOptimized,   lastModified,   lastModified,   lastModified,   main,   maxDoc,   norms,   norms,   numDocs,   open,   open,   open,   open,   reopen,   setNorm,   setNorm,   setTermInfosIndexDivisor,   termDocs,   termDocs,   termPositions,   termPositions,   terms,   terms,   undeleteAll,   unlock
Methods from java.lang.Object:
equals,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.lucene.index.IndexReader Detail:
 protected synchronized  void acquireWriteLock() throws IOException 
    Does nothing by default. Subclasses that require a write lock for index modifications must implement this method.
 public final synchronized  void close() throws IOException 
    Closes files associated with this index. Also saves any new deletions to disk. No other methods should be called after this has been called.
 protected final synchronized  void commit() throws IOException 
    Commits changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).
 protected synchronized  void decRef() throws IOException 
    Decreases the refCount of this IndexReader instance. If the refCount drops to 0, then pending changes are committed to the index and this reader is closed.
 public final synchronized  void deleteDocument(int docNum) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException 
    Deletes the document numbered docNum. Once a document is deleted, it will not appear in TermDocs or TermPositions enumerations. Attempts to read its fields with the document(int) method will result in an error. The presence of this document may still be reflected in the docFreq(Term) statistic, though this will be corrected eventually as the index is further modified.
 public final int deleteDocuments(Term term) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException 
    Deletes all documents that have a given term indexed. This is useful if one uses a document field to hold a unique ID string for the document. To delete such a document, one constructs a term with the appropriate field and the unique ID string as its text and passes it to this method. Returns the number of documents deleted. See deleteDocument(int) for information about when this deletion will become effective.
 public Directory directory() 
    Returns the directory associated with this index. The default implementation returns the directory specified by subclasses when delegating to the IndexReader(Directory) constructor, or throws an UnsupportedOperationException if one was not specified.
 abstract protected  void doClose() throws IOException
    Implements close.
 abstract protected  void doCommit() throws IOException
    Implements commit.
 abstract protected  void doDelete(int docNum) throws IOException, CorruptIndexException
 abstract protected  void doSetNorm(int doc,
    String field,
    byte value) throws IOException, CorruptIndexException
    Implements setNorm in subclass.
 abstract protected  void doUndeleteAll() throws IOException, CorruptIndexException
    Implements actual undeleteAll() in subclass.
 abstract public int docFreq(Term t) throws IOException
    Returns the number of documents containing the term t.
 public Document document(int n) throws IOException, CorruptIndexException 
    Returns the stored fields of the nth Document in this index.
 abstract public Document document(int n,
    FieldSelector fieldSelector) throws IOException, CorruptIndexException
 protected final  void ensureOpen() throws AlreadyClosedException 
 public final synchronized  void flush() throws IOException 
 public static long getCurrentVersion(String directory) throws IOException, CorruptIndexException 
    Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index.
 public static long getCurrentVersion(File directory) throws IOException, CorruptIndexException 
    Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index.
 public static long getCurrentVersion(Directory directory) throws IOException, CorruptIndexException 
    Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index.
 abstract public Collection getFieldNames(IndexReader.FieldOption fldOption)
    Get a list of unique field names that exist in this index and have the specified field option information.
 synchronized int getRefCount() 
 abstract public TermFreqVector getTermFreqVector(int docNumber,
    String field) throws IOException
    Returns a term frequency vector for the specified document and field. The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If term vectors were stored with positions or offsets, a TermPositionVector is returned.
 abstract public  void getTermFreqVector(int docNumber,
    TermVectorMapper mapper) throws IOException
    Maps all the term vectors for all fields in a Document.
 abstract public  void getTermFreqVector(int docNumber,
    String field,
    TermVectorMapper mapper) throws IOException
    Loads the term vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.
 abstract public TermFreqVector[] getTermFreqVectors(int docNumber) throws IOException
    Returns an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields exist, the method returns null. The term vectors that are returned may be of type TermFreqVector, or of type TermPositionVector if positions or offsets have been stored.
 public int getTermInfosIndexDivisor() 

    For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor.

 public long getVersion() 
    Version number when this IndexReader was opened. Not implemented in the IndexReader base class.
 abstract public boolean hasDeletions()
    Returns true if any documents have been deleted.
 public boolean hasNorms(String field) throws IOException 
    Returns true if there are norms stored for this field.
 protected synchronized  void incRef() 
    Increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as no other IndexReader is referencing it anymore.
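The contract described for incRef() and decRef() can be illustrated with a small stand-alone model. This is a sketch only: RefCountedResource is a hypothetical class, not Lucene code, and it assumes that the count starts at 1 for the creator and that the resource closes when the count reaches 0.

```java
/** Hypothetical stand-in for a ref-counted reader; not Lucene code. */
class RefCountedResource {
    private int refCount = 1;        // assumption: the creator holds the first reference
    private boolean closed = false;

    /** Registers one more owner of this resource. */
    synchronized void incRef() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        refCount++;
    }

    /** Releases one owner; the last release closes the resource. */
    synchronized void decRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("decRef called more often than incRef");
        }
        refCount--;
        if (refCount == 0) {
            closed = true;           // a real reader would commit pending changes and release files here
        }
    }

    synchronized int getRefCount() { return refCount; }

    synchronized boolean isClosed() { return closed; }

    public static void main(String[] args) {
        RefCountedResource r = new RefCountedResource();
        r.incRef();                  // a second owner starts sharing the resource
        r.decRef();                  // the first owner is done; the resource stays open
        System.out.println("closed after first decRef: " + r.isClosed());
        r.decRef();                  // the last owner is done; the resource closes
        System.out.println("closed after second decRef: " + r.isClosed());
    }
}
```

The point of the design is that no owner needs to know about the others: each one pairs its own incRef() with a decRef(), and only the final release triggers the close.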
 public static boolean indexExists(String directory) 
    Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
 public static boolean indexExists(File directory) 
    Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
 public static boolean indexExists(Directory directory) throws IOException 
    Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
 public boolean isCurrent() throws IOException, CorruptIndexException 
    Check whether this IndexReader is still using the current (i.e., most recently committed) version of the index. If a writer has committed any changes to the index since this reader was opened, this will return false, in which case you must open a new IndexReader in order to see the changes. See the description of the autoCommit flag which controls when the IndexWriter actually commits changes to the index.

    Not implemented in the IndexReader base class.
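The relationship between the index version number and isCurrent() can be sketched with a toy model. VersionedIndexModel and its nested Reader are hypothetical names, not Lucene classes. The sketch assumes that a reader remembers the version it saw when it was opened and that isCurrent() compares that value with the latest committed version.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of a versioned index; not Lucene code. */
class VersionedIndexModel {
    private final AtomicLong version;

    /** The version number starts out as a timestamp. */
    VersionedIndexModel(long initialTimestampMillis) {
        this.version = new AtomicLong(initialTimestampMillis);
    }

    /** Latest committed version of the index. */
    long getCurrentVersion() { return version.get(); }

    /** Each committed change increases the version by one. */
    void commitChange() { version.incrementAndGet(); }

    /** Opens a reader that captures the version at open time. */
    Reader open() { return new Reader(); }

    /** Hypothetical reader that sees a fixed snapshot of the index. */
    final class Reader {
        private final long versionAtOpen = getCurrentVersion();

        long getVersion() { return versionAtOpen; }

        boolean isCurrent() { return versionAtOpen == getCurrentVersion(); }
    }

    public static void main(String[] args) {
        VersionedIndexModel index = new VersionedIndexModel(System.currentTimeMillis());
        VersionedIndexModel.Reader reader = index.open();
        System.out.println("current before commit: " + reader.isCurrent());
        index.commitChange();
        System.out.println("current after commit: " + reader.isCurrent());
    }
}
```

In this model a stale reader never becomes current again; the caller has to obtain a fresh reader to see the committed changes.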

 abstract public boolean isDeleted(int n)
    Returns true if document n has been deleted.
 public static boolean isLocked(Directory directory) throws IOException 
    Returns true iff the index in the named directory is currently locked.
 public static boolean isLocked(String directory) throws IOException 
    Returns true iff the index in the named directory is currently locked.
 public boolean isOptimized() 
    Checks if the index is optimized (if it has a single segment and no deletions). Not implemented in the IndexReader base class.
 public static long lastModified(String directory) throws IOException, CorruptIndexException 
    Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.
 public static long lastModified(File fileDirectory) throws IOException, CorruptIndexException 
    Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.
 public static long lastModified(Directory directory2) throws IOException, CorruptIndexException 
    Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.
 public static  void main(String[] args) 
    Prints the filename and size of each file within a given compound file. Add the -extract flag to extract files to the current working directory. In order to make the extracted version of the index work, you have to copy the segments file from the compound index into the directory where the extracted files are stored.
 abstract public int maxDoc()
    Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
 abstract public byte[] norms(String field) throws IOException
    Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.
 abstract public  void norms(String field,
    byte[] bytes,
    int offset) throws IOException
    Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.
 abstract public int numDocs()
    Returns the number of documents in this index.
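How maxDoc(), numDocs(), isDeleted() and hasDeletions() relate to each other can be sketched with a toy model. DocNumberModel is a hypothetical class, not Lucene code. It assumes that a deleted document keeps its document number, so only numDocs() shrinks when a document is deleted.

```java
import java.util.BitSet;

/** Hypothetical model of document numbering with deletions; not Lucene code. */
class DocNumberModel {
    private final int maxDoc;                      // one greater than the largest document number
    private final BitSet deleted = new BitSet();   // bit n is set when document n is deleted

    DocNumberModel(int maxDoc) { this.maxDoc = maxDoc; }

    void deleteDocument(int docNum) {
        if (docNum < 0 || docNum >= maxDoc) {
            throw new IllegalArgumentException("docNum out of range: " + docNum);
        }
        deleted.set(docNum);
    }

    int maxDoc() { return maxDoc; }

    boolean isDeleted(int n) { return deleted.get(n); }

    boolean hasDeletions() { return !deleted.isEmpty(); }

    /** Live documents only: deleted numbers are still in range but no longer counted. */
    int numDocs() { return maxDoc - deleted.cardinality(); }

    public static void main(String[] args) {
        DocNumberModel index = new DocNumberModel(5);
        index.deleteDocument(2);
        // A per-document array is sized by maxDoc(), not numDocs(),
        // because the numbers of deleted documents remain valid indexes.
        float[] perDocValues = new float[index.maxDoc()];
        for (int doc = 0; doc < index.maxDoc(); doc++) {
            if (index.isDeleted(doc)) {
                continue;                          // skip deleted documents
            }
            perDocValues[doc] = 1.0f;
        }
        System.out.println("maxDoc=" + index.maxDoc() + " numDocs=" + index.numDocs());
    }
}
```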
 public static IndexReader open(String path) throws IOException, CorruptIndexException 
    Returns an IndexReader reading the index in an FSDirectory in the named path.
 public static IndexReader open(File path) throws IOException, CorruptIndexException 
    Returns an IndexReader reading the index in an FSDirectory in the named path.
 public static IndexReader open(Directory directory) throws IOException, CorruptIndexException 
    Returns an IndexReader reading the index in the given Directory.
 public static IndexReader open(Directory directory,
    IndexDeletionPolicy deletionPolicy) throws IOException, CorruptIndexException 
    Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy.
 public synchronized IndexReader reopen() throws IOException, CorruptIndexException 
    Refreshes an IndexReader if the index has changed since this instance was (re)opened.

    Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.

    If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.
    Note: The reopened reader instance and the old instance might share the same resources. For this reason, no index modification operations (e.g. deleteDocument(int), setNorm(int, String, byte)) should be performed using one of the readers until the old reader instance is closed. Otherwise, the behavior of the readers is undefined.

    You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:

    IndexReader reader = ...
    ...
    IndexReader newReader = reader.reopen();
    if (newReader != reader) {
      ... // reader was reopened
      reader.close();
    }
    reader = newReader;
    ...
    
 public final synchronized  void setNorm(int doc,
    String field,
    byte value) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException 
    Expert: Resets the normalization factor for the named field of the named document. The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization values when resetting this, one should base the new value upon the old.
 public  void setNorm(int doc,
    String field,
    float value) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException 
    Expert: Resets the normalization factor for the named field of the named document.
 public  void setTermInfosIndexDivisor(int indexDivisor) throws IllegalStateException 

    For IndexReader implementations that use TermInfosReader to read terms, this sets the indexDivisor to subsample the number of indexed terms loaded into memory. This has the same effect as IndexWriter#setTermIndexInterval except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1.

    NOTE: you must call this before the term index is loaded. If the index is already loaded, an IllegalStateException is thrown.
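The "one in every N*termIndexInterval terms" rule can be turned into a rough memory estimate. The sketch below is plain arithmetic, not Lucene code. TermIndexSampling is a hypothetical helper, and the term count and the interval of 128 used in main are assumptions chosen for illustration. It also assumes that sampling starts at the first term, so the result is a ceiling division.

```java
/** Back-of-the-envelope arithmetic for the term index divisor; not Lucene code. */
class TermIndexSampling {

    /** Number of terms held in memory when one in every (interval * divisor) terms is loaded. */
    static long loadedIndexTerms(long totalTerms, int termIndexInterval, int indexDivisor) {
        if (totalTerms < 0 || termIndexInterval < 1 || indexDivisor < 1) {
            throw new IllegalArgumentException("arguments must be non-negative and intervals >= 1");
        }
        long step = (long) termIndexInterval * indexDivisor;
        // Ceiling division: terms 0, step, 2*step, ... are the ones kept in memory.
        return (totalTerms + step - 1) / step;
    }

    public static void main(String[] args) {
        long totalTerms = 10_000_000L;   // assumed size of the term dictionary
        int interval = 128;              // assumed termIndexInterval chosen at indexing time
        for (int divisor : new int[] {1, 2, 4}) {
            System.out.println("divisor " + divisor + " -> "
                    + loadedIndexTerms(totalTerms, interval, divisor) + " terms in memory");
        }
    }
}
```

Under these assumptions, raising the divisor from 1 to 4 cuts the number of in-memory terms to roughly a quarter, at the price of scanning up to four times as many on-disk terms per lookup.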
 abstract public TermDocs termDocs() throws IOException
    Returns an unpositioned TermDocs enumerator.
 public TermDocs termDocs(Term term) throws IOException 
    Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. Thus, this method implements the mapping:

      Term    =>    <docNum, freq>*

    The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
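The mapping Term => <docNum, freq>* can be made concrete with a toy in-memory structure. ToyPostings is a hypothetical class, not Lucene code. It assumes that documents are whitespace-separated strings and that document numbers are assigned in insertion order starting at 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory postings illustrating Term => <docNum, freq>*; not Lucene code. */
class ToyPostings {
    // term text -> (docNum -> freq); the inner TreeMap keeps document numbers ascending
    private final Map<String, TreeMap<Integer, Integer>> postings = new TreeMap<>();
    private int nextDoc = 0;

    /** Adds a document made of whitespace-separated terms and returns its document number. */
    int addDocument(String text) {
        int docNum = nextDoc++;
        for (String term : text.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;                      // an empty document yields one empty token
            }
            postings.computeIfAbsent(term, k -> new TreeMap<>()).merge(docNum, 1, Integer::sum);
        }
        return docNum;
    }

    /** Returns {docNum, freq} pairs for the term, ordered by document number. */
    List<int[]> termDocs(String term) {
        List<int[]> result = new ArrayList<>();
        TreeMap<Integer, Integer> docs = postings.get(term);
        if (docs != null) {
            for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
                result.add(new int[] {e.getKey(), e.getValue()});
            }
        }
        return result;
    }

    /** Number of documents containing the term. */
    int docFreq(String term) { return termDocs(term).size(); }

    public static void main(String[] args) {
        ToyPostings index = new ToyPostings();
        index.addDocument("a b a");
        index.addDocument("c");
        index.addDocument("b a");
        for (int[] posting : index.termDocs("a")) {
            System.out.println("doc=" + posting[0] + " freq=" + posting[1]);
        }
    }
}
```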

 abstract public TermPositions termPositions() throws IOException
 public TermPositions termPositions(Term term) throws IOException 
    Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

      Term    =>    <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)> >*

    This positional information facilitates phrase and proximity searching.

    The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
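Why positions help with phrase searching can be shown with a small stand-alone sketch. ToyPositions is a hypothetical class, not Lucene code. It assumes zero-based token positions and whitespace tokenization, and it only handles two-term phrases.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy positional index used to match two-term phrases; not Lucene code. */
class ToyPositions {
    // term -> (docNum -> ascending token positions)
    private final Map<String, TreeMap<Integer, List<Integer>>> index = new HashMap<>();
    private int nextDoc = 0;

    /** Adds a whitespace-tokenized document and returns its document number. */
    int addDocument(String text) {
        int docNum = nextDoc++;
        String[] tokens = text.trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].isEmpty()) {
                continue;
            }
            index.computeIfAbsent(tokens[pos], k -> new TreeMap<>())
                 .computeIfAbsent(docNum, k -> new ArrayList<>())
                 .add(pos);
        }
        return docNum;
    }

    /** Documents in which 'second' occurs immediately after 'first', in document-number order. */
    List<Integer> phraseDocs(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        TreeMap<Integer, List<Integer>> firstDocs = index.get(first);
        TreeMap<Integer, List<Integer>> secondDocs = index.get(second);
        if (firstDocs == null || secondDocs == null) {
            return hits;
        }
        for (Map.Entry<Integer, List<Integer>> e : firstDocs.entrySet()) {
            List<Integer> secondPositions = secondDocs.get(e.getKey());
            if (secondPositions == null) {
                continue;                      // the second term is absent from this document
            }
            for (int pos : e.getValue()) {
                if (secondPositions.contains(pos + 1)) {
                    hits.add(e.getKey());      // adjacent occurrence found
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyPositions idx = new ToyPositions();
        idx.addDocument("quick brown fox");
        idx.addDocument("brown quick fox");
        idx.addDocument("the quick brown");
        System.out.println(idx.phraseDocs("quick", "brown"));
    }
}
```

Without positions, documents 0, 1 and 2 would all match a query for both terms; the position lists are what separate the adjacent occurrences from the reversed one.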

 abstract public TermEnum terms() throws IOException
    Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), TermEnum#next() must be called on the resulting enumeration before calling other methods such as TermEnum#term().
 abstract public TermEnum terms(Term t) throws IOException
    Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
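The seek behavior of terms(Term) can be sketched with a sorted set. ToyTermDictionary and its nested T class are hypothetical, not Lucene code. The sketch assumes that terms are ordered by field name first and by term text second.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy sorted term dictionary illustrating a seek to a start term; not Lucene code. */
class ToyTermDictionary {

    /** Hypothetical (field, text) pair ordered by field, then text. */
    static final class T implements Comparable<T> {
        final String field;
        final String text;

        T(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(T other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }

        @Override
        public String toString() { return field + ":" + text; }
    }

    private final TreeSet<T> terms = new TreeSet<>();

    void add(String field, String text) { terms.add(new T(field, text)); }

    /** Terms equal to or greater than the given term, in sorted order. */
    List<String> termsFrom(String field, String text) {
        List<String> out = new ArrayList<>();
        // tailSet(..., true) starts at the term itself or, if absent, at the next greater term
        for (T t : terms.tailSet(new T(field, text), true)) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyTermDictionary dict = new ToyTermDictionary();
        dict.add("body", "apple");
        dict.add("body", "cherry");
        dict.add("title", "banana");
        System.out.println(dict.termsFrom("body", "banana"));
    }
}
```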
 public final synchronized  void undeleteAll() throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException 
    Undeletes all documents currently marked as deleted in this index.
 public static  void unlock(Directory directory) throws IOException 
    Forcibly unlocks the index in the named directory.

    Caution: this should only be used by failure recovery code, when it is known that no other process nor thread is in fact currently accessing this index.