| Method from org.apache.lucene.index.IndexReader Detail: |
protected synchronized void acquireWriteLock() throws IOException {
/* NOOP */
}
Does nothing by default. Subclasses that require a write lock for
index modifications must implement this method. |
public final synchronized void close() throws IOException {
if (!closed) {
decRef();
closed = true;
}
}
Closes files associated with this index.
Also saves any new deletions to disk.
No other methods should be called after this has been called. |
protected final synchronized void commit() throws IOException {
if(hasChanges){
doCommit();
}
hasChanges = false;
}
Commits changes resulting from delete, undeleteAll, or
setNorm operations.
If an exception is hit, then either no changes or all
changes will have been committed to the index
(transactional semantics). |
protected synchronized void decRef() throws IOException {
assert refCount > 0;
if (refCount == 1) {
commit();
doClose();
}
refCount--;
}
Decreases the refCount of this IndexReader instance. If the refCount drops
to 0, then pending changes are committed to the index and this reader is closed. |
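The reference-counting behavior described above can be sketched in isolation. This is a minimal model, not Lucene code: `RefCountedResource`, `markChanged`, and the `commits`/`closes` counters are hypothetical names introduced only to make the commit-then-close step on the last `decRef()` visible.

```java
// Hypothetical stand-in for a reader; it only models the counting logic.
class RefCountedResource {
    private int refCount = 1;            // the creator holds the first reference
    private boolean hasChanges = false;
    int commits = 0;                     // demo-only counters
    int closes = 0;

    synchronized void incRef() {
        if (refCount <= 0) throw new IllegalStateException("already closed");
        refCount++;
    }

    synchronized void markChanged() {
        hasChanges = true;
    }

    synchronized void decRef() {
        if (refCount <= 0) throw new IllegalStateException("already closed");
        if (refCount == 1) {
            // Last reference: flush pending changes first, then release resources.
            if (hasChanges) {
                commits++;
                hasChanges = false;
            }
            closes++;
        }
        refCount--;
    }

    synchronized int getRefCount() {
        return refCount;
    }
}
```

With two holders, the first `decRef()` only lowers the count; the second one commits and closes.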
public final synchronized void deleteDocument(int docNum) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException {
ensureOpen();
acquireWriteLock();
hasChanges = true;
doDelete(docNum);
}
Deletes the document numbered docNum. Once a document is
deleted it will not appear in TermDocs or TermPositions enumerations.
Attempts to read its fields with the #document
method will result in an error. The presence of this document may still be
reflected in the #docFreq statistic, though
this will be corrected eventually as the index is further modified. |
public final int deleteDocuments(Term term) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException {
ensureOpen();
TermDocs docs = termDocs(term);
if (docs == null) return 0;
int n = 0;
try {
while (docs.next()) {
deleteDocument(docs.doc());
n++;
}
} finally {
docs.close();
}
return n;
}
Deletes all documents that have a given term indexed.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method.
See #deleteDocument(int) for information about when this deletion will
become effective. |
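The unique-ID idiom and the note about #docFreq can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: `ToyIndex`, `addPosting`, and the `"field:text"` string keys are hypothetical and stand in for Term objects and real posting lists.

```java
// Toy model, not Lucene: one posting list (array of document numbers) per term.
class ToyIndex {
    private final java.util.Map<String, int[]> postings = new java.util.HashMap<String, int[]>();
    private final java.util.BitSet deleted = new java.util.BitSet();

    void addPosting(String term, int... docs) {
        postings.put(term, docs);
    }

    // Marks every live document in the term's posting list and returns how many were marked.
    int deleteDocuments(String term) {
        int[] docs = postings.get(term);
        if (docs == null) return 0;
        int n = 0;
        for (int doc : docs) {
            if (!deleted.get(doc)) {
                deleted.set(doc);
                n++;
            }
        }
        return n;
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }

    // Posting lists are not rewritten on delete in this model,
    // so the statistic still counts deleted documents.
    int docFreq(String term) {
        int[] docs = postings.get(term);
        return docs == null ? 0 : docs.length;
    }
}
```

Deleting by the ID term removes exactly one document, while the document frequency of another term that the document contains stays unchanged until the postings are rebuilt.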
public Directory directory() {
ensureOpen();
if (null != directory) {
return directory;
} else {
throw new UnsupportedOperationException("This reader does not support this method.");
}
}
Returns the directory associated with this index. The default
implementation returns the directory specified by subclasses when
delegating to the IndexReader(Directory) constructor, or throws an
UnsupportedOperationException if one was not specified. |
abstract protected void doClose() throws IOException
|
abstract protected void doCommit() throws IOException
|
abstract protected void doDelete(int docNum) throws IOException, CorruptIndexException
|
abstract protected void doSetNorm(int doc,
String field,
byte value) throws IOException, CorruptIndexException
Implements setNorm in subclass. |
abstract protected void doUndeleteAll() throws IOException, CorruptIndexException
Implements actual undeleteAll() in subclass. |
abstract public int docFreq(Term t) throws IOException
Returns the number of documents containing the term t. |
public Document document(int n) throws IOException, CorruptIndexException {
ensureOpen();
return document(n, null);
}
Returns the stored fields of the nth
Document in this index. |
abstract public Document document(int n,
FieldSelector fieldSelector) throws IOException, CorruptIndexException
|
protected final void ensureOpen() throws AlreadyClosedException {
if (refCount <= 0) {
throw new AlreadyClosedException("this IndexReader is closed");
}
}
|
public final synchronized void flush() throws IOException {
ensureOpen();
commit();
}
|
public static long getCurrentVersion(String directory) throws IOException, CorruptIndexException {
return getCurrentVersion(new File(directory));
}
Reads version number from segments files. The version number is
initialized with a timestamp and then increased by one for each change of
the index. |
public static long getCurrentVersion(File directory) throws IOException, CorruptIndexException {
Directory dir = FSDirectory.getDirectory(directory);
long version = getCurrentVersion(dir);
dir.close();
return version;
}
Reads version number from segments files. The version number is
initialized with a timestamp and then increased by one for each change of
the index. |
public static long getCurrentVersion(Directory directory) throws IOException, CorruptIndexException {
return SegmentInfos.readCurrentVersion(directory);
}
Reads version number from segments files. The version number is
initialized with a timestamp and then increased by one for each change of
the index. |
abstract public Collection getFieldNames(IndexReader.FieldOption fldOption)
Get a list of unique field names that exist in this index and have the specified
field option information. |
synchronized int getRefCount() {
// for testing
return refCount;
}
|
abstract public TermFreqVector getTermFreqVector(int docNumber,
String field) throws IOException
Return a term frequency vector for the specified document and field. The
returned vector contains terms and frequencies for the terms in
the specified field of this document, if the field had the storeTermVector
flag set. If termvectors had been stored with positions or offsets, a
TermPositionsVector is returned. |
abstract public void getTermFreqVector(int docNumber,
TermVectorMapper mapper) throws IOException
Maps all the term vectors for all fields in a Document. |
abstract public void getTermFreqVector(int docNumber,
String field,
TermVectorMapper mapper) throws IOException
Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of
the TermFreqVector. |
abstract public TermFreqVector[] getTermFreqVectors(int docNumber) throws IOException
Return an array of term frequency vectors for the specified document.
The array contains a vector for each vectorized field in the document.
Each vector contains terms and frequencies for all terms in a given vectorized field.
If no such fields existed, the method returns null. The term vectors that are
returned may either be of type TermFreqVector or of type TermPositionsVector if
positions or offsets have been stored. |
public int getTermInfosIndexDivisor() {
throw new UnsupportedOperationException("This reader does not support this method.");
}
For IndexReader implementations that use
TermInfosReader to read terms, this returns the
current indexDivisor.
|
public long getVersion() {
throw new UnsupportedOperationException("This reader does not support this method.");
}
Version number when this IndexReader was opened. Not implemented in the IndexReader base class. |
abstract public boolean hasDeletions()
Returns true if any documents have been deleted. |
public boolean hasNorms(String field) throws IOException {
// backward compatible implementation.
// SegmentReader has an efficient implementation.
ensureOpen();
return norms(field) != null;
}
Returns true if there are norms stored for this field. |
protected synchronized void incRef() {
assert refCount > 0;
refCount++;
}
Increments the refCount of this IndexReader instance. RefCounts are used to determine
when a reader can be closed safely, i.e., as soon as no other IndexReader is referencing
it anymore. |
public static boolean indexExists(String directory) {
return indexExists(new File(directory));
}
Returns true if an index exists at the specified directory.
If the directory does not exist or if there is no index in it,
false is returned. |
public static boolean indexExists(File directory) {
return SegmentInfos.getCurrentSegmentGeneration(directory.list()) != -1;
}
Returns true if an index exists at the specified directory.
If the directory does not exist or if there is no index in it,
false is returned. |
public static boolean indexExists(Directory directory) throws IOException {
return SegmentInfos.getCurrentSegmentGeneration(directory) != -1;
}
Returns true if an index exists at the specified directory.
If the directory does not exist or if there is no index in it,
false is returned. |
public boolean isCurrent() throws IOException, CorruptIndexException {
throw new UnsupportedOperationException("This reader does not support this method.");
}
|
abstract public boolean isDeleted(int n)
Returns true if document n has been deleted. |
public static boolean isLocked(Directory directory) throws IOException {
return
directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked();
}
Returns true iff the index in the named directory is
currently locked. |
public static boolean isLocked(String directory) throws IOException {
Directory dir = FSDirectory.getDirectory(directory);
boolean result = isLocked(dir);
dir.close();
return result;
}
Returns true iff the index in the named directory is
currently locked. |
public boolean isOptimized() {
throw new UnsupportedOperationException("This reader does not support this method.");
}
Checks if the index is optimized (if it has a single segment and
no deletions). Not implemented in the IndexReader base class. |
public static long lastModified(String directory) throws IOException, CorruptIndexException {
return lastModified(new File(directory));
}
Returns the time the index in the named directory was last modified.
Do not use this to check whether the reader is still up-to-date, use
#isCurrent() instead. |
public static long lastModified(File fileDirectory) throws IOException, CorruptIndexException {
return ((Long) new SegmentInfos.FindSegmentsFile(fileDirectory) {
public Object doBody(String segmentFileName) {
return new Long(FSDirectory.fileModified(fileDirectory, segmentFileName));
}
}.run()).longValue();
}
Returns the time the index in the named directory was last modified.
Do not use this to check whether the reader is still up-to-date, use
#isCurrent() instead. |
public static long lastModified(Directory directory2) throws IOException, CorruptIndexException {
return ((Long) new SegmentInfos.FindSegmentsFile(directory2) {
public Object doBody(String segmentFileName) throws IOException {
return new Long(directory2.fileModified(segmentFileName));
}
}.run()).longValue();
}
Returns the time the index in the named directory was last modified.
Do not use this to check whether the reader is still up-to-date, use
#isCurrent() instead. |
public static void main(String[] args) {
String filename = null;
boolean extract = false;
for (int i = 0; i < args.length; ++i) {
if (args[i].equals("-extract")) {
extract = true;
} else if (filename == null) {
filename = args[i];
}
}
if (filename == null) {
System.out.println("Usage: org.apache.lucene.index.IndexReader [-extract] <cfsfile>");
return;
}
Directory dir = null;
CompoundFileReader cfr = null;
try {
File file = new File(filename);
String dirname = file.getAbsoluteFile().getParent();
filename = file.getName();
dir = FSDirectory.getDirectory(dirname);
cfr = new CompoundFileReader(dir, filename);
String [] files = cfr.list();
Arrays.sort(files); // sort the array of filenames so that the output is more readable
for (int i = 0; i < files.length; ++i) {
long len = cfr.fileLength(files[i]);
if (extract) {
System.out.println("extract " + files[i] + " with " + len + " bytes to local directory...");
IndexInput ii = cfr.openInput(files[i]);
FileOutputStream f = new FileOutputStream(files[i]);
// read and write with a small buffer, which is more efficient than reading byte by byte
byte[] buffer = new byte[1024];
int chunk = buffer.length;
while(len > 0) {
final int bufLen = (int) Math.min(chunk, len);
ii.readBytes(buffer, 0, bufLen);
f.write(buffer, 0, bufLen);
len -= bufLen;
}
f.close();
ii.close();
}
else
System.out.println(files[i] + ": " + len + " bytes");
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
finally {
try {
if (dir != null)
dir.close();
if (cfr != null)
cfr.close();
}
catch (IOException ioe) {
ioe.printStackTrace();
}
}
}
Prints the filename and size of each file within a given compound file.
Add the -extract flag to extract files to the current working directory.
In order to make the extracted version of the index work, you have to copy
the segments file from the compound index into the directory where the extracted files are stored. |
abstract public int maxDoc()
Returns one greater than the largest possible document number.
This may be used to, e.g., determine how big to allocate an array which
will have an element for every document number in an index. |
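A per-document array sized by maxDoc() might look like the following sketch. `PerDocArrayDemo` and `scoresByDoc` are hypothetical; the `maxDoc` value and the `isDeleted` flags are passed in as plain values here, whereas real code would obtain them from the reader's maxDoc() and isDeleted(int) methods.

```java
class PerDocArrayDemo {
    // Builds an array with one slot per possible document number (0 .. maxDoc-1).
    // Slots of deleted documents are left at 0 so callers can skip them.
    static float[] scoresByDoc(int maxDoc, boolean[] isDeleted, float defaultScore) {
        float[] scores = new float[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            scores[doc] = isDeleted[doc] ? 0f : defaultScore;
        }
        return scores;
    }
}
```

Sizing by maxDoc() rather than numDocs() matters because document numbers of deleted documents still occupy positions in the numbering.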
abstract public byte[] norms(String field) throws IOException
Returns the byte-encoded normalization factor for the named field of
every document. This is used by the search code to score documents. |
abstract public void norms(String field,
byte[] bytes,
int offset) throws IOException
Reads the byte-encoded normalization factor for the named field of every
document. This is used by the search code to score documents. |
abstract public int numDocs()
Returns the number of documents in this index. |
public static IndexReader open(String path) throws IOException, CorruptIndexException {
return open(FSDirectory.getDirectory(path), true, null);
}
Returns an IndexReader reading the index in an FSDirectory in the named
path. |
public static IndexReader open(File path) throws IOException, CorruptIndexException {
return open(FSDirectory.getDirectory(path), true, null);
}
Returns an IndexReader reading the index in an FSDirectory in the named
path. |
public static IndexReader open(Directory directory) throws IOException, CorruptIndexException {
return open(directory, false, null);
}
Returns an IndexReader reading the index in the given Directory. |
public static IndexReader open(Directory directory,
IndexDeletionPolicy deletionPolicy) throws IOException, CorruptIndexException {
return open(directory, false, deletionPolicy);
}
Expert: returns an IndexReader reading the index in the given
Directory, with a custom IndexDeletionPolicy. |
public synchronized IndexReader reopen() throws IOException, CorruptIndexException {
throw new UnsupportedOperationException("This reader does not support reopen().");
}
Refreshes an IndexReader if the index has changed since this instance
was (re)opened.
Opening an IndexReader is an expensive operation. This method can be used
to refresh an existing IndexReader to reduce these costs. This method
tries to only load segments that have changed or were created after the
IndexReader was (re)opened.
If the index has not changed since this instance was (re)opened, then this
call is a NOOP and returns this instance. Otherwise, a new instance is
returned. The old instance is not closed and remains usable.
Note: The re-opened reader instance and the old instance might share
the same resources. For this reason no index modification operations
(e.g. #deleteDocument(int), #setNorm(int, String, byte))
should be performed using one of the readers until the old reader instance
is closed. Otherwise, the behavior of the readers is undefined.
You can determine whether a reader was actually reopened by comparing the
old instance with the instance returned by this method:
IndexReader reader = ...
...
IndexReader newReader = reader.reopen();
if (newReader != reader) {
... // reader was reopened
reader.close();
}
reader = newReader;
...
|
public final synchronized void setNorm(int doc,
String field,
byte value) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException {
ensureOpen();
acquireWriteLock();
hasChanges = true;
doSetNorm(doc, field, value);
}
Expert: Resets the normalization factor for the named field of the named
document. The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization
values when resetting this, one should base the new value upon the old. |
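Basing the new value on the old one can be sketched as decode, scale, re-encode. This section does not show how Similarity.encodeNorm packs a float into a byte, so the encoding below (an 8-bit float with 3 mantissa bits and 5 exponent bits) is an assumption, and `NormSketch`, `decodeNorm`, and `rescale` are hypothetical names used only for illustration.

```java
class NormSketch {
    // ASSUMED encoding: 3 mantissa bits, 5 exponent bits, zero exponent at 15.
    static byte encodeNorm(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small < ((63 - 15) << 3)) {
            // Zero or negative maps to 0; positive underflow maps to the smallest positive value.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ((63 - 15) << 3) + 0x100) {
            return (byte) -1;   // overflow maps to the largest representable value
        }
        return (byte) (small - ((63 - 15) << 3));
    }

    static float decodeNorm(byte b) {
        if (b == 0) return 0.0f;
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Scales an existing norm by a factor, keeping the length normalization already folded into it.
    static byte rescale(byte oldNorm, float boostFactor) {
        return encodeNorm(decodeNorm(oldNorm) * boostFactor);
    }
}
```

Under this assumed encoding the round trip is lossy (0.3 decodes back as 0.25), which is one more reason to derive the new byte from the stored one rather than from a freshly computed float. The result of `rescale` would then be passed to setNorm(int, String, byte).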
public void setNorm(int doc,
String field,
float value) throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException {
ensureOpen();
setNorm(doc, field, Similarity.encodeNorm(value));
}
Expert: Resets the normalization factor for the named field of the named
document. |
public void setTermInfosIndexDivisor(int indexDivisor) throws IllegalStateException {
throw new UnsupportedOperationException("This reader does not support this method.");
}
For IndexReader implementations that use
TermInfosReader to read terms, this sets the
indexDivisor to subsample the number of indexed terms
loaded into memory. This has the same effect as IndexWriter#setTermIndexInterval except that setting
must be done at indexing time while this setting can be
set per reader. When set to N, then one in every
N*termIndexInterval terms in the index is loaded into
memory. By setting this to a value > 1 you can reduce
memory usage, at the expense of higher latency when
loading a TermInfo. The default value is 1.
NOTE: you must call this before the term
index is loaded. If the index is already loaded,
an IllegalStateException is thrown.
|
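The memory and latency trade-off of the index divisor can be estimated with simple arithmetic. The sketch below is a model under stated assumptions, not the TermInfosReader implementation: `TermIndexSketch` and both methods are hypothetical, and the interval value used in the usage note is only an example.

```java
class TermIndexSketch {
    // Number of index entries kept in memory when one in every
    // (indexDivisor * termIndexInterval) terms is loaded.
    static int loadedEntries(int numTerms, int termIndexInterval, int indexDivisor) {
        int step = termIndexInterval * indexDivisor;
        return (numTerms + step - 1) / step;   // ceiling division
    }

    // Worst-case number of terms to scan sequentially after jumping
    // to the nearest loaded entry at or before the target term.
    static int maxScan(int termIndexInterval, int indexDivisor) {
        return termIndexInterval * indexDivisor - 1;
    }
}
```

For one million terms and an interval of 128, a divisor of 1 keeps 7813 entries in memory, while a divisor of 4 keeps 1954 but raises the worst-case sequential scan from 127 to 511 terms.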
abstract public TermDocs termDocs() throws IOException
Returns an unpositioned TermDocs enumerator. |
public TermDocs termDocs(Term term) throws IOException {
ensureOpen();
TermDocs termDocs = termDocs();
termDocs.seek(term);
return termDocs;
}
Returns an enumeration of all the documents which contain
term. For each document, the document number and the frequency of
the term in that document are provided, for use in search scoring.
Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number
is greater than all that precede it in the enumeration. |
abstract public TermPositions termPositions() throws IOException
|
public TermPositions termPositions(Term term) throws IOException {
ensureOpen();
TermPositions termPositions = termPositions();
termPositions.seek(term);
return termPositions;
}
Returns an enumeration of all the documents which contain
term. For each document, in addition to the document number
and frequency of the term in that document, a list of all of the ordinal
positions of the term in the document is available. Thus, this method
implements the mapping:
Term => <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)>>*
This positional information facilitates phrase and proximity searching.
The enumeration is ordered by document number. Each document number is
greater than all that precede it in the enumeration. |
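How positions support phrase searching can be shown with a small adjacency check. This is a sketch on plain arrays, not Lucene's phrase scorer: `PhraseSketch` and `adjacent` are hypothetical, and the arrays stand in for the positions of two terms within one document, assumed to be sorted in ascending order.

```java
class PhraseSketch {
    // Returns true if some position of the second term directly follows
    // a position of the first term, i.e. the two terms form a phrase.
    static boolean adjacent(int[] firstPositions, int[] secondPositions) {
        int i = 0;
        int j = 0;
        while (i < firstPositions.length && j < secondPositions.length) {
            int want = firstPositions[i] + 1;
            if (secondPositions[j] == want) {
                return true;
            }
            if (secondPositions[j] < want) {
                j++;    // second term occurs too early, advance it
            } else {
                i++;    // second term is past the wanted slot, advance the first
            }
        }
        return false;
    }
}
```

A proximity search would relax the equality test to a bounded distance between the two positions.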
abstract public TermEnum terms() throws IOException
Returns an enumeration of all the terms in the index. The
enumeration is ordered by Term.compareTo(). Each term is greater
than all that precede it in the enumeration. Note that after
calling terms(), TermEnum#next() must be called
on the resulting enumeration before calling other methods such as
TermEnum#term(). |
abstract public TermEnum terms(Term t) throws IOException
Returns an enumeration of all terms starting at a given term. If
the given term does not exist, the enumeration is positioned at the
first term greater than the supplied term. The enumeration is
ordered by Term.compareTo(). Each term is greater than all that
precede it in the enumeration. |
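The seek-then-scan behavior of terms(Term) is what makes prefix enumeration cheap. The sketch below models it on a sorted array of plain strings. `TermSeekSketch`, `seek`, and `withPrefix` are hypothetical, and plain string ordering stands in for Term.compareTo(), which is a simplification.

```java
class TermSeekSketch {
    // Index of the given term if present, otherwise of the first greater term
    // (sortedTerms.length if every term is smaller).
    static int seek(String[] sortedTerms, String target) {
        int pos = java.util.Arrays.binarySearch(sortedTerms, target);
        return pos >= 0 ? pos : -(pos + 1);
    }

    // Collects all terms sharing a prefix: seek to the prefix, then scan forward
    // until the first term that no longer starts with it.
    static java.util.List<String> withPrefix(String[] sortedTerms, String prefix) {
        java.util.List<String> out = new java.util.ArrayList<String>();
        for (int i = seek(sortedTerms, prefix);
             i < sortedTerms.length && sortedTerms[i].startsWith(prefix);
             i++) {
            out.add(sortedTerms[i]);
        }
        return out;
    }
}
```

Because the enumeration is ordered, the scan can stop at the first non-matching term instead of visiting the whole term dictionary.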
public final synchronized void undeleteAll() throws IOException, StaleReaderException, CorruptIndexException, LockObtainFailedException {
ensureOpen();
acquireWriteLock();
hasChanges = true;
doUndeleteAll();
}
Undeletes all documents currently marked as deleted in this index. |
public static void unlock(Directory directory) throws IOException {
directory.makeLock(IndexWriter.WRITE_LOCK_NAME).release();
}
Forcibly unlocks the index in the named directory.
Caution: this should only be used by failure recovery code,
when it is known that no other process nor thread is in fact
currently accessing this index. |