Docjar: A Java Source and Document Engine

org.apache.lucene.analysis.* (45)
org.apache.lucene.demo.* (17)
org.apache.lucene.document.* (4)
org.apache.lucene.index.* (54)
org.apache.lucene.queryParser.* (10)
org.apache.lucene.search.* (91)
org.apache.lucene.store.* (10)
org.apache.lucene.util.* (8)

org.apache.lucene: Javadoc index of package org.apache.lucene.


Package Samples:

org.apache.lucene.analysis.standard: API and code to convert text into indexable tokens.  
org.apache.lucene.search.spans: Search over indices.  
org.apache.lucene.analysis
org.apache.lucene.demo.html
org.apache.lucene.demo
org.apache.lucene.analysis.ru
org.apache.lucene.document
org.apache.lucene.index
org.apache.lucene.queryParser
org.apache.lucene.search
org.apache.lucene.store
org.apache.lucene.util
org.apache.lucene.index.store
org.apache.lucene.analysis.cn

Classes:

Sort: Encapsulates sort criteria for returned hits. The fields used to determine sort order must be carefully chosen. Documents must contain a single term in such a field, and the value of the term should indicate the document's relative position in a given sort order. The field must be indexed, but should not be tokenized, and does not need to be stored (unless you happen to want it back with the rest of your document data). In other words: document.add(new Field("byNumber", Integer.toString(x), false, true, false)); Valid Types of Values: There are three possible kinds of term values which may be ...
Query: The abstract base class for queries. Instantiable subclasses are: TermQuery, MultiTermQuery, BooleanQuery, WildcardQuery, PhraseQuery, PrefixQuery, PhrasePrefixQuery, FuzzyQuery, RangeQuery, and org.apache.lucene.search.spans.SpanQuery. A parser for queries is contained in QueryParser.
MultiTermQuery: A Query that matches documents containing a subset of terms provided by a FilteredTermEnum enumeration. MultiTermQuery is not designed to be used by itself, because it is not initialized with a FilteredTermEnum enumeration; one needs to be provided. For example, WildcardQuery and FuzzyQuery extend MultiTermQuery to provide WildcardTermEnum and FuzzyTermEnum, respectively.
Similarity: Expert: Scoring API. Subclasses implement search scoring. The score of query q for document d is defined in terms of these methods as follows: score(q,d) = Σ (t in q) tf(t in d) * idf(t) * getBoost(t.field in d) * lengthNorm(t.field in d) * coord(q,d) * queryNorm(q)
QueryParser: This class is generated by JavaCC. The only method that clients should need to call is parse(). The syntax for query strings is as follows: A Query is a series of clauses. A clause may be prefixed by: a plus (+) or a minus (-) sign, indicating that the clause is required or prohibited respectively; or a term followed by a colon, indicating the field to be searched. This enables one to construct queries which search multiple fields. A clause may be either: a term, indicating all the documents that contain this term; or a nested query, enclosed in parentheses. Note that this may be used with ...
Document: Documents are the unit of indexing and search. A Document is a set of fields. Each field has a name and a textual value. A field may be stored with the document, in which case it is returned with search hits on the document. Thus each document should typically contain one or more stored fields which uniquely identify it. Note that fields which are not stored are not available in documents retrieved from the index, e.g. with Hits.doc(int), Searchable.doc(int) or IndexReader.document(int).
Weight: Expert: Calculate query weights and build query scorers. A Weight is constructed by a query, given a Searcher (Query.createWeight(Searcher)). The sumOfSquaredWeights() method is then called on the top-level query to compute the query normalization factor (Similarity.queryNorm(float)). This factor is then passed to normalize(float). At this point the weighting is complete and a scorer may be constructed by calling scorer(IndexReader).
DateField: Provides support for converting dates to strings and vice-versa. The strings are structured so that lexicographic sorting orders by date, which makes them suitable for use as field values and search terms. Note that you do not have to use this class, you can just save your dates as strings if lexicographic sorting orders them by date. This is the case for example for dates like yyyy-mm-dd hh:mm:ss (of course you can leave out the delimiter characters to save some space). The advantage with using such a format is that you can easily save dates with the required granularity, e.g. leaving out seconds. ...
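The point about lexicographic ordering can be illustrated without any Lucene classes. The following is a minimal sketch in plain Java, assuming a fixed-width, most-significant-first pattern with the delimiters left out; the class name SortableDates and its members are hypothetical and are not part of DateField.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortableDates {
    // Fixed-width fields, largest unit first, delimiters omitted to save space.
    static final DateTimeFormatter SECONDS = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");
    // Coarser granularity: the seconds field is simply left out.
    static final DateTimeFormatter MINUTES = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    /** Converts a date to a string whose lexicographic order follows date order. */
    static String toTerm(LocalDateTime when, DateTimeFormatter granularity) {
        return when.format(granularity);
    }

    /** Formats every date, then sorts the results as plain strings. */
    static List<String> sortAsStrings(List<LocalDateTime> dates, DateTimeFormatter granularity) {
        List<String> terms = new ArrayList<>();
        for (LocalDateTime d : dates) {
            terms.add(toTerm(d, granularity));
        }
        Collections.sort(terms); // ordinary String comparison, no date parsing
        return terms;
    }
}
```

All values written to one field should share a single granularity, since strings of different lengths no longer compare field by field.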
IndexReader: IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable. Concrete subclasses of IndexReader are usually constructed with a call to the static method open(java.lang.String). For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should ...
PorterStemFilter: Transforms the token stream as per the Porter stemming algorithm. Note: the input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer farther down the Tokenizer chain in order for this to work properly! To use this filter with other analyzers, you'll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you'd write an analyzer like this: class MyAnalyzer extends Analyzer { public final TokenStream tokenStream(String fieldName, Reader reader) { return new ...
CompoundFileWriter: Combines multiple files into a single compound file. The file format:
    VInt fileCount
    {Directory}: fileCount entries with the following structure:
        long dataOffset
        UTFString extension
    {File Data}: fileCount entries with the raw data of the corresponding file
The fileCount integer indicates how many files are contained in this compound file. The {directory} that follows has that many entries. Each directory entry contains an encoding identifier, a long pointer to the start of this file's data section, and a UTF String with that file's extension.
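The layout described above can be sketched in plain Java. This is an illustration under stated assumptions, not the CompoundFileWriter implementation: the class CompoundLayoutSketch is hypothetical, DataOutputStream.writeUTF stands in for the UTFString encoding (its length prefix may differ from the real one), and only the listed fields (fileCount, dataOffset, extension) are written.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

class CompoundLayoutSketch {
    // Variable-length int: seven bits per byte, low-order bits first,
    // high bit set on every byte except the last.
    static void writeVInt(DataOutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.writeByte(value);
    }

    // Writes fileCount and the directory; offsets advance by each file's length.
    static void writeHeader(DataOutputStream out, Map<String, byte[]> files, long firstOffset)
            throws IOException {
        writeVInt(out, files.size());
        long offset = firstOffset;
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            out.writeLong(offset);    // dataOffset
            out.writeUTF(e.getKey()); // extension (stand-in encoding, an assumption)
            offset += e.getValue().length;
        }
    }

    /** Packs extension -> data pairs into one byte array in the described layout. */
    static byte[] pack(Map<String, byte[]> files) {
        try {
            // Pass 1: write the header with dummy offsets only to measure its size.
            DataOutputStream probe = new DataOutputStream(new ByteArrayOutputStream());
            writeHeader(probe, files, 0L);
            long headerSize = probe.size();

            // Pass 2: write the real header, then the raw file data in the same order.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeHeader(out, files, headerSize);
            for (byte[] data : files.values()) {
                out.write(data);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The two-pass approach is one simple way to learn the data offsets before any data is written; a writer working on a seekable file could instead reserve the offset slots and fill them in afterwards.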
Token: A Token is an occurrence of a term from the text of a field. It consists of a term's text, the start and end offset of the term in the text of the field, and a type string. The start and end offsets permit applications to re-associate a token with its source text, e.g., to display highlighted query terms in a document browser, or to show matching text fragments in a KWIC (KeyWord In Context) display, etc. The type is an interned string, assigned by a lexical analyzer (a.k.a. tokenizer), naming the lexical or syntactic class that the token belongs to. For example an end of sentence marker token might ...
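How offsets let an application re-associate a token with its source text can be sketched in plain Java. The classes below are hypothetical stand-ins, not Lucene's Token; the sketch assumes the end offset is exclusive (one past the last character) and uses a simple letter-or-digit tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class OffsetSketch {
    /** Hypothetical token: normalized text plus its position in the source. */
    static final class Tok {
        final String text;
        final int start; // offset of the first character
        final int end;   // offset one past the last character (assumed convention)
        final String type;

        Tok(String text, int start, int end, String type) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Splits on non-alphanumeric characters and lower-cases each token. */
    static List<Tok> tokenize(String source) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        int n = source.length();
        while (i < n) {
            while (i < n && !Character.isLetterOrDigit(source.charAt(i))) i++;
            int start = i;
            while (i < n && Character.isLetterOrDigit(source.charAt(i))) i++;
            if (i > start) {
                String text = source.substring(start, i).toLowerCase(Locale.ROOT);
                out.add(new Tok(text, start, i, "word"));
            }
        }
        return out;
    }

    /** Wraps every occurrence of term in brackets, using offsets into the original text. */
    static String highlight(String source, String term) {
        StringBuilder sb = new StringBuilder();
        int last = 0;
        for (Tok t : tokenize(source)) {
            if (t.text.equals(term)) {
                sb.append(source, last, t.start);
                sb.append('[').append(source, t.start, t.end).append(']');
                last = t.end;
            }
        }
        return sb.append(source, last, source.length()).toString();
    }
}
```

Because the highlighter copies from the source by offset, the original capitalization survives even though matching is done on the lower-cased token text.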
CharStream: This interface describes a character stream that maintains line and column number positions of the characters. It also has the capability to back up the stream to some extent. An implementation of this interface is used in the TokenManager implementation generated by JavaCCParser. All the methods except backup can be implemented in any fashion. backup needs to be implemented correctly for the correct operation of the lexer. The rest of the methods are all used to get information like line number, column number and the String that constitutes a token and are not used by the lexer. Hence their implementation ...
FieldDoc: Expert: A ScoreDoc which also contains information about how to sort the referenced document. In addition to the document number and score, this object contains an array of values for the document from the field(s) used to sort. For example, if the sort criteria was to sort by fields "a", "b" then "c", the fields object array will have three elements, corresponding respectively to the term values for the document in fields "a", "b" and "c". The class of each element in the array will be either Integer, Float or String depending on the type of values in the terms of each field. Created: Feb 11, ...
IndexWriter: An IndexWriter creates and maintains an index. The third argument to the constructor determines whether a new index is created, or whether an existing index is opened for the addition of new documents. In either case, documents are added with the addDocument method. When finished adding documents, close should be called. If an index will not have more documents added for a while and optimal search performance is desired, then the optimize method should be called before the index is closed.
QueryFilter: Constrains search results to only match those which also match a provided query. Results are cached, so that searches after the first on the same index using this filter are much faster. This could be used, for example, with a RangeQuery on a suitably formatted date field to implement date filtering. One could re-use a single QueryFilter that matches, e.g., only documents modified within the last week. The QueryFilter and RangeQuery would only need to be reconstructed once per day.
TestPerFieldAnalzyerWrapper: Copyright 2004 The Apache Software Foundation. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
FieldInfo: Copyright 2004 The Apache Software Foundation. Licensed under the Apache License, Version 2.0.
English: Copyright 2004 The Apache Software Foundation. Licensed under the Apache License, Version 2.0.
SampleComparable: An example Comparable for use with the custom sort tests. It implements a comparable for "id" sort of values which consist of an alphanumeric part and a numeric part, such as: ABC-123, A-1, A-7, A-100, B-99999. Such values cannot be sorted as strings, since A-100 needs to come after A-7. It could be argued that the "ids" should be rewritten as A-0001, A-0100, etc. so they will sort as strings. That is a valid alternate way to solve it, but this is only supposed to be a simple test case. Created: Apr 21, 2004 5:34:47 PM
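The ordering problem described above can be shown with a small plain-Java Comparable. This is a sketch only: the class PartId is hypothetical and is not the SampleComparable source, and it assumes every id has the form letters, a dash, then digits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PartId implements Comparable<PartId> {
    final String raw;    // the id exactly as given, e.g. "A-100"
    final String alpha;  // part before the dash
    final int number;    // part after the dash, compared numerically

    PartId(String id) {
        int dash = id.indexOf('-');
        this.raw = id;
        this.alpha = id.substring(0, dash);
        this.number = Integer.parseInt(id.substring(dash + 1));
    }

    @Override
    public int compareTo(PartId other) {
        // Compare the alphabetic part as a string, then the numeric part as a number.
        int byAlpha = alpha.compareTo(other.alpha);
        return byAlpha != 0 ? byAlpha : Integer.compare(number, other.number);
    }

    /** Sorts raw id strings using the two-part comparison. */
    static List<String> sortIds(List<String> ids) {
        List<PartId> parsed = new ArrayList<>();
        for (String id : ids) {
            parsed.add(new PartId(id));
        }
        Collections.sort(parsed);
        List<String> out = new ArrayList<>();
        for (PartId p : parsed) {
            out.add(p.raw);
        }
        return out;
    }
}
```

With this comparison A-7 sorts before A-100, whereas a plain string sort places A-100 first because the character '1' precedes '7'.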
SortComparator: Abstract base class for sorting hits returned by a Query. This class should only be used if the other SortField types (SCORE, DOC, STRING, INT, FLOAT) do not provide an adequate sorting. It maintains an internal cache of values which could be quite large. The cache is an array of Comparable, one for each document in the index. There is a distinct Comparable for each unique term in the field - if some documents have the same term in the field, the cache array will have entries which reference the same Comparable. Created: Apr 21, 2004 5:08:38 PM
StandardAnalyzer: Filters StandardTokenizer with StandardFilter, org.apache.lucene.analysis.LowerCaseFilter and org.apache.lucene.analysis.StopFilter.
TokenStream: A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text. This is an abstract class. Concrete subclasses are: Tokenizer, a TokenStream whose input is a Reader; and TokenFilter, a TokenStream whose input is another TokenStream.
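The split between a source stage that reads raw text and filter stages that wrap another stream can be sketched with a few lines of plain Java. Everything here is a hypothetical stand-in rather than the Lucene classes: TokenSource plays the role of a token stream, the input is a String instead of a Reader for brevity, and the stop-word list is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.StringTokenizer;

class TokenChainSketch {
    /** Hypothetical stand-in for a token stream: next token text, or null at the end. */
    interface TokenSource {
        String next();
    }

    /** Tokenizer role: the input is raw text. */
    static TokenSource whitespaceSource(String text) {
        StringTokenizer st = new StringTokenizer(text);
        return () -> st.hasMoreTokens() ? st.nextToken() : null;
    }

    /** Filter role: the input is another stream; each token is lower-cased. */
    static TokenSource lowerCase(TokenSource in) {
        return () -> {
            String t = in.next();
            return t == null ? null : t.toLowerCase(Locale.ROOT);
        };
    }

    /** Filter role: skips tokens found in the stop set. */
    static TokenSource dropStopWords(TokenSource in, Set<String> stop) {
        return () -> {
            String t = in.next();
            while (t != null && stop.contains(t)) {
                t = in.next();
            }
            return t;
        };
    }

    /** Builds the chain source -> lower-case -> stop filter and collects the output. */
    static List<String> analyze(String text) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a", "of"));
        TokenSource chain = dropStopWords(lowerCase(whitespaceSource(text)), stop);
        List<String> out = new ArrayList<>();
        for (String t = chain.next(); t != null; t = chain.next()) {
            out.add(t);
        }
        return out;
    }
}
```

The stop filter sits after the lower-casing stage so that "The" is matched against the lower-case stop list, which mirrors why filter order matters in a chain.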
FilterIndexReader: A FilterIndexReader contains another IndexReader, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality. The class FilterIndexReader itself simply implements all abstract methods of IndexReader with versions that pass all requests to the contained index reader. Subclasses of FilterIndexReader may further override some of these methods and may also provide additional methods and fields.
