A
- Analyzer - class org.apache.lucene.analysis.Analyzer.
- An Analyzer builds TokenStreams, which analyze text.
- Analyzer() - Constructor for class org.apache.lucene.analysis.Analyzer
- add(char) - Method in class org.apache.lucene.analysis.PorterStemmer
- Add a character to the word being stemmed.
- addAnalyzer(String, Analyzer) - Method in class org.apache.lucene.analysis.PerFieldAnalyzerWrapper
- Defines an analyzer to use for the specified field.
- analyzerMap - Variable in class org.apache.lucene.analysis.PerFieldAnalyzerWrapper
- assertAnalyzesTo(Analyzer, String, String[]) - Method in class org.apache.lucene.analysis.TestAnalyzers
B
- b - Variable in class org.apache.lucene.analysis.PorterStemmer
- buffer - Variable in class org.apache.lucene.analysis.CharTokenizer
- bufferIndex - Variable in class org.apache.lucene.analysis.CharTokenizer
C
- CharTokenizer - class org.apache.lucene.analysis.CharTokenizer.
- An abstract base class for simple, character-oriented tokenizers.
- CharTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.CharTokenizer
- close() - Method in class org.apache.lucene.analysis.TokenFilter
- Close the input TokenStream.
- close() - Method in class org.apache.lucene.analysis.TokenStream
- Releases resources associated with this stream.
- close() - Method in class org.apache.lucene.analysis.Tokenizer
- By default, closes the input Reader.
- cons(int) - Method in class org.apache.lucene.analysis.PorterStemmer
- cvc(int) - Method in class org.apache.lucene.analysis.PorterStemmer
D
- dataLen - Variable in class org.apache.lucene.analysis.CharTokenizer
- defaultAnalyzer - Variable in class org.apache.lucene.analysis.PerFieldAnalyzerWrapper
- dirty - Variable in class org.apache.lucene.analysis.PorterStemmer
- doublec(int) - Method in class org.apache.lucene.analysis.PorterStemmer
E
- ENGLISH_STOP_WORDS - Static variable in class org.apache.lucene.analysis.StopAnalyzer
- An array containing some common English words that are not usually useful for searching.
- EXTRA - Static variable in class org.apache.lucene.analysis.PorterStemmer
- endOffset - Variable in class org.apache.lucene.analysis.Token
- endOffset() - Method in class org.apache.lucene.analysis.Token
- Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text.
- ends(String) - Method in class org.apache.lucene.analysis.PorterStemmer
G
- getPositionIncrement() - Method in class org.apache.lucene.analysis.Token
- Returns the position increment of this Token.
- getResultBuffer() - Method in class org.apache.lucene.analysis.PorterStemmer
- Returns a reference to a character buffer containing the results of the stemming process.
- getResultLength() - Method in class org.apache.lucene.analysis.PorterStemmer
- Returns the length of the word resulting from the stemming process.
I
- INC - Static variable in class org.apache.lucene.analysis.PorterStemmer
- IO_BUFFER_SIZE - Static variable in class org.apache.lucene.analysis.CharTokenizer
- i - Variable in class org.apache.lucene.analysis.PorterStemmer
- inValidTokens - Variable in class org.apache.lucene.analysis.TestStopAnalyzer
- input - Variable in class org.apache.lucene.analysis.TokenFilter
- The source of tokens for this filter.
- input - Variable in class org.apache.lucene.analysis.Tokenizer
- The text source for this Tokenizer.
- ioBuffer - Variable in class org.apache.lucene.analysis.CharTokenizer
- isTokenChar(char) - Method in class org.apache.lucene.analysis.CharTokenizer
- Returns true iff a character should be included in a token.
- isTokenChar(char) - Method in class org.apache.lucene.analysis.LetterTokenizer
- Collects only characters which satisfy Character.isLetter(char).
- isTokenChar(char) - Method in class org.apache.lucene.analysis.WhitespaceTokenizer
- Collects only characters which do not satisfy Character.isWhitespace(char).
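The isTokenChar entries above describe a per-character predicate that decides what belongs to a token. The following is a minimal, self-contained sketch of that idea, not the Lucene source: the class CharTokenSketch, its Tok type, and the methods tokenize, letters, and whitespace are hypothetical names introduced for illustration. It also follows the offset convention documented for Token, where the end offset is one greater than the position of the last character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration only; this is not org.apache.lucene.analysis code.
class CharTokenSketch {
    // A token's text plus its offsets in the source string.
    // 'end' is one greater than the index of the token's last character.
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Groups maximal runs of characters accepted by the predicate into tokens.
    static List<Tok> tokenize(String input, IntPredicate isTokenChar) {
        List<Tok> out = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i <= input.length(); i++) {
            boolean inToken = i < input.length() && isTokenChar.test(input.charAt(i));
            if (inToken) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                out.add(new Tok(input.substring(start, i), start, i));
                start = -1;
            }
        }
        return out;
    }

    // Predicate in the style described for LetterTokenizer.
    static List<Tok> letters(String input) {
        return tokenize(input, c -> Character.isLetter(c));
    }

    // Predicate in the style described for WhitespaceTokenizer.
    static List<Tok> whitespace(String input) {
        return tokenize(input, c -> !Character.isWhitespace(c));
    }
}
```

Under these assumptions, only the predicate differs between the two variants; the scanning loop is shared, which is presumably the reason for an abstract character-oriented base class.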
J
- j - Variable in class org.apache.lucene.analysis.PorterStemmer
K
- k - Variable in class org.apache.lucene.analysis.PorterStemmer
- k0 - Variable in class org.apache.lucene.analysis.PorterStemmer
L
- LetterTokenizer - class org.apache.lucene.analysis.LetterTokenizer.
- A LetterTokenizer is a tokenizer that divides text at non-letters.
- LetterTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.LetterTokenizer
- Construct a new LetterTokenizer.
- LowerCaseFilter - class org.apache.lucene.analysis.LowerCaseFilter.
- Normalizes token text to lower case.
- LowerCaseFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.LowerCaseFilter
- LowerCaseTokenizer - class org.apache.lucene.analysis.LowerCaseTokenizer.
- LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
- LowerCaseTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.LowerCaseTokenizer
- Construct a new LowerCaseTokenizer.
M
- MAX_WORD_LEN - Static variable in class org.apache.lucene.analysis.CharTokenizer
- m() - Method in class org.apache.lucene.analysis.PorterStemmer
- main(String[]) - Static method in class org.apache.lucene.analysis.PorterStemmer
- Test program for demonstrating the Stemmer.
- makeStopSet(String[]) - Static method in class org.apache.lucene.analysis.StopFilter
- Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
- makeStopTable(String[]) - Static method in class org.apache.lucene.analysis.StopFilter
- Deprecated. Use StopFilter.makeStopSet(String[]) instead.
N
- next() - Method in class org.apache.lucene.analysis.CharTokenizer
- Returns the next token in the stream, or null at EOS.
- next() - Method in class org.apache.lucene.analysis.LowerCaseFilter
- next() - Method in class org.apache.lucene.analysis.PorterStemFilter
- Returns the next input Token, after being stemmed.
- next() - Method in class org.apache.lucene.analysis.StopFilter
- Returns the next input Token whose termText() is not a stop word.
- next() - Method in class org.apache.lucene.analysis.TokenStream
- Returns the next token in the stream, or null at EOS.
- normalize(char) - Method in class org.apache.lucene.analysis.CharTokenizer
- Called on each token character to normalize it before it is added to the token.
- normalize(char) - Method in class org.apache.lucene.analysis.LowerCaseTokenizer
- Collects only characters which satisfy Character.isLetter(char).
O
- offset - Variable in class org.apache.lucene.analysis.CharTokenizer
- org.apache.lucene.analysis - package org.apache.lucene.analysis
P
- PerFieldAnalyzerWrapper - class org.apache.lucene.analysis.PerFieldAnalyzerWrapper.
- This analyzer is used to facilitate scenarios where different fields require different analysis techniques.
- PerFieldAnalyzerWrapper(Analyzer) - Constructor for class org.apache.lucene.analysis.PerFieldAnalyzerWrapper
- Constructs with default analyzer.
- PorterStemFilter - class org.apache.lucene.analysis.PorterStemFilter.
- Transforms the token stream as per the Porter stemming algorithm.
- PorterStemFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.PorterStemFilter
- PorterStemmer - class org.apache.lucene.analysis.PorterStemmer.
- Stemmer, implementing the Porter Stemming Algorithm. The Stemmer class transforms a word into its root form.
- PorterStemmer() - Constructor for class org.apache.lucene.analysis.PorterStemmer
- positionIncrement - Variable in class org.apache.lucene.analysis.Token
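The PerFieldAnalyzerWrapper entries above (a default analyzer, an analyzerMap, and addAnalyzer(String, Analyzer)) suggest a lookup by field name with a fallback. The sketch below illustrates that dispatch pattern only, under loud assumptions: PerFieldSketch is a hypothetical class, analyzers are modeled as plain functions from text to a list of terms rather than as Lucene Analyzer objects, and the whitespace and keyword helpers are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; this is not org.apache.lucene.analysis code.
class PerFieldSketch {
    // Used for any field that has no specific analyzer registered.
    private final Function<String, List<String>> defaultAnalyzer;
    // Field name -> analyzer for that field.
    private final Map<String, Function<String, List<String>>> analyzerMap = new HashMap<>();

    PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Registers an analyzer for one field, replacing any earlier registration.
    void addAnalyzer(String fieldName, Function<String, List<String>> analyzer) {
        analyzerMap.put(fieldName, analyzer);
    }

    // Picks the field's analyzer if present, otherwise the default, and applies it.
    List<String> analyze(String fieldName, String text) {
        return analyzerMap.getOrDefault(fieldName, defaultAnalyzer).apply(text);
    }

    // Example analyzer (assumption): split on runs of whitespace.
    static List<String> whitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Example analyzer (assumption): keep the whole value as a single term.
    static List<String> keyword(String text) {
        return Arrays.asList(text);
    }
}
```

A typical use would register a special analyzer for an identifier-like field and let every other field fall through to the default, for example `new PerFieldSketch(PerFieldSketch::whitespace)` followed by `addAnalyzer("id", PerFieldSketch::keyword)`.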
R
- r(String) - Method in class org.apache.lucene.analysis.PorterStemmer
- reset() - Method in class org.apache.lucene.analysis.PorterStemmer
- reset() resets the stemmer so it can stem another word.
S
- SimpleAnalyzer - class org.apache.lucene.analysis.SimpleAnalyzer.
- An Analyzer that filters LetterTokenizer with LowerCaseFilter.
- SimpleAnalyzer() - Constructor for class org.apache.lucene.analysis.SimpleAnalyzer
- StopAnalyzer - class org.apache.lucene.analysis.StopAnalyzer.
- Filters LetterTokenizer with LowerCaseFilter and StopFilter.
- StopAnalyzer() - Constructor for class org.apache.lucene.analysis.StopAnalyzer
- Builds an analyzer which removes words in ENGLISH_STOP_WORDS.
- StopAnalyzer(String[]) - Constructor for class org.apache.lucene.analysis.StopAnalyzer
- Builds an analyzer which removes words in the provided array.
- StopFilter - class org.apache.lucene.analysis.StopFilter.
- Removes stop words from a token stream.
- StopFilter(TokenStream, String[]) - Constructor for class org.apache.lucene.analysis.StopFilter
- Constructs a filter which removes words from the input TokenStream that are named in the array of words.
- StopFilter(TokenStream, Hashtable) - Constructor for class org.apache.lucene.analysis.StopFilter
- Deprecated. Use StopFilter.StopFilter(TokenStream, Set) instead.
- StopFilter(TokenStream, Set) - Constructor for class org.apache.lucene.analysis.StopFilter
- Constructs a filter which removes words from the input TokenStream that are named in the Set.
- setPositionIncrement(int) - Method in class org.apache.lucene.analysis.Token
- Set the position increment.
- setUp() - Method in class org.apache.lucene.analysis.TestStopAnalyzer
- setto(String) - Method in class org.apache.lucene.analysis.PorterStemmer
- startOffset - Variable in class org.apache.lucene.analysis.Token
- startOffset() - Method in class org.apache.lucene.analysis.Token
- Returns this Token's starting offset, the position of the first character corresponding to this token in the source text.
- stem(String) - Method in class org.apache.lucene.analysis.PorterStemmer
- Stem a word provided as a String.
- stem(char[]) - Method in class org.apache.lucene.analysis.PorterStemmer
- Stem a word contained in a char[].
- stem(char[], int, int) - Method in class org.apache.lucene.analysis.PorterStemmer
- Stem a word contained in a portion of a char[] array.
- stem(char[], int) - Method in class org.apache.lucene.analysis.PorterStemmer
- Stem a word contained in a leading portion of a char[] array.
- stem() - Method in class org.apache.lucene.analysis.PorterStemmer
- Stem the word placed into the Stemmer buffer through calls to add().
- stem(int) - Method in class org.apache.lucene.analysis.PorterStemmer
- stemmer - Variable in class org.apache.lucene.analysis.PorterStemFilter
- step1() - Method in class org.apache.lucene.analysis.PorterStemmer
- step2() - Method in class org.apache.lucene.analysis.PorterStemmer
- step3() - Method in class org.apache.lucene.analysis.PorterStemmer
- step4() - Method in class org.apache.lucene.analysis.PorterStemmer
- step5() - Method in class org.apache.lucene.analysis.PorterStemmer
- step6() - Method in class org.apache.lucene.analysis.PorterStemmer
- stop - Variable in class org.apache.lucene.analysis.TestStopAnalyzer
- stopWords - Variable in class org.apache.lucene.analysis.StopAnalyzer
- stopWords - Variable in class org.apache.lucene.analysis.StopFilter
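The StopAnalyzer and StopFilter entries above describe a chain: divide text at non-letters, lower-case the tokens, then drop tokens found in a stop set. The following self-contained sketch shows that chain in one pass. It is an approximation written for this index, not the Lucene implementation: StopPipelineSketch and analyze are hypothetical names, makeStopSet merely mirrors the name of the indexed StopFilter method, and the stop words in the usage note are sample values, not the contents of ENGLISH_STOP_WORDS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not org.apache.lucene.analysis code.
class StopPipelineSketch {
    // Builds a Set from an array of stop words for fast membership checks.
    static Set<String> makeStopSet(String[] words) {
        return new HashSet<>(Arrays.asList(words));
    }

    // Splits at non-letters, lower-cases each character, and skips stop words.
    static List<String> analyze(String text, Set<String> stopSet) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i <= text.length(); i++) {
            if (i < text.length() && Character.isLetter(text.charAt(i))) {
                current.append(Character.toLowerCase(text.charAt(i)));
            } else if (current.length() > 0) {
                String term = current.toString();
                if (!stopSet.contains(term)) {
                    out.add(term);
                }
                current.setLength(0); // start collecting the next token
            }
        }
        return out;
    }
}
```

For example, with the sample stop set `makeStopSet(new String[] {"the", "and"})`, the input "The quick brown fox and the dog" yields the terms quick, brown, fox, dog. Because lower-casing happens before the lookup, the stop set in this sketch is expected to contain lower-case words.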
T
- TestAnalyzers - class org.apache.lucene.analysis.TestAnalyzers.
- TestAnalyzers(String) - Constructor for class org.apache.lucene.analysis.TestAnalyzers
- TestPerFieldAnalzyerWrapper - class org.apache.lucene.analysis.TestPerFieldAnalzyerWrapper.
- TestPerFieldAnalzyerWrapper() - Constructor for class org.apache.lucene.analysis.TestPerFieldAnalzyerWrapper
- TestStopAnalyzer - class org.apache.lucene.analysis.TestStopAnalyzer.
- TestStopAnalyzer(String) - Constructor for class org.apache.lucene.analysis.TestStopAnalyzer
- Token - class org.apache.lucene.analysis.Token.
- A Token is an occurrence of a term from the text of a field.
- Token(String, int, int) - Constructor for class org.apache.lucene.analysis.Token
- Constructs a Token with the given term text, and start & end offsets.
- Token(String, int, int, String) - Constructor for class org.apache.lucene.analysis.Token
- Constructs a Token with the given text, start and end offsets, & type.
- TokenFilter - class org.apache.lucene.analysis.TokenFilter.
- A TokenFilter is a TokenStream whose input is another token stream.
- TokenFilter() - Constructor for class org.apache.lucene.analysis.TokenFilter
- Deprecated.
- TokenFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.TokenFilter
- Construct a token stream filtering the given input.
- TokenStream - class org.apache.lucene.analysis.TokenStream.
- A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text.
- TokenStream() - Constructor for class org.apache.lucene.analysis.TokenStream
- Tokenizer - class org.apache.lucene.analysis.Tokenizer.
- A Tokenizer is a TokenStream whose input is a Reader.
- Tokenizer() - Constructor for class org.apache.lucene.analysis.Tokenizer
- Construct a tokenizer with null input.
- Tokenizer(Reader) - Constructor for class org.apache.lucene.analysis.Tokenizer
- Construct a token stream processing the given input.
- termText - Variable in class org.apache.lucene.analysis.Token
- termText() - Method in class org.apache.lucene.analysis.Token
- Returns the Token's term text.
- testDefaults() - Method in class org.apache.lucene.analysis.TestStopAnalyzer
- testNull() - Method in class org.apache.lucene.analysis.TestAnalyzers
- testPerField() - Method in class org.apache.lucene.analysis.TestPerFieldAnalzyerWrapper
- testSimple() - Method in class org.apache.lucene.analysis.TestAnalyzers
- testStop() - Method in class org.apache.lucene.analysis.TestAnalyzers
- testStopList() - Method in class org.apache.lucene.analysis.TestStopAnalyzer
- toString() - Method in class org.apache.lucene.analysis.PorterStemmer
- After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength(), which is generally more efficient.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.Analyzer
- Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(Reader) - Method in class org.apache.lucene.analysis.Analyzer
- Deprecated. Use tokenStream(String, Reader) instead.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.PerFieldAnalyzerWrapper
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.SimpleAnalyzer
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.StopAnalyzer
- Filters LowerCaseTokenizer with StopFilter.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.WhitespaceAnalyzer
- type - Variable in class org.apache.lucene.analysis.Token
- type() - Method in class org.apache.lucene.analysis.Token
- Returns this Token's lexical type.
V
- vowelinstem() - Method in class org.apache.lucene.analysis.PorterStemmer
W
- WhitespaceAnalyzer - class org.apache.lucene.analysis.WhitespaceAnalyzer.
- An Analyzer that uses WhitespaceTokenizer.
- WhitespaceAnalyzer() - Constructor for class org.apache.lucene.analysis.WhitespaceAnalyzer
- WhitespaceTokenizer - class org.apache.lucene.analysis.WhitespaceTokenizer.
- A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
- WhitespaceTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.WhitespaceTokenizer
- Construct a new WhitespaceTokenizer.