lucene-2.3.2-src
org.apache.lucene.analysis
abstract public class: TokenStream
java.lang.Object
   org.apache.lucene.analysis.TokenStream

Direct Known Subclasses:
    ElisionFilter, EdgeNGramTokenFilter, LengthFilter, SnowballFilter, TokenTypeSinkTokenizer, TokenFilter, SynonymTokenFilter, StandardFilter, QPTestFilter, Tokenizer, TokenRangeSinkTokenizer, GreekLowerCaseFilter, NGramTokenFilter, LowerCaseFilter, NGramTokenizer, ChineseTokenizer, EdgeNGramTokenizer, RussianLetterTokenizer, WhitespaceTokenizer, TokenOffsetPayloadTokenFilter, LetterTokenizer, NumericPayloadTokenFilter, WikipediaTokenizer, DateRecognizerSinkTokenizer, RussianLowerCaseFilter, ISOLatin1AccentFilter, DutchStemFilter, CachingTokenFilter, TeeTokenFilter, StandardTokenizer, PorterStemFilter, FrenchStemFilter, BrazilianStemFilter, LowerCaseTokenizer, SinkTokenizer, TypeAsPayloadTokenFilter, GermanStemFilter, RussianStemFilter, KeywordTokenizer, PatternTokenizer, StopFilter, ThaiWordFilter, CJKTokenizer, FastStringTokenizer, ChineseFilter, CharTokenizer

A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text.

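This is an abstract class. Concrete subclasses are:

    • Tokenizer, a TokenStream whose input is a Reader; and
    • TokenFilter, a TokenStream whose input is another TokenStream.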

NOTE: subclasses must override at least one of #next() or #next(Token).
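
The note above can be illustrated with a minimal sketch. It assumes that the two default next variants delegate to each other, so that overriding either one makes both entry points work, while overriding neither would recurse without end. The classes below (OverrideSketch, WordStream, drain) are hypothetical stand-ins written for this illustration, not the real Lucene types, and the delegation shown is an assumption about the base class rather than a copy of its source. The real methods also declare throws IOException, which is omitted here because nothing performs I/O.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; NOT the real Lucene classes.
class OverrideSketch {
    static class Token {
        String text;
    }

    static abstract class TokenStream {
        // Assumed default: allocate a fresh Token and delegate to next(Token).
        Token next() {
            return next(new Token());
        }

        // Assumed default: ignore the argument and delegate back to next().
        // If a subclass overrides neither method, these two calls never terminate.
        Token next(Token result) {
            return next();
        }
    }

    // Overrides only next(Token); next() then works through the default above.
    static class WordStream extends TokenStream {
        private final String[] words;
        private int pos = 0;

        WordStream(String text) {
            String trimmed = text.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        @Override
        Token next(Token result) {
            if (pos == words.length) {
                return null; // end of stream
            }
            result.text = words[pos++];
            return result;
        }
    }

    // Collects the text of every token by calling the no-argument next().
    static List<String> drain(TokenStream ts) {
        List<String> out = new ArrayList<>();
        for (Token t = ts.next(); t != null; t = ts.next()) {
            out.add(t.text);
        }
        return out;
    }
}
```

In this sketch, drain only ever calls next(), yet every token is produced by the overridden next(Token).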
Method from org.apache.lucene.analysis.TokenStream Summary:
close,   next,   next,   reset
Methods from java.lang.Object:
equals,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.lucene.analysis.TokenStream Detail:
 public  void close() throws IOException 
    Releases resources associated with this stream.
 public Token next() throws IOException 
    Returns the next token in the stream, or null at EOS. The returned Token is a "full private copy" (not re-used across calls to next()), but this method is slower than calling #next(Token).
 public Token next(Token result) throws IOException 
    Returns the next token in the stream, or null at EOS. When possible, the input Token should be used as the returned Token (this gives the fastest tokenization performance), but this is not required and a new Token may be returned. Callers may re-use a single Token instance for successive calls to this method.

    This implicitly defines a "contract" between consumers (callers of this method) and producers (implementations of this method that are the source for tokens):

    • A consumer must fully consume the previously returned Token before calling this method again.
    • A producer must call Token#clear() before setting the fields in it and returning it.
    Note that a TokenFilter is considered a consumer.
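
    The contract above can be sketched as follows. This is a simplified model under stated assumptions, not Lucene code: Token, Stream, LetterProducer, LowerCaseSketchFilter and consume are hypothetical stand-ins, and the stand-in Token carries only two fields. The producer clears the Token before filling it, the filter acts as a consumer of its input and edits the Token in place, and the final consumer copies out what it needs before asking for the next token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-ins for illustration only; NOT the real Lucene classes.
class ReuseContractSketch {
    static class Token {
        String text;
        int positionIncrement = 1;

        // Return every field to its default so no stale state leaks through.
        void clear() {
            text = null;
            positionIncrement = 1;
        }
    }

    interface Stream {
        Token next(Token result); // returns null at end of stream
    }

    // Producer: the source of tokens. It clears the Token before filling it.
    static class LetterProducer implements Stream {
        private final String input;
        private int pos = 0;

        LetterProducer(String input) {
            this.input = input;
        }

        public Token next(Token result) {
            // Skip anything that is not a letter.
            while (pos < input.length() && !Character.isLetter(input.charAt(pos))) {
                pos++;
            }
            if (pos == input.length()) {
                return null;
            }
            int start = pos;
            while (pos < input.length() && Character.isLetter(input.charAt(pos))) {
                pos++;
            }
            result.clear(); // producer obligation from the contract
            result.text = input.substring(start, pos);
            return result;
        }
    }

    // Filter: a consumer of its input. It edits the returned Token in place.
    static class LowerCaseSketchFilter implements Stream {
        private final Stream input;

        LowerCaseSketchFilter(Stream input) {
            this.input = input;
        }

        public Token next(Token result) {
            Token t = input.next(result);
            if (t != null) {
                t.text = t.text.toLowerCase(Locale.ROOT);
            }
            return t;
        }
    }

    // Consumer: re-uses one Token and fully consumes it before the next call.
    static List<String> consume(Stream stream) {
        List<String> terms = new ArrayList<>();
        Token reusable = new Token();
        for (Token t = stream.next(reusable); t != null; t = stream.next(reusable)) {
            terms.add(t.text); // copy out what is needed; do not keep t itself
        }
        return terms;
    }
}
```

    Storing the returned Token references instead of their text would be a mistake in this sketch, since every call hands back the same instance with its fields overwritten.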
 public  void reset() throws IOException 
    Resets this stream to the beginning. This is an optional operation, so subclasses may or may not implement it. reset() is not needed for the standard indexing process. However, if the Tokens of a TokenStream are to be consumed more than once, the subclass must implement reset().
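
    As a rough sketch of consuming a stream more than once, the example below rewinds a stream backed by an in-memory list. ResetSketch, ListStream and drain are hypothetical names, the stand-in stream returns plain token text instead of Token objects, and the do-nothing default for reset() is an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; NOT the real Lucene classes.
class ResetSketch {
    static abstract class TokenStream {
        // Returns the next token's text, or null at end of stream.
        abstract String next();

        // Optional operation; assumed here to do nothing unless overridden.
        void reset() {
        }
    }

    // A stream over a fixed list that supports reset() by rewinding its index.
    static class ListStream extends TokenStream {
        private final List<String> tokens;
        private int index = 0;

        ListStream(List<String> tokens) {
            this.tokens = new ArrayList<>(tokens);
        }

        @Override
        String next() {
            return index < tokens.size() ? tokens.get(index++) : null;
        }

        @Override
        void reset() {
            index = 0;
        }
    }

    // Reads the stream until it reports end of stream.
    static List<String> drain(TokenStream ts) {
        List<String> out = new ArrayList<>();
        for (String t = ts.next(); t != null; t = ts.next()) {
            out.add(t);
        }
        return out;
    }
}
```

    Draining a ListStream a second time without calling reset() yields nothing in this sketch, because the stream is already at its end. Calling reset() first makes the same tokens available again.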