org.apache.lucene.analysis

Sub Packages:

org.apache.lucene.analysis.br   Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.cjk   Analyzer for Chinese, Japanese and Korean.  
org.apache.lucene.analysis.cn   Analyzer for Chinese.  
org.apache.lucene.analysis.cz   Analyzer for Czech.  
org.apache.lucene.analysis.de   Analyzer for German.  
org.apache.lucene.analysis.el   Analyzer for Greek.  
org.apache.lucene.analysis.fr   Analyzer for French.  
org.apache.lucene.analysis.ngram   Tokenizers and token filters that produce n-grams.
org.apache.lucene.analysis.nl   Analyzer for Dutch.  
org.apache.lucene.analysis.payloads   Provides various convenience classes for creating payloads on Tokens.  
org.apache.lucene.analysis.ru   Analyzer for Russian.  
org.apache.lucene.analysis.sinks   Implementations of the SinkTokenizer that might be useful.  
org.apache.lucene.analysis.snowball   org.apache.lucene.analysis.TokenFilter and org.apache.lucene.analysis.Analyzer implementations that use Snowball stemmers.  
org.apache.lucene.analysis.standard   A fast grammar-based tokenizer constructed with JFlex.  
org.apache.lucene.analysis.th   Analyzer for Thai.

Abstract Classes:

Analyzer   An Analyzer builds TokenStreams, which analyze text.
CharTokenizer   An abstract base class for simple, character-oriented tokenizers.
TokenFilter   A TokenFilter is a TokenStream whose input is another token stream.
TokenStream   A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text.
Tokenizer   A Tokenizer is a TokenStream whose input is a Reader.
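
The relationship between these abstract classes can be illustrated with a small, dependency-free sketch. It does not use the Lucene API: SketchTokenStream, SketchWhitespaceTokenizer, SketchLowerCaseFilter and TokenStreamSketch are hypothetical stand-ins, and tokens are reduced to plain strings. It only assumes the shape described above: a stream that hands out tokens one at a time, a tokenizer that reads from a Reader, a filter that wraps another stream, and an analyzer-like method that wires them together.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the TokenStream role (not the Lucene class).
abstract class SketchTokenStream {
    /** Returns the next token text, or null once the stream is exhausted. */
    abstract String next() throws IOException;
}

// Tokenizer role: a token stream whose input is a Reader; splits at whitespace.
class SketchWhitespaceTokenizer extends SketchTokenStream {
    private final Reader input;

    SketchWhitespaceTokenizer(Reader input) {
        this.input = input;
    }

    @Override
    String next() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace((char) c)) {
                if (sb.length() > 0) {
                    break; // whitespace after some characters ends the token
                }
                // otherwise we are still skipping leading whitespace
            } else {
                sb.append((char) c);
            }
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// TokenFilter role: a token stream whose input is another token stream.
class SketchLowerCaseFilter extends SketchTokenStream {
    private final SketchTokenStream input;

    SketchLowerCaseFilter(SketchTokenStream input) {
        this.input = input;
    }

    @Override
    String next() throws IOException {
        String token = input.next();
        return token == null ? null : token.toLowerCase(Locale.ROOT);
    }
}

class TokenStreamSketch {
    /** Analyzer role: builds the tokenizer/filter chain and collects its tokens. */
    static List<String> analyze(String text) {
        SketchTokenStream stream =
                new SketchLowerCaseFilter(new SketchWhitespaceTokenizer(new StringReader(text)));
        List<String> tokens = new ArrayList<String>();
        try {
            for (String t = stream.next(); t != null; t = stream.next()) {
                tokens.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Hello  Token Streams")); // prints [hello, token, streams]
    }
}
```

Because a filter is itself a token stream, filters can be stacked in any order on top of a single tokenizer, which is the design the class descriptions in this package rely on.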

Classes:

CachingTokenFilter   This class can be used if the Tokens of a TokenStream are intended to be consumed more than once.
CharArraySet   A simple class that stores Strings as char[]'s in a hash table.
CharArraySet.CharArraySetIterator   The Iterator for this set.
ISOLatin1AccentFilter   A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented equivalents.
KeywordAnalyzer   "Tokenizes" the entire stream as a single token.
KeywordTokenizer   Emits the entire input as a single token.
LengthFilter   Removes words that are too long or too short from the stream.
LetterTokenizer   A LetterTokenizer is a tokenizer that divides text at non-letters.
LowerCaseFilter   Normalizes token text to lower case.
LowerCaseTokenizer   LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
PerFieldAnalyzerWrapper   This analyzer is used to facilitate scenarios where different fields require different analysis techniques.
PorterStemFilter   Transforms the token stream as per the Porter stemming algorithm.
PorterStemmer   Stemmer implementing the Porter stemming algorithm. The Stemmer class transforms a word into its root form.
SimpleAnalyzer   An Analyzer that filters LetterTokenizer with LowerCaseFilter.
SinkTokenizer   A SinkTokenizer can be used to cache Tokens for use in an Analyzer.
StopAnalyzer   Filters LetterTokenizer with LowerCaseFilter and StopFilter.
StopAnalyzer.SavedStreams   Filters LowerCaseTokenizer with StopFilter.
StopFilter   Removes stop words from a token stream.
TeeTokenFilter   Works in conjunction with the SinkTokenizer to provide the ability to set aside tokens that have already been analyzed.
Token   A Token is an occurrence of a term from the text of a field.
WhitespaceAnalyzer   An Analyzer that uses WhitespaceTokenizer.
WhitespaceTokenizer   A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
WordlistLoader   Loader for text files that represent a list of stopwords.
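
As a rough illustration of the chain described for StopAnalyzer (divide text at non-letters, lower-case each token, then drop stop words), here is a plain-Java sketch. It deliberately avoids the Lucene classes: StopChainSketch, letterTokens, analyze and the STOP_WORDS list are all hypothetical names chosen for this example, and the stop list is an assumption rather than the one shipped with the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopChainSketch {
    // Hypothetical stop list for illustration only; not the library's default list.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    /** LetterTokenizer-like step: every maximal run of letters becomes one token. */
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString()); // a non-letter closes the current token
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the last token at end of input
        }
        return tokens;
    }

    /** Lower-cases each token, then removes those found in the stop list. */
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<String>();
        for (String token : letterTokens(text)) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, dogs]
        System.out.println(analyze("The Quick-Brown fox, and the 2 dogs"));
    }
}
```

Lower-casing before the stop-word lookup matters in this sketch: with the order reversed, "The" would not match the lower-case entry "the" and would survive the filter.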

All Test Cases:

TestAnalyzers
TestPerFieldAnalzyerWrapper
TestStopAnalyzer