org.apache.lucene.analysis.standard
public class StandardAnalyzer
java.lang.Object
   org.apache.lucene.analysis.Analyzer
      org.apache.lucene.analysis.standard.StandardAnalyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
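The chain described above can be pictured with a small stand-alone sketch. It does not use Lucene: the regular-expression split is only a crude stand-in for StandardTokenizer, the StandardFilter step is omitted, and the class name, method name and stop list (AnalyzerChainSketch, analyze, DEMO_STOP_WORDS) are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified stand-in for the tokenize -> lowercase -> stop-filter chain.
// This is NOT Lucene code; all names here are assumptions for illustration.
class AnalyzerChainSketch {

    // Hypothetical stop list; not the actual contents of STOP_WORDS.
    static final Set<String> DEMO_STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "and", "of", "the"));

    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        // Stage 1: crude tokenization on runs of characters that are neither
        // letters nor digits (the real tokenizer uses a richer grammar).
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Stage 2: lowercase each token.
            String lower = token.toLowerCase(Locale.ROOT);
            // Stage 3: drop tokens that appear in the stop list.
            if (!stopWords.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, lazy, dog]
        System.out.println(analyze("The Quick Brown Fox and the Lazy Dog", DEMO_STOP_WORDS));
    }
}
```

Lowercasing before the stop-word lookup means a lowercase stop list also matches capitalized input such as "The".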
Field Summary
public static final String[] STOP_WORDS    An array containing some common English words that are usually not useful for searching.
public static final int DEFAULT_MAX_TOKEN_LENGTH    Default maximum allowed token length.
Constructor:
 public StandardAnalyzer() 
    Builds an analyzer with the default stop words (STOP_WORDS).
 public StandardAnalyzer(Set stopWords) 
    Builds an analyzer with the given stop words.
 public StandardAnalyzer(String[] stopWords) 
    Builds an analyzer with the given stop words.
 public StandardAnalyzer(File stopwords) throws IOException 
    Builds an analyzer with the stop words from the given file.
    Also see:
    WordlistLoader#getWordSet(File)
 public StandardAnalyzer(Reader stopwords) throws IOException 
    Builds an analyzer with the stop words from the given reader.
    Also see:
    WordlistLoader#getWordSet(Reader)
 public StandardAnalyzer(boolean replaceInvalidAcronym) 
 public StandardAnalyzer(Reader stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(File stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(String[] stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(Set stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
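The File and Reader constructors above take their stop words from an external word list. As a rough sketch of what loading such a list might look like, the stand-alone code below reads a Reader into a Set. It assumes one stop word per line with surrounding whitespace trimmed and blank lines skipped; that format is an assumption made here, not something this page specifies, and StopWordReaderSketch.readWordSet is a hypothetical helper, not the WordlistLoader API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

// Stand-alone illustration; does not call Lucene's WordlistLoader.
class StopWordReaderSketch {

    // Assumed format: one word per line, whitespace trimmed, blank lines skipped.
    static Set<String> readWordSet(Reader reader) throws IOException {
        Set<String> words = new HashSet<>();
        BufferedReader buffered = new BufferedReader(reader);
        String line;
        while ((line = buffered.readLine()) != null) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Set<String> words = readWordSet(new StringReader("the\n  and \n\nof\n"));
        System.out.println(words.size());          // prints 3
        System.out.println(words.contains("and")); // prints true
    }
}
```

A Set is a natural target type here because stop filtering only needs fast membership checks, and duplicates in the file collapse automatically.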
Methods from org.apache.lucene.analysis.standard.StandardAnalyzer Summary:
getMaxTokenLength, isReplaceInvalidAcronym, reusableTokenStream, setMaxTokenLength, setReplaceInvalidAcronym, tokenStream
Methods from org.apache.lucene.analysis.Analyzer:
getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream, tokenStream
Methods from java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods from org.apache.lucene.analysis.standard.StandardAnalyzer Detail:
 public int getMaxTokenLength() 
 public boolean isReplaceInvalidAcronym() 
 public TokenStream reusableTokenStream(String fieldName,
    Reader reader) throws IOException 
 public void setMaxTokenLength(int length) 
    Set maximum allowed token length. If a token is seen that exceeds this length then it is discarded. This setting only takes effect the next time tokenStream or reusableTokenStream is called.
 public void setReplaceInvalidAcronym(boolean replaceInvalidAcronym) 
 public TokenStream tokenStream(String fieldName,
    Reader reader)
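The rule documented for setMaxTokenLength, that a token exceeding the limit is discarded rather than truncated, can be illustrated with a stand-alone sketch. The helper below is hypothetical and does not use Lucene; the limit of 5 is chosen only to keep the example short and is not the value of DEFAULT_MAX_TOKEN_LENGTH.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration of "discard tokens longer than the limit".
// Not Lucene code; the names are assumptions for this sketch.
class MaxTokenLengthSketch {

    static List<String> dropLongTokens(List<String> tokens, int maxTokenLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Only tokens that exceed the limit are dropped, so a token whose
            // length equals the limit is kept.
            if (token.length() <= maxTokenLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("short", "supercalifragilistic", "ok");
        // prints [short, ok]
        System.out.println(dropLongTokens(tokens, 5));
    }
}
```

Because over-long tokens are dropped whole, lowering the limit can remove terms from the resulting token stream entirely instead of shortening them.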