org.apache.lucene.analysis.standard
public class StandardAnalyzer
java.lang.Object
   org.apache.lucene.analysis.Analyzer
      org.apache.lucene.analysis.standard.StandardAnalyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
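The filter chain described above can be pictured with a small stdlib-only sketch. This is not Lucene code: the class `AnalysisChainSketch`, its methods, and the short stop list are hypothetical names chosen for illustration, the tokenizer is a crude split on non-alphanumeric characters rather than StandardTokenizer's grammar, and the StandardFilter stage is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the tokenizer -> lowercase -> stop-word chain.
// It only mirrors the order of the stages, not Lucene's actual behavior.
class AnalysisChainSketch {

    // Illustrative stop list (an assumption); the real STOP_WORDS array is longer.
    static final Set<String> STOP =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to"));

    // Stage 1: split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stages 2 and 3: lowercase each token, then drop stop words.
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String t : tokenize(text)) {
            String lower = t.toLowerCase(Locale.ROOT);
            if (!STOP.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, lazy, dog]
        System.out.println(analyze("The Quick Brown Fox and the Lazy Dog"));
    }
}
```

Lowercasing happens before the stop-word check, which is why a stop list for this kind of chain is normally kept in lowercase.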
Field Summary
public static final  String[] STOP_WORDS    An array containing some common English words that are usually not useful for searching. 
public static final  int DEFAULT_MAX_TOKEN_LENGTH    Default maximum allowed token length 
Constructor:
 public StandardAnalyzer() 
    Builds an analyzer with the default stop words (STOP_WORDS).
 public StandardAnalyzer(Set stopWords) 
    Builds an analyzer with the given stop words.
 public StandardAnalyzer(String[] stopWords) 
    Builds an analyzer with the given stop words.
 public StandardAnalyzer(File stopwords) throws IOException 
    Builds an analyzer with the stop words from the given file.
    Also see:
    WordlistLoader#getWordSet(File)
 public StandardAnalyzer(Reader stopwords) throws IOException 
    Builds an analyzer with the stop words from the given reader.
    Also see:
    WordlistLoader#getWordSet(Reader)
 public StandardAnalyzer(boolean replaceInvalidAcronym) 
 public StandardAnalyzer(Reader stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(File stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(String[] stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
 public StandardAnalyzer(Set stopwords,
    boolean replaceInvalidAcronym) throws IOException 
    Parameters:
    stopwords - The stopwords to use
    replaceInvalidAcronym - Set to true if this analyzer should replace mischaracterized acronyms in the StandardTokenizer. See https://issues.apache.org/jira/browse/LUCENE-1068
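The File and Reader constructors above delegate stop-word loading to WordlistLoader. As a rough picture of what such a loader does, here is a stdlib-only sketch that assumes the input holds one word per line. `WordListSketch` and `loadWords` are hypothetical names, not part of Lucene; skipping blank lines and rethrowing I/O failures as `UncheckedIOException` are choices made for this sketch only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a one-word-per-line stop list loader (not Lucene code).
class WordListSketch {

    // Reads every line, trims surrounding whitespace, and collects the words in a set.
    // Blank lines are skipped; that is an assumption of this sketch.
    static Set<String> loadWords(Reader reader) {
        Set<String> words = new HashSet<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            // Wrapped only to keep the example short; the constructors above declare IOException.
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        Set<String> stop = loadWords(new StringReader("the\n  and \n\nof\n"));
        System.out.println(stop.size());          // prints 3
        System.out.println(stop.contains("and")); // prints true
    }
}
```

Because the analyzer lowercases tokens before the stop filter runs, the words in such a file should themselves be lowercase.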
Method from org.apache.lucene.analysis.standard.StandardAnalyzer Summary:
getDefaultReplaceInvalidAcronym,   getMaxTokenLength,   isReplaceInvalidAcronym,   reusableTokenStream,   setDefaultReplaceInvalidAcronym,   setMaxTokenLength,   setReplaceInvalidAcronym,   tokenStream
Methods from org.apache.lucene.analysis.Analyzer:
close,   getPositionIncrementGap,   getPreviousTokenStream,   reusableTokenStream,   setPreviousTokenStream,   tokenStream
Methods from java.lang.Object:
equals,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.lucene.analysis.standard.StandardAnalyzer Detail:
 public static boolean getDefaultReplaceInvalidAcronym() 
Deprecated. This will be removed (hardwired to true) in 3.0.

 public int getMaxTokenLength() 
 public boolean isReplaceInvalidAcronym() 
Deprecated. This will be removed (hardwired to true) in 3.0.

 public TokenStream reusableTokenStream(String fieldName,
    Reader reader) throws IOException 
 public static  void setDefaultReplaceInvalidAcronym(boolean replaceInvalidAcronym) 
Deprecated. This will be removed (hardwired to true) in 3.0.

 public  void setMaxTokenLength(int length) 
    Sets the maximum allowed token length. Any token longer than this is discarded. The setting takes effect only the next time tokenStream or reusableTokenStream is called.
 public  void setReplaceInvalidAcronym(boolean replaceInvalidAcronym) 
Deprecated. This will be removed (hardwired to true) in 3.0.

 public TokenStream tokenStream(String fieldName,
    Reader reader)
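
The interaction between setMaxTokenLength and tokenStream can be sketched without Lucene. In the hypothetical class below, each stream takes a snapshot of the limit when it is created, so a later setter call affects only streams created afterwards, and tokens longer than the limit are dropped. The class name, the whitespace tokenizer, and the starting value of 255 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of the "takes effect on the next stream" rule (not Lucene code).
class MaxTokenLengthSketch {

    private int maxTokenLength = 255; // assumed starting value

    void setMaxTokenLength(int length) {
        this.maxTokenLength = length;
    }

    // Returns a lazy token iterator that uses the limit in force at creation time.
    Iterator<String> tokenStream(String text) {
        final int limit = maxTokenLength; // snapshot; later setter calls do not change it
        final Iterator<String> raw = Arrays.asList(text.trim().split("\\s+")).iterator();
        return new Iterator<String>() {
            private String next = advance();

            // Skips over-long (and empty) tokens and returns the next one to keep.
            private String advance() {
                while (raw.hasNext()) {
                    String t = raw.next();
                    if (!t.isEmpty() && t.length() <= limit) {
                        return t;
                    }
                }
                return null;
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String t = next;
                next = advance();
                return t;
            }
        };
    }

    // Collects the remaining tokens of a stream into a list.
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        MaxTokenLengthSketch analyzer = new MaxTokenLengthSketch();
        Iterator<String> before = analyzer.tokenStream("short extraordinarily tiny");
        analyzer.setMaxTokenLength(5);
        Iterator<String> after = analyzer.tokenStream("short extraordinarily tiny");
        System.out.println(drain(before)); // prints [short, extraordinarily, tiny]
        System.out.println(drain(after));  // prints [short, tiny]
    }
}
```

A token whose length equals the limit is kept in this sketch; only tokens that exceed it are discarded.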