Removes stop words from a token stream.
| Constructor: |
public StopFilter(TokenStream input,
                  String[] stopWords) {
    this(input, stopWords, false);
}
Constructs a token stream filtering the given input; stop-word matching is case-sensitive (ignoreCase is false). |
public StopFilter(TokenStream in,
                  Set stopWords) {
    this(in, stopWords, false);
}
Constructs a filter which removes words from the input
TokenStream that are named in the Set. |
public StopFilter(TokenStream in,
                  String[] stopWords,
                  boolean ignoreCase) {
    super(in);
    this.stopWords = (CharArraySet)makeStopSet(stopWords, ignoreCase);
}
Constructs a filter which removes words from the input
TokenStream that are named in the array of words. |
public StopFilter(TokenStream input,
                  Set stopWords,
                  boolean ignoreCase) {
    super(input);
    if (stopWords instanceof CharArraySet) {
        this.stopWords = (CharArraySet)stopWords;
    } else {
        this.stopWords = new CharArraySet(stopWords.size(), ignoreCase);
        this.stopWords.addAll(stopWords);
    }
}
Construct a token stream filtering the given input.
If stopWords is an instance of CharArraySet (true if
makeStopSet() was used to construct the set) it will be directly used
and ignoreCase will be ignored since CharArraySet
directly controls case sensitivity.
If stopWords is not an instance of CharArraySet,
a new CharArraySet will be constructed and ignoreCase will be
used to specify the case sensitivity of that set.
Parameters:
input - The token stream to filter.
stopWords - The set of stop words.
ignoreCase - Ignore case when stopping.
|
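The interaction between the stopWords argument and ignoreCase can be easy to misread, so here is a minimal, standard-library-only sketch of the decision described above. It does not use Lucene: CaseAwareSet is a hypothetical stand-in for CharArraySet, and resolve and filter are illustrative helper names that do not exist in the library. The sketch assumes only what the constructor shows: a set that already controls its own case handling is used as-is, and any other Set is copied into a new case-aware set built with the given ignoreCase flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopSetSketch {

    /** Hypothetical stand-in for CharArraySet: the set itself fixes case handling. */
    static final class CaseAwareSet extends HashSet<String> {
        final boolean ignoreCase;

        CaseAwareSet(boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
        }

        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }

        @Override
        public boolean add(String s) {
            return super.add(normalize(s));
        }

        @Override
        public boolean contains(Object o) {
            return o instanceof String && super.contains(normalize((String) o));
        }
    }

    /** Reuse a case-aware set unchanged; otherwise copy into a new one using ignoreCase. */
    static CaseAwareSet resolve(Set<String> stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            return (CaseAwareSet) stopWords; // ignoreCase is not consulted on this path
        }
        CaseAwareSet copy = new CaseAwareSet(ignoreCase);
        copy.addAll(stopWords); // addAll routes through add(), so entries are normalized
        return copy;
    }

    /** Keep only the tokens that are not in the stop set. */
    static List<String> filter(List<String> tokens, CaseAwareSet stops) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stops.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "quick", "fox", "and", "the", "dog");
        Set<String> plain = new HashSet<>(Arrays.asList("the", "and"));

        // Plain Set: ignoreCase decides how the copied set matches.
        System.out.println(filter(tokens, resolve(plain, true)));  // [quick, fox, dog]
        System.out.println(filter(tokens, resolve(plain, false))); // [The, quick, fox, dog]

        // Already case-aware (case-sensitive) set: passing true has no effect.
        CaseAwareSet strict = new CaseAwareSet(false);
        strict.addAll(plain);
        System.out.println(filter(tokens, resolve(strict, true))); // [The, quick, fox, dog]
    }
}
```

The last call shows the practical consequence: when the caller supplies a set that already fixes its case sensitivity, the flag passed alongside it is silently ignored, so the case behavior should be chosen when that set is built.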