java - Using CharFilter with Lucene 4.3.0's StandardAnalyzer - Stack Overflow
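The intent is for you to extend Analyzer rather than StandardAnalyzer. The thinking is that you should never subclass a concrete Analyzer implementation (there is some discussion of this here). Analyzer implementations are pretty straightforward, though, and an Analyzer that adds a CharFilter to the same tokenizer/filter chain as StandardAnalyzer would look something like:

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopAnalyzer;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

public final class MyAnalyzer extends Analyzer {
    private final Version matchVersion = Version.LUCENE_43; // not inherited from Analyzer
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Same tokenizer/filter chain as StandardAnalyzer
        final StandardTokenizer src = new StandardTokenizer(matchVersion, reader);
        TokenStream tok = new StandardFilter(matchVersion, src);
        tok = new LowerCaseFilter(matchVersion, tok);
        tok = new StopFilter(matchVersion, tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        return new TokenStreamComponents(src, tok);
    }

    @Override
    protected Reader initReader(String fieldName, Reader reader) {
        // Return your CharFilter-wrapped reader here; the bare reader is only a placeholder
        return reader;
    }
}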
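
To make the reader-wrapping idea in initReader more concrete, here is a minimal sketch that uses only the JDK, so it runs without Lucene on the classpath. It is an analogue, not Lucene's API: HyphenToSpaceReader and InitReaderSketch are hypothetical names, and a plain java.io.FilterReader stands in for a real CharFilter. Lucene's own CharFilter also handles offset correction, which this sketch ignores.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a CharFilter: rewrites '-' to ' ' as characters are read. */
final class HyphenToSpaceReader extends FilterReader {
    HyphenToSpaceReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == '-' ? ' ' : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // At end of stream n is -1, so the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (buf[i] == '-') {
                buf[i] = ' ';
            }
        }
        return n;
    }
}

final class InitReaderSketch {
    /** Same shape as Analyzer.initReader: take the raw reader, hand back a wrapped one. */
    static Reader initReader(String fieldName, Reader reader) {
        return new HyphenToSpaceReader(reader);
    }

    /** Drains a reader into a String, roughly as a tokenizer would consume it. */
    static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The tokenizer would see "foo bar baz qux" instead of the raw input.
        System.out.println(drain(initReader("body", new StringReader("foo-bar baz-qux"))));
    }
}
```

In the Lucene version, the body of initReader would return a real CharFilter wrapped around the reader, and the tokenizer built in createComponents would then see the filtered characters.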