java - Using CharFilter with Lucene 4.3.0's StandardAnalyzer - Stack Overflow
The intent is for you to override
The intent is for you to override
Analyzer, rather than StandardAnalyzer. The thinking is that you should never subclass an Analyzer implementation (some discussion of there here). Analyzer implementations are pretty straightforward though, and adding a CharFilter to an Analyzer implementing the same tokenizer/filter chain as StandardAnalyzer would look something like:
The intent is for you to override
Analyzer, rather than StandardAnalyzer. The thinking is that you should never subclass an Analyzer implementation (some discussion of there here). Analyzer implementations are pretty straightforward though, and adding a CharFilter to an Analyzer implementing the same tokenizer/filter chain as StandardAnalyzer would look something like:public final class MyAnalyzer {
@Override
protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
final StandardTokenizer src = new StandardTokenizer(matchVersion, reader);
TokenStream tok = new StandardFilter(matchVersion, src);
tok = new LowerCaseFilter(matchVersion, tok);
tok = new StopFilter(matchVersion, tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
return new TokenStreamComponents(src, tok);
}
@Override
protected Reader initReader(String fieldName, Reader reader) {
//return your CharFilter-wrapped reader here
}
}
Read full article from java - Using CharFilter with Lucene 4.3.0's StandardAnalyzer - Stack Overflow
No comments:
Post a Comment