What are the best practices for combining analyzers in Lucene? - Stack Overflow
Lucene provides the org.apache.lucene.analysis.Analyzer base class, which you can extend if you want to write your own Analyzer. You can check out the org.apache.lucene.analysis.standard.StandardAnalyzer class, which extends Analyzer. Then, in YourAnalyzer, you'll chain StandardAnalyzer and SnowballAnalyzer by using the filters those analyzers use, like this:
TokenStream result = new StandardFilter(tokenStream);
result = new StopFilter(result, stopSet);
result = new SnowballFilter(result, "English");
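Put together, a custom Analyzer along these lines might look like the sketch below. This assumes the pre-4.0 Lucene API (where Analyzer exposes a tokenStream(String, Reader) method); constructor signatures changed across versions, e.g. 3.x adds a Version argument to several of these classes, so adjust for your release:

```java
import java.io.Reader;
import java.util.Set;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.snowball.SnowballFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Sketch of the custom Analyzer described above (pre-4.0 API).
public class YourAnalyzer extends Analyzer {
    private final Set<?> stopSet = StopAnalyzer.ENGLISH_STOP_WORDS_SET;

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        // The same chain StandardAnalyzer builds...
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopSet);
        // ...plus the stemming step SnowballAnalyzer adds.
        return new SnowballFilter(result, "English");
    }
}
```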
Then, in your existing code, you'll be able to construct IndexWriter with your own Analyzer implementation that chains Standard and Snowball filters. Totally off-topic:
I suppose you'll eventually need to set up your own custom way of handling requests. That is already implemented inside Solr.
First, write your own search component by extending SearchComponent and defining it in solrconfig.xml, like this:
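The configuration snippet the answer refers to did not survive in this copy; the registration would look roughly like the following in solrconfig.xml. The component name, class name, and handler path here are placeholders, not values from the original answer:

```xml
<!-- Hypothetical registration of a custom SearchComponent -->
<searchComponent name="myComponent" class="com.example.MySearchComponent"/>

<!-- Attach it to a request handler's component chain -->
<requestHandler name="/mysearch" class="solr.SearchHandler">
  <arr name="components">
    <str>query</str>
    <str>myComponent</str>
  </arr>
</requestHandler>
```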
The SnowballAnalyzer provided by Lucene already uses the StandardTokenizer, StandardFilter, LowerCaseFilter, StopFilter, and SnowballFilter. So it sounds like it does exactly what you want (everything StandardAnalyzer does, plus the snowball stemming).
If it didn't, you could build your own analyzer pretty easily by combining whatever tokenizers and TokenStreams you wish.
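Using the stock SnowballAnalyzer directly might look like the sketch below. This again assumes the pre-4.0 contrib API, where SnowballAnalyzer takes the stemmer name as a string; the index path is a hypothetical example:

```java
import java.io.File;

import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class SnowballIndexing {
    public static void main(String[] args) throws Exception {
        // SnowballAnalyzer already chains StandardTokenizer, StandardFilter,
        // LowerCaseFilter, StopFilter (when stop words are supplied) and
        // SnowballFilter, so no custom Analyzer is needed.
        SnowballAnalyzer analyzer = new SnowballAnalyzer("English");
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/tmp/index")),  // hypothetical path
                analyzer,
                IndexWriter.MaxFieldLength.UNLIMITED);
        // ... add documents here ...
        writer.close();
    }
}
```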