All About Programming: MoreLikeThis - Apache Solr Reference Guide

The MoreLikeThis search component enables users to query for documents similar to a document in their result list. It does this by using terms from the original document to find similar documents in the index.

How MoreLikeThis Works

MoreLikeThis constructs a Lucene query based on terms in a document. It does this by pulling terms from the defined list of fields (see the mlt.fl parameter, below). For best results, the fields should have stored term vectors in schema.xml, for example:

<field name="cat" ... termVectors="true" />

If term vectors are not stored, MoreLikeThis will generate terms from stored fields. A uniqueKey must also be stored in order for MoreLikeThis to work properly.
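
As a rough illustration of the point about term vectors, the sketch below parses a schema and reports which of the intended mlt.fl fields do not declare termVectors="true". The helper name fields_without_term_vectors and the sample schema are hypothetical, and the check only looks at explicit <field> declarations; it assumes that dynamic fields and any defaults set on field types are out of scope.

```python
import xml.etree.ElementTree as ET

def fields_without_term_vectors(schema_xml, mlt_fields):
    """Return the requested fields that do not declare termVectors="true".

    Hypothetical helper: it only inspects explicit <field> elements, so
    undeclared (e.g. dynamic) fields are reported as lacking term vectors.
    """
    root = ET.fromstring(schema_xml)
    # Map each declared field name to whether termVectors is switched on.
    declared = {
        field.get("name"): field.get("termVectors", "false").lower() == "true"
        for field in root.iter("field")
    }
    return [name for name in mlt_fields if not declared.get(name, False)]

# Made-up schema fragment for illustration only.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" />
    <field name="cat" type="text_general" indexed="true" stored="true" termVectors="true" />
    <field name="name" type="text_general" indexed="true" stored="true" />
  </fields>
</schema>
"""

missing = fields_without_term_vectors(SAMPLE_SCHEMA, ["cat", "name"])
print(missing)
```

Fields reported here would still work with MoreLikeThis if they are stored, since terms can be generated from stored fields, but they would not get the benefit of stored term vectors.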

The next phase filters terms from the original document using thresholds defined with the MoreLikeThis parameters. Finally, a query is run with these terms and any other query parameters that have been defined (see the mlt.qf parameter, below), and a new document set is returned.
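
The filtering phase can be pictured with a small sketch. This is not Lucene's implementation: the function select_terms, the max_terms cap, the default values, and the sample document frequencies are all assumptions made for illustration, and terms are ranked here by raw term frequency only, which is a simplification.

```python
import re
from collections import Counter

def select_terms(source_text, doc_freq, min_tf=2, min_df=5, min_wl=0, max_terms=25):
    """Pick candidate query terms from a source document.

    Simplified illustration of threshold filtering, not Lucene's code:
      min_tf  - drop terms occurring fewer times than this in the source
      min_df  - drop terms found in fewer than this many indexed documents
      min_wl  - drop terms shorter than this many characters
    """
    tokens = re.findall(r"\w+", source_text.lower())
    term_freq = Counter(tokens)
    kept = [
        (term, count)
        for term, count in term_freq.items()
        if count >= min_tf
        and doc_freq.get(term, 0) >= min_df
        and len(term) >= min_wl
    ]
    # Rank by term frequency (descending), then alphabetically for stability.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in kept[:max_terms]]

# Made-up source text and document frequencies.
text = "solr search solr index lucene search solr the the the"
doc_freq = {"solr": 12, "search": 40, "lucene": 9, "the": 1000, "index": 3}

terms = select_terms(text, doc_freq, min_tf=2, min_df=5, min_wl=4)
print(terms)
```

In this toy run, "index" and "lucene" fall below the term frequency threshold and "the" is shorter than the minimum word length, so only "solr" and "search" remain to form the query.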

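mlt.count
Specifies the number of similar documents to be returned for each result.
mlt.fl
Specifies the fields to use for similarity. If possible, these should have stored term vectors.
mlt.minwl
Sets the minimum word length below which words will be ignored.
mlt.qf
Specifies query fields and their boosts. These fields must also be specified in mlt.fl.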
mlt.mintf
Specifies the Minimum Term Frequency: terms that occur fewer times than this in the source document are ignored.
mlt.mindf
Specifies the Minimum Document Frequency: words that do not occur in at least this many documents are ignored.

http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10
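
A request like the one above could be assembled programmatically along the following lines. This is a minimal sketch: build_mlt_url is a hypothetical helper, the document id and field names are placeholders, and the host, port, and path simply mirror the example URL rather than any particular deployment.

```python
from urllib.parse import urlencode

def build_mlt_url(base_url, doc_id, mlt_fields, rows=10, **mlt_options):
    """Build a MoreLikeThis request URL shaped like the example above.

    Hypothetical helper: extra keyword arguments are sent as mlt.<name>
    parameters, e.g. mintf=1 becomes mlt.mintf=1.
    """
    params = {
        "qt": "mlt",
        "q": f"id:{doc_id}",
        "mlt.fl": ",".join(mlt_fields),
        "fl": "id",
        "rows": rows,
    }
    for name, value in mlt_options.items():
        params[f"mlt.{name}"] = value
    # urlencode percent-encodes reserved characters such as ':' and ','.
    return f"{base_url}?{urlencode(params)}"

url = build_mlt_url(
    "http://localhost:8080/solr/select/",
    "1234",                # placeholder document id
    ["cat", "name"],       # placeholder field names
    mintf=1,
    mindf=2,
)
print(url)
```

Document ids that contain query syntax characters would likely need escaping or quoting inside the q parameter before being passed in; the sketch does not attempt that.
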
Read full article from MoreLikeThis - Apache Solr Reference Guide - Apache Software Foundation
