What is the best way to use Apache Solr for relational JOIN? - Quora
Why not run a one-off script that does the join against the Solr index, replacing the authorID with the author name, or adding the name as a new field if you need both? Updating 10,000,000+ documents will take a while, but it's certainly doable. After that, new documents can be indexed with both fields as well, with the join done one document at a time at index time. In general, Solr works best with completely denormalized data.
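The one-off script described above could look roughly like the sketch below. It only shows the in-memory join step; fetching documents from Solr and posting them back is left out. The field name authorName, the "name" key on author records, and the helper names denormalize and batches are assumptions made for illustration, not anything from the original setup.

```python
def denormalize(docs, authors_by_id, keep_id=True):
    """Return copies of docs with an authorName field joined in.

    docs          -- list of dicts, each with an "authorID" field
    authors_by_id -- dict mapping author id -> author record (assumed shape)
    keep_id       -- keep authorID next to the name, or replace it
    """
    out = []
    for doc in docs:
        new_doc = dict(doc)  # copy so the input is left untouched
        author = authors_by_id.get(doc.get("authorID"))
        if author is not None:
            new_doc["authorName"] = author["name"]  # hypothetical field name
            if not keep_id:
                new_doc.pop("authorID", None)
        out.append(new_doc)
    return out


def batches(items, size):
    """Yield fixed-size slices so millions of docs can be re-indexed in chunks."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    authors = {"a1": {"name": "Jane Doe"}}
    docs = [
        {"id": "d1", "authorID": "a1"},
        {"id": "d2", "authorID": "missing"},
    ]
    for chunk in batches(denormalize(docs, authors), 1):
        print(chunk)
```

Documents whose author cannot be found are passed through unchanged, so a rerun can pick them up later. The same denormalize call can be applied to each new document before it is indexed.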
EDIT:
Okay, if it's more than just authors' names, you could still denormalize the data, repeating each document once for every related author document. However, that's an ugly solution, and Solr 4 does have "JOIN" capability. You'd first need, at a minimum, a pipeline that indexes author data into Solr in real time. Then yes, I'd use Elasticsearch to easily handle the complex Solr queries involving JOIN-like operations.
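For reference, the Solr 4 join is expressed through the join query parser, written as {!join from=... to=...} in front of an inner query. A minimal sketch of building such a request follows. It assumes author documents and regular documents live in the same core, that author documents are keyed by id, and that regular documents carry authorID; the core name docs, the name field, and both helper functions are hypothetical.

```python
from urllib.parse import urlencode


def author_join_params(author_query, from_field="id", to_field="authorID", rows=10):
    """Build select parameters for a Solr join.

    The inner query (author_query) matches author documents; the join then
    returns documents whose to_field equals the from_field of those matches.
    """
    join_query = "{!join from=%s to=%s}%s" % (from_field, to_field, author_query)
    return {"q": join_query, "rows": rows}


def select_url(base_url, params):
    """Append URL-encoded parameters to a core's /select handler."""
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    params = author_join_params("name:Smith")
    # Base URL and core name are placeholders for a local test instance.
    print(select_url("http://localhost:8983/solr/docs", params))
```

Note that this join behaves like a filter: the results are the documents on the "to" side only, and fields from the author documents are not returned with them, which is one reason denormalizing is often the simpler route.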