All About Programming: [SOLR-8619] A new replica should not become leader when all current replicas are down as it leads to data loss

[SOLR-8619] A new replica should not become leader when all current replicas are down as it leads to data loss - ASF JIRA

Here's what I'm talking about:

1. Start a 2-node SolrCloud cluster.
2. Create a collection with 1 shard and 1 replica.
3. Add documents.
4. Shut down the node that hosts the shard's only active replica.
5. Call ADDREPLICA for the shard/collection, so that Solr attempts to add a new replica on the other node.
6. Solr waits for a while, and then this new replica becomes the active leader.
7. Index a few new docs.
8. Bring up the old node.
9. The old replica comes up with its old index and then syncs so that it contains only the docs from the new leader.

All of the old documents are lost in this case.
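
The sequence above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Solr's recovery code: the `Replica` class, the `syncFrom` method, the replica names, and the document ids are all hypothetical, and "sync" is modeled as the recovering replica simply replacing its index with the leader's.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ReplicaDataLossDemo {

    /** Hypothetical toy replica: a name plus the ids of the documents it holds. */
    static final class Replica {
        final String name;
        final Set<String> docs = new LinkedHashSet<>();

        Replica(String name) {
            this.name = name;
        }

        /** Toy recovery: discard the local index and copy the leader's index. */
        void syncFrom(Replica leader) {
            docs.clear();
            docs.addAll(leader.docs);
        }
    }

    /** Replays the reported scenario and returns the ids of the lost documents. */
    static Set<String> runScenario() {
        // Steps 1-3: the only replica of the shard holds the original documents.
        Replica oldReplica = new Replica("node1_replica1");
        oldReplica.docs.add("doc1");
        oldReplica.docs.add("doc2");
        oldReplica.docs.add("doc3");
        Set<String> originalDocs = new LinkedHashSet<>(oldReplica.docs);

        // Steps 4-6: node1 goes down; ADDREPLICA creates an empty replica on
        // node2, which eventually becomes leader with nothing to sync from.
        Replica newLeader = new Replica("node2_replica2");

        // Step 7: new documents are indexed on the new (empty) leader.
        newLeader.docs.add("doc4");
        newLeader.docs.add("doc5");

        // Steps 8-9: the old node returns and syncs to the new leader's index.
        oldReplica.syncFrom(newLeader);

        // Whatever the old replica had that it no longer has is lost.
        Set<String> lost = new LinkedHashSet<>(originalDocs);
        lost.removeAll(oldReplica.docs);
        return lost;
    }

    public static void main(String[] args) {
        // Prints: Lost documents: [doc1, doc2, doc3]
        System.out.println("Lost documents: " + runScenario());
    }
}
```

The point of the model is that the old replica, not the new leader, held the only complete copy of the data, yet it is the one that gets overwritten.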

Here are a few things that might work here:
1. Reject an ADDREPLICA call if all current replicas for the shard are down. Since the new replica cannot sync from anyone, it doesn't make sense for this replica to even come up.
2. The replica shouldn't become active/leader unless it was either the last known leader or active before it went into the recovering state, or unless there are no other replicas in the clusterstate.
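
The two proposals can be sketched as simple guard checks. This is a hypothetical illustration, not Solr code: `State`, `ReplicaInfo`, `allowAddReplica`, and `mayBecomeLeader` are names invented here, and allowing ADDREPLICA on a shard that has no replicas at all is an assumption the proposal does not spell out.

```java
import java.util.List;

public class LeaderEligibilityGuard {

    /** Hypothetical replica states for this sketch. */
    enum State { ACTIVE, RECOVERING, DOWN }

    /** Hypothetical view of a replica as recorded in the cluster state. */
    static final class ReplicaInfo {
        final String name;
        final State state;
        /** True if it was the last known leader, or active before recovering. */
        final boolean wasLeaderOrActive;

        ReplicaInfo(String name, State state, boolean wasLeaderOrActive) {
            this.name = name;
            this.state = state;
            this.wasLeaderOrActive = wasLeaderOrActive;
        }
    }

    /**
     * Proposal 1: reject ADDREPLICA when every existing replica is down,
     * because the new replica would have nothing to sync from.
     * Assumption: a shard with no replicas at all is allowed.
     */
    static boolean allowAddReplica(List<ReplicaInfo> existing) {
        if (existing.isEmpty()) {
            return true;
        }
        for (ReplicaInfo replica : existing) {
            if (replica.state != State.DOWN) {
                return true; // at least one replica is not down
            }
        }
        return false; // all current replicas are down
    }

    /**
     * Proposal 2: a replica may become active/leader only if it was the last
     * known leader or was active before recovering, unless the cluster state
     * lists no other replicas for the shard.
     */
    static boolean mayBecomeLeader(ReplicaInfo candidate, List<ReplicaInfo> others) {
        if (others.isEmpty()) {
            return true;
        }
        return candidate.wasLeaderOrActive;
    }
}
```

Under either check, the scenario above would stop before the empty replica on the second node could take over as leader.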

Read full article from [SOLR-8619] A new replica should not become leader when all current replicas are down as it leads to data loss - ASF JIRA
