fix: fall back to Cores API for delete-collection on standalone Solr (#33) #37
MohammadYusif wants to merge 1 commit
…alone Solr (apache#33)

Standalone (non-SolrCloud) Solr returns HTTP 500 with "Aliases don't exist in a non-cloud context" when the Collections API's DELETE endpoint tries to inspect ZooKeeper-managed aliases. SolrAdminClient.delete_collection() now detects this response and transparently falls back to the V1 Cores API (UNLOAD action with deleteIndex/deleteDataDir/deleteInstanceDir=true), which works in both standalone and SolrCloud deployments.

A new private helper _delete_core() encapsulates the UNLOAD call and maps 404/400-not-found responses to CollectionNotFoundError for consistency with the existing error contract.

Tests added:
- test_delete_collection_400_could_not_find: verifies 400 path raises CollectionNotFoundError
- test_delete_collection_standalone_falls_back_to_cores_api: verifies the fallback
- test_delete_collection_standalone_cores_api_not_found: verifies error propagation in fallback
- test_delete_collection_other_500_raises: verifies non-cloud-context 500s still raise
|
Thanks for this. No question, the port so far is SolrCloud-focused. Before we commit a fix, perhaps we should discuss whether and how to make sure all workloads can run on both cloud and user-managed. Some Solr features and APIs only work with cloud. Should each workload declare whether it is compatible with standalone? Or do we have auto-fallback solutions for every mismatch? A workload may specify a collection with 2 shards and 2 replicas. What should we do in standalone mode? Better to exit with an error? A workload may benchmark backup/restore, which have different APIs… Another workload may benchmark splitshard… |
|
Client-layer primitives — Workload-level features — multi-shard/replica topology, Concretely I'd suggest something like a So: keep this PR's fallback, it's appropriate here. But the broader policy should be: transparent fallback only where semantics are genuinely equivalent; declared incompatibility + early exit for everything else. That's unambiguous to workload authors and to users trying to understand why a run failed. |
|
Question: How can I reproduce this? Can I run one of the geonames/nyc-taxis workloads against a standalone local solr and see the same? I'd expect those to fall over during configset upload or collection creation, since those are fairly different? |
|
So let me ask a slightly provocative question... SHOULD WE SUPPORT STANDALONE? We are already suffering with this in the main Solr project. And in Solr 10, when you fire up a single node, it's in cloud anyway if you don't set special values. Plus, if you are testing a single node, well, maybe other tools like our https://github.com/apache/solr/tree/main/solr/benchmark might be more useful. The niche I see that solr-orbit really hits is our Solr cloud macro benchmarking. My user base: cloud. Kevin's: cloud. Jan's: My suspicion is that we'll spend a bunch of effort testing standalone to make the code work, but no one will actually use it for anything meaningful. Let's be realistic about our ability to test things, and accept that this project requires Solr cloud. |
|
I'm open to being pragmatic and documenting that Solr Orbit currently only supports Solr Cloud. In the future we could extend it if there is enough demand and someone willing to do the job. |
|
@MohammadYusif Again, thanks for your contribution, but I'm afraid we are focusing on Cloud mode for now. See #43 for a PR that documents this limitation, changes the quickstart example to start Solr 9 in cloud mode, and also adds detection that prints an error when attempting to run against user-managed Solr. This is not to disqualify your code in any way. And it may be that we revisit this decision at a later time. If so, we'll do a thorough analysis of how to fully support user-managed Solr in a good way. Closing |
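For readers wondering what such a detection could look like, here is a rough sketch. It is not the code from #43. It assumes the check reads the `mode` field that Solr's `/solr/admin/info/system` endpoint reports (`solrcloud` for cloud, `std` for user-managed); the names `fetch_system_info`, `require_cloud_mode` and `UnsupportedSolrModeError` are made up for illustration.

```python
import json
from urllib.request import urlopen


class UnsupportedSolrModeError(RuntimeError):
    """Hypothetical error raised when the target Solr is not in cloud mode."""


def fetch_system_info(base_url: str) -> dict:
    # Hypothetical helper: reads the system info endpoint as JSON.
    url = f"{base_url}/solr/admin/info/system?wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def require_cloud_mode(system_info: dict) -> None:
    # Assumption: the response carries "mode": "solrcloud" or "std".
    mode = system_info.get("mode")
    if mode != "solrcloud":
        raise UnsupportedSolrModeError(
            f"Solr reports mode={mode!r}; only SolrCloud is supported."
        )
```

A caller would run `require_cloud_mode(fetch_system_info("http://localhost:8983"))` once before the benchmark starts, so an unsupported target fails early with a clear message rather than midway through a workload.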
|
Likewise @MohammadYusif, I was happy to see your PR pop up! Please do keep an eye on the tickets, and I'd love to review more PRs.
Summary
- delete-collection crashes with HTTP 500 ("Aliases don't exist in a non-cloud context") when targeting standalone Solr, because the Collections API unconditionally checks ZooKeeper-managed aliases
- delete_collection() now detects this specific 500 and transparently falls back to the V1 Cores API, which works in both standalone and SolrCloud deployments

Changes
- osbenchmark/client.py: added non-cloud detection in delete_collection() and a new _delete_core() helper calling /solr/admin/cores?action=UNLOAD with full cleanup flags
- tests/unit/solr/test_client.py: 4 new tests covering the fallback path, missing-core error propagation, the 400 path, and unrelated 500 errors

Fixes #33
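
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in osbenchmark/client.py. It assumes the primary call is the V2 `DELETE /api/collections/{name}` endpoint, that HTTP goes through an injected `send` callable (hypothetical, to keep the sketch testable without a server), and that not-found responses can be recognised by the substrings listed in `_NOT_FOUND_HINTS` (also an assumption).

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: (method, path, params) -> (status_code, body_text)
Transport = Callable[[str, str, Dict[str, str]], Tuple[int, str]]

NON_CLOUD_MARKER = "Aliases don't exist in a non-cloud context"
# Assumption: substrings that indicate a missing collection or core.
_NOT_FOUND_HINTS = ("could not find", "not found", "non-existent")


class CollectionNotFoundError(Exception):
    """The collection (or core) does not exist."""


class SolrAdminError(Exception):
    """Any other admin API failure."""


def _is_not_found(status: int, body: str) -> bool:
    lowered = body.lower()
    return status == 404 or (
        status == 400 and any(hint in lowered for hint in _NOT_FOUND_HINTS)
    )


def delete_core(send: Transport, name: str) -> None:
    # V1 Cores API: UNLOAD with all cleanup flags set.
    params = {
        "action": "UNLOAD",
        "core": name,
        "deleteIndex": "true",
        "deleteDataDir": "true",
        "deleteInstanceDir": "true",
    }
    status, body = send("GET", "/solr/admin/cores", params)
    if status == 200:
        return
    if _is_not_found(status, body):
        raise CollectionNotFoundError(name)
    raise SolrAdminError(f"HTTP {status}: {body}")


def delete_collection(send: Transport, name: str) -> None:
    status, body = send("DELETE", f"/api/collections/{name}", {})
    if status == 200:
        return
    if status == 500 and NON_CLOUD_MARKER in body:
        # Standalone Solr: fall back to the Cores API.
        delete_core(send, name)
        return
    if _is_not_found(status, body):
        raise CollectionNotFoundError(name)
    # Unrelated failures (including other 500s) still raise.
    raise SolrAdminError(f"HTTP {status}: {body}")
```

Matching on the exact alias message keeps the fallback narrow: only the known standalone failure is rerouted, and every other 500 surfaces to the caller unchanged.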