{Core} Use multi-threading to build command index to improve performance#32730
{Core} Use multi-threading to build command index to improve performance#32730DanielMicrosoft wants to merge 3 commits intoAzure:devfrom
Conversation
️✔️AzureCLI-FullTest
|
|
Thank you for your contribution! We will review the pull request and get back to you soon. |
️✔️AzureCLI-BreakingChangeTest
|
|
The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR. Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions). pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>
|
There was a problem hiding this comment.
Pull request overview
This PR updates the core command-loading pipeline to load command modules in parallel, aiming to reduce command index rebuild time while preserving command table semantics.
Changes:
- Introduces a threaded module-loading mechanism in
MainCommandsLoader(azure.cli.core.__init__), along withModuleLoadResult, timeout/worker-count configuration, and refactored error-handling and telemetry for module load failures. - Refactors
_load_command_loader/_load_module_command_loader/_load_extension_command_loaderinazure.cli.core.commands.__init__to return thecommand_loaderobject in addition to command and group tables, moving population ofloadersandcmd_to_loader_mapintoMainCommandsLoader. - Adds and updates tests to validate command table integrity, command-to-loader mapping after parallel loading, and to make existing expectations order-insensitive.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
src/azure-cli-core/azure/cli/core/__init__.py |
Adds threaded module loading (_load_modules, _load_single_module) and ModuleLoadResult, refactors load_command_table to use pre-loaded results, maintains cmd_to_loader_map and command_source, and introduces timeout/worker-count constants and top-level completion helpers. |
src/azure-cli-core/azure/cli/core/commands/__init__.py |
Changes _load_command_loader and wrappers to return (command_table, group_table, command_loader), deferring loaders/cmd_to_loader_map updates to MainCommandsLoader. |
src/azure-cli-core/azure/cli/core/tests/test_command_registration.py |
Updates the mocked _load_command_loader to the new 3-tuple signature, adds test_cmd_to_loader_map_populated_after_parallel_loading, and relaxes some assertions to be order-insensitive to accommodate parallel loading. |
src/azure-cli-core/azure/cli/core/tests/test_parser.py |
Updates the local _mock_load_command_loader used for parser tests to return the new 3-tuple, keeping it consistent with the refactored core loader API. |
src/azure-cli-core/azure/cli/core/tests/test_command_table_integrity.py |
Adds a new end-to-end test verifying there are no duplicate commands, core command groups exist, every command has a command_source, and that command and group tables are non-empty after loading via the new pipeline. |
In addition to the inline comment already recorded, note that changing _load_command_loader to return a 3-tuple will break any remaining callers that still unpack only two values (for example, test_help.mock_load_command_loader currently does module_command_table, module_group_table = mock_load_command_loader(...)), so those sites need to be updated to handle the new (command_table, group_table, command_loader) contract.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for future in concurrent.futures.as_completed(future_to_module): | ||
| try: | ||
| result = future.result(timeout=MODULE_LOAD_TIMEOUT_SECONDS) | ||
| results.append(result) | ||
| except concurrent.futures.TimeoutError: | ||
| mod = future_to_module[future] | ||
| logger.warning("Module '%s' load timeout after %s seconds", mod, MODULE_LOAD_TIMEOUT_SECONDS) | ||
| results.append(ModuleLoadResult(mod, {}, {}, 0, | ||
| Exception(f"Module '{mod}' load timeout"))) | ||
| except (ImportError, AttributeError, TypeError, ValueError) as ex: | ||
| mod = future_to_module[future] | ||
| logger.warning("Module '%s' load failed: %s", mod, ex) | ||
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | ||
| except Exception as ex: # pylint: disable=broad-exception-caught | ||
| mod = future_to_module[future] | ||
| logger.warning("Module '%s' load failed with unexpected exception: %s", mod, ex) | ||
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | ||
|
|
There was a problem hiding this comment.
MODULE_LOAD_TIMEOUT_SECONDS is currently ineffective because future.result(timeout=MODULE_LOAD_TIMEOUT_SECONDS) is only called on futures yielded by concurrent.futures.as_completed, which guarantees the future is already complete. That means a hung module load will block inside as_completed indefinitely and never raise TimeoutError, contrary to the timeout comment and constant. To actually enforce a per-module timeout, the timeout needs to be applied at the as_completed/wait level (or by starting a timed wait and then cancelling unfinished futures), rather than on future.result() for already-completed futures.
| for future in concurrent.futures.as_completed(future_to_module): | |
| try: | |
| result = future.result(timeout=MODULE_LOAD_TIMEOUT_SECONDS) | |
| results.append(result) | |
| except concurrent.futures.TimeoutError: | |
| mod = future_to_module[future] | |
| logger.warning("Module '%s' load timeout after %s seconds", mod, MODULE_LOAD_TIMEOUT_SECONDS) | |
| results.append(ModuleLoadResult(mod, {}, {}, 0, | |
| Exception(f"Module '{mod}' load timeout"))) | |
| except (ImportError, AttributeError, TypeError, ValueError) as ex: | |
| mod = future_to_module[future] | |
| logger.warning("Module '%s' load failed: %s", mod, ex) | |
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | |
| except Exception as ex: # pylint: disable=broad-exception-caught | |
| mod = future_to_module[future] | |
| logger.warning("Module '%s' load failed with unexpected exception: %s", mod, ex) | |
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | |
| # Wait for all module loads up to the configured timeout. | |
| done, not_done = concurrent.futures.wait( | |
| future_to_module.keys(), | |
| timeout=MODULE_LOAD_TIMEOUT_SECONDS | |
| ) | |
| # Process all futures that completed within the timeout. | |
| for future in done: | |
| mod = future_to_module[future] | |
| try: | |
| result = future.result() | |
| results.append(result) | |
| except (ImportError, AttributeError, TypeError, ValueError) as ex: | |
| logger.warning("Module '%s' load failed: %s", mod, ex) | |
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | |
| except Exception as ex: # pylint: disable=broad-exception-caught | |
| logger.warning("Module '%s' load failed with unexpected exception: %s", mod, ex) | |
| results.append(ModuleLoadResult(mod, {}, {}, 0, ex)) | |
| # Any futures still not done after the timeout are treated as timed out. | |
| for future in not_done: | |
| mod = future_to_module[future] | |
| # Best effort to stop further work on this module load. | |
| future.cancel() | |
| logger.warning("Module '%s' load timeout after %s seconds", mod, MODULE_LOAD_TIMEOUT_SECONDS) | |
| results.append( | |
| ModuleLoadResult( | |
| mod, | |
| {}, | |
| {}, | |
| 0, | |
| Exception(f"Module '{mod}' load timeout") | |
| ) | |
| ) |
Related command
Description
Version 2 of the previously reverted attempt to thread module loading, when building the command index:
#32518
More info on situations that trigger a build here.
A look at the speed improvements from the changes in this PR here.
Testing Guide
History Notes
[Component Name 1] BREAKING CHANGE:
az command a: Make some customer-facing breaking change[Component Name 2]
az command b: Add some customer-facing featureThis checklist is used to make sure that common guidelines for a pull request are followed.
The PR title and description has followed the guideline in Submitting Pull Requests.
I adhere to the Command Guidelines.
I adhere to the Error Handling Guidelines.