Improve the overall speed and memory use of libCellML#1369
Open
agarny wants to merge 5 commits intocellml:mainfrom
Open
Improve the overall speed and memory use of libCellML#1369agarny wants to merge 5 commits intocellml:mainfrom
agarny wants to merge 5 commits intocellml:mainfrom
Conversation
Note that the test is disabled since it performs repeated parse/analyse/generate loops, which take several minutes to complete. So, it should be enabled only when needed, and not as part of our regular test suite.
... rather than the generic areEquivalentVariables() method which doesn't include caching.
Indeed, in our WebAssembly module, the key was 32-bit (while 64-bit in C++/Python) which caused collisions and incorrect cache hits. So, we replaced the Cantor-pairing key and std::map-based cache with a typed VariableKeyPair and std::unordered_map. This makes the cached-equivalence lookup more explicit, portable, and efficient.
This enforces an ordered, deterministic container for identifier lists and consolidates the type via the IdList typedef.
There was a problem hiding this comment.
Pull request overview
This PR focuses on performance/memory optimisations across libCellML (per #1368), largely by reducing copies, replacing linear searches with hash-based lookups, and removing some regex usage in hot paths.
Changes:
- Reduce copying by returning
const std::string &/const std::vector<...> &from many getters and by tightening lambda captures. - Improve algorithmic complexity in several areas (e.g.,
unordered_set/unordered_mapfor uniqueness checks/caches; pre-reserving vectors/strings; caching equivalence lookups). - Replace some regex-based string processing with simpler/faster implementations (XML error formatting, Printer math normalisation, Generator version string handling).
Reviewed changes
Copilot reviewed 54 out of 55 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/generator/generator.cpp | Adds a large-model repeated-run test block (currently commented out). |
| tests/bindings/python/test_generator.py | Adds a very-large-model test block but disables it via a triple-quoted string literal. |
| tests/bindings/javascript/generator.test.js | Adds a very-large-model test block (currently commented out) including manual delete calls. |
| src/xmlnode.h | Adds XmlNode::rawName() accessor for faster element-name checks. |
| src/xmlnode.cpp | Optimises element/attribute checks; implements rawName(); simplifies attribute access and element namespace comparison. |
| src/xmldoc.cpp | Replaces regex-based newline normalisation with std::replace. |
| src/xmlattribute.cpp | Speeds up namespace comparisons by using libxml namespace pointers directly and handling null namespaces explicitly. |
| src/variable.cpp | Minor capture/perf tweaks; changes some getters to return const std::string &. |
| src/validator.cpp | Reworks many uniqueness checks to hash sets; optimises MathML element support checks; refactors unit-equivalence logic. |
| src/utilities.h | Changes analyserVariables(AnalyserVariablePtr) to return a const reference. |
| src/utilities.cpp | Adds reserve() in helpers; switches some operations to lower-overhead implementations; updates analyser-variable collection helpers. |
| src/units.cpp | Minor lambda capture improvements; avoids extra copies in unitAttributes and setters. |
| src/reset.cpp | Changes string getters to return const std::string &. |
| src/printer.cpp | Removes regex use; optimises connection serialisation and whitespace cleanup; reduces repeated string copies; dedupes imports more efficiently. |
| src/parser.cpp | Avoids unnecessary copies when iterating namespace maps. |
| src/namedentity.cpp | Changes name() to return const std::string &. |
| src/model.cpp | Minor lambda capture improvements; uses a set to avoid duplicate import requirement URLs. |
| src/issue.cpp | Changes description() to return const std::string &. |
| src/internaltypes.h | Switches several internal container typedefs to unordered variants (and changes IdList type). |
| src/importsource.cpp | Changes url() to return const std::string &. |
| src/importer.cpp | Uses find/emplace patterns for maps/sets; reduces repeated lookups/copies; optimises name-collision checks. |
| src/importedentity.cpp | Changes importReference() to return const std::string &. |
| src/generatorvariabletracker.cpp | Reduces temporary vectors and reserves before inserting. |
| src/generatorprofile.cpp | Changes many profile string getters to return const std::string &. |
| src/generator.cpp | Reduces string churn via addCode, reserve(), fewer temporaries; replaces regex semver tweak with manual scan; uses hash sets for equation bookkeeping. |
| src/generator_p.h | Updates generator internals to use unordered_set parameters and replaces newline helper with addCode. |
| src/entity.cpp | Changes id() to return const std::string &. |
| src/componententity.cpp | Minor lambda capture improvements; changes encapsulationId() to return const std::string &. |
| src/component.cpp | Minor lambda capture improvements; changes math() to return const std::string &. |
| src/commonutils.cpp | Simplifies owningModel() traversal to walk parents directly. |
| src/CMakeLists.txt | Adds CODE_COVERAGE_ENABLED compile definition when coverage is enabled. |
| src/api/libcellml/variable.h | Updates public API getters to return const std::string & for initial value and interface type. |
| src/api/libcellml/reset.h | Updates public API reset getters to return const std::string &. |
| src/api/libcellml/namedentity.h | Updates public API name() to return const std::string &. |
| src/api/libcellml/issue.h | Updates public API description() to return const std::string &. |
| src/api/libcellml/importsource.h | Updates public API url() to return const std::string &. |
| src/api/libcellml/importedentity.h | Updates public API importReference() to return const std::string &. |
| src/api/libcellml/entity.h | Updates public API id() to return const std::string &. |
| src/api/libcellml/componententity.h | Updates public API encapsulationId() to return const std::string &. |
| src/api/libcellml/component.h | Updates public API math() to return const std::string &. |
| src/api/libcellml/analysermodel.h | Updates public API analyser-model vector getters to return const std::vector<...> &. |
| src/api/libcellml/analyserexternalvariable.h | Updates public API dependencies() to return const std::vector<...> &. |
| src/api/libcellml/analyserequationast.h | Updates public API value() to return const std::string &. |
| src/api/libcellml/analyserequation.h | Updates public API analyser-equation vector getters to return const std::vector<...> &. |
| src/annotator.cpp | Avoids repeated owningComponent() calls; fixes switch to use already-computed type. |
| src/analyservariable.cpp | Reserves result size when expanding weak equation references. |
| src/analysermodel.cpp | Returns vectors by const ref with static empties on invalid; adds caches for analyser-variable lookup and variable equivalence results. |
| src/analysermodel_p.h | Replaces map-based cache with unordered_map keyed by pointer-pairs + hash; adds analyser-variable cache. |
| src/analyserexternalvariable.cpp | Tightens lambda captures; returns dependencies by const reference. |
| src/analyserequationast.cpp | Returns AST value by const reference. |
| src/analyserequation.cpp | Reserves when expanding weak refs; returns several vectors by const reference. |
| src/analyser.cpp | Adds internal pointer caches/sets; swaps MathML element checks to rawName()+strcmp path for speed; reduces duplicate scanning in multiple places. |
| src/analyser_p.h | Adds set/cache members and switches several internal maps to unordered variants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1368.