Skip to content

Improve the overall speed and memory use of libCellML#1369

Open
agarny wants to merge 5 commits intocellml:mainfrom
agarny:issue1368
Open

Improve the overall speed and memory use of libCellML#1369
agarny wants to merge 5 commits intocellml:mainfrom
agarny:issue1368

Conversation

@agarny
Copy link
Copy Markdown
Contributor

@agarny agarny commented Apr 12, 2026

Fixes #1368.

agarny added 4 commits April 9, 2026 13:54
Note that the test is disabled since it performs repeated parse/analyse/generate loops, which take several minutes to complete. So, it should be enabled only when needed, and not as part of our regular test suite.
... rather than the generic areEquivalentVariables() method which doesn't include caching.
Indeed, in our WebAssembly module, the key was 32-bit (while 64-bit in C++/Python) which caused collisions and incorrect cache hits.

So, we replaced the Cantor-pairing key and std::map-based cache with a typed VariableKeyPair and std::unordered_map. This makes the cached-equivalence lookup more explicit, portable, and efficient.
This enforces an ordered, deterministic container for identifier lists and consolidates the type via the IdList typedef.
Copilot AI review requested due to automatic review settings April 12, 2026 05:38
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on performance/memory optimisations across libCellML (per #1368), largely by reducing copies, replacing linear searches with hash-based lookups, and removing some regex usage in hot paths.

Changes:

  • Reduce copying by returning const std::string & / const std::vector<...> & from many getters and by tightening lambda captures.
  • Improve algorithmic complexity in several areas (e.g., unordered_set/unordered_map for uniqueness checks/caches; pre-reserving vectors/strings; caching equivalence lookups).
  • Replace some regex-based string processing with simpler/faster implementations (XML error formatting, Printer math normalisation, Generator version string handling).

Reviewed changes

Copilot reviewed 54 out of 55 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/generator/generator.cpp Adds a large-model repeated-run test block (currently commented out).
tests/bindings/python/test_generator.py Adds a very-large-model test block but disables it via a triple-quoted string literal.
tests/bindings/javascript/generator.test.js Adds a very-large-model test block (currently commented out) including manual delete calls.
src/xmlnode.h Adds XmlNode::rawName() accessor for faster element-name checks.
src/xmlnode.cpp Optimises element/attribute checks; implements rawName(); simplifies attribute access and element namespace comparison.
src/xmldoc.cpp Replaces regex-based newline normalisation with std::replace.
src/xmlattribute.cpp Speeds up namespace comparisons by using libxml namespace pointers directly and handling null namespaces explicitly.
src/variable.cpp Minor capture/perf tweaks; changes some getters to return const std::string &.
src/validator.cpp Reworks many uniqueness checks to hash sets; optimises MathML element support checks; refactors unit-equivalence logic.
src/utilities.h Changes analyserVariables(AnalyserVariablePtr) to return a const reference.
src/utilities.cpp Adds reserve() in helpers; switches some operations to lower-overhead implementations; updates analyser-variable collection helpers.
src/units.cpp Minor lambda capture improvements; avoids extra copies in unitAttributes and setters.
src/reset.cpp Changes string getters to return const std::string &.
src/printer.cpp Removes regex use; optimises connection serialisation and whitespace cleanup; reduces repeated string copies; dedupes imports more efficiently.
src/parser.cpp Avoids unnecessary copies when iterating namespace maps.
src/namedentity.cpp Changes name() to return const std::string &.
src/model.cpp Minor lambda capture improvements; uses a set to avoid duplicate import requirement URLs.
src/issue.cpp Changes description() to return const std::string &.
src/internaltypes.h Switches several internal container typedefs to unordered variants (and changes IdList type).
src/importsource.cpp Changes url() to return const std::string &.
src/importer.cpp Uses find/emplace patterns for maps/sets; reduces repeated lookups/copies; optimises name-collision checks.
src/importedentity.cpp Changes importReference() to return const std::string &.
src/generatorvariabletracker.cpp Reduces temporary vectors and reserves before inserting.
src/generatorprofile.cpp Changes many profile string getters to return const std::string &.
src/generator.cpp Reduces string churn via addCode, reserve(), fewer temporaries; replaces regex semver tweak with manual scan; uses hash sets for equation bookkeeping.
src/generator_p.h Updates generator internals to use unordered_set parameters and replaces newline helper with addCode.
src/entity.cpp Changes id() to return const std::string &.
src/componententity.cpp Minor lambda capture improvements; changes encapsulationId() to return const std::string &.
src/component.cpp Minor lambda capture improvements; changes math() to return const std::string &.
src/commonutils.cpp Simplifies owningModel() traversal to walk parents directly.
src/CMakeLists.txt Adds CODE_COVERAGE_ENABLED compile definition when coverage is enabled.
src/api/libcellml/variable.h Updates public API getters to return const std::string & for initial value and interface type.
src/api/libcellml/reset.h Updates public API reset getters to return const std::string &.
src/api/libcellml/namedentity.h Updates public API name() to return const std::string &.
src/api/libcellml/issue.h Updates public API description() to return const std::string &.
src/api/libcellml/importsource.h Updates public API url() to return const std::string &.
src/api/libcellml/importedentity.h Updates public API importReference() to return const std::string &.
src/api/libcellml/entity.h Updates public API id() to return const std::string &.
src/api/libcellml/componententity.h Updates public API encapsulationId() to return const std::string &.
src/api/libcellml/component.h Updates public API math() to return const std::string &.
src/api/libcellml/analysermodel.h Updates public API analyser-model vector getters to return const std::vector<...> &.
src/api/libcellml/analyserexternalvariable.h Updates public API dependencies() to return const std::vector<...> &.
src/api/libcellml/analyserequationast.h Updates public API value() to return const std::string &.
src/api/libcellml/analyserequation.h Updates public API analyser-equation vector getters to return const std::vector<...> &.
src/annotator.cpp Avoids repeated owningComponent() calls; fixes switch to use already-computed type.
src/analyservariable.cpp Reserves result size when expanding weak equation references.
src/analysermodel.cpp Returns vectors by const ref with static empties on invalid; adds caches for analyser-variable lookup and variable equivalence results.
src/analysermodel_p.h Replaces map-based cache with unordered_map keyed by pointer-pairs + hash; adds analyser-variable cache.
src/analyserexternalvariable.cpp Tightens lambda captures; returns dependencies by const reference.
src/analyserequationast.cpp Returns AST value by const reference.
src/analyserequation.cpp Reserves when expanding weak refs; returns several vectors by const reference.
src/analyser.cpp Adds internal pointer caches/sets; swaps MathML element checks to rawName()+strcmp path for speed; reduces duplicate scanning in multiple places.
src/analyser_p.h Adds set/cache members and switches several internal maps to unordered variants.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve the overall speed and memory use of libCellML

2 participants