Increased validation and serialisation performance by NaqGuug · Pull Request #145 · python-scim/scim2-models

NaqGuug · 2026-05-21T19:22:18Z

Performance improvements

Caching

The main reason model validation and serialisation used to be very slow was that the same information was recomputed on every validation and serialisation call. To avoid this, commonly used field metadata is now cached in a class variable when a model class is created. Normalising attribute names alone dominated the processing time, which is not that surprising because the regex pattern was being called millions of times.
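A minimal sketch of the caching idea, not the actual code from this PR: the normalised-name lookup is built once per class in `__init_subclass__` and stored in a class variable, so the regex runs once per field instead of once per field per call. `CachedModel`, `_normalized_fields`, `field_for` and the regex itself are assumptions made for illustration.

```python
import re
from typing import ClassVar, Dict, Optional

# Assumed normalisation rule: drop non-alphanumeric characters, then lowercase.
_NON_ALNUM = re.compile(r"[\W_]+")


def normalize(name: str) -> str:
    return _NON_ALNUM.sub("", name).lower()


class CachedModel:
    # Maps a normalised name to the Python field name; filled once per subclass.
    _normalized_fields: ClassVar[Dict[str, str]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._normalized_fields = {
            normalize(name): name for name in getattr(cls, "__annotations__", {})
        }

    @classmethod
    def field_for(cls, external_name: str) -> Optional[str]:
        # Only the incoming name is normalised here; field names are precomputed.
        return cls._normalized_fields.get(normalize(external_name))


class User(CachedModel):
    user_name: str
    display_name: str
```

With this layout, `User.field_for("userName")` resolves to `"user_name"` through a dictionary lookup rather than by normalising every field name again.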

I'm actually wondering whether the name normalisation is needed at all, as RFC 7643 §2.1 is quite lenient about accepted attribute names. Just calling lower() could be enough, as "user-name", "user_name" and "username" are considered different names. Of course we still want to convert Python snake_case variables to camelCase automatically, so we need to think about what kind of normalisation is actually needed.
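To make the trade-off concrete, here is a small sketch of the lenient variant, assuming a case-insensitive comparison is all that is required. `to_camel` and `normalize_lenient` are hypothetical helpers, not functions from scim2-models; both are memoised with `functools.lru_cache` so repeated names cost a single dictionary lookup.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def to_camel(name: str) -> str:
    """Convert a Python snake_case field name to a camelCase attribute name."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


@lru_cache(maxsize=None)
def normalize_lenient(name: str) -> str:
    """Case-insensitive comparison key that keeps '-' and '_' significant."""
    return name.lower()
```

Under this scheme `normalize_lenient("user_name")` and `normalize_lenient("username")` stay distinct, while `userName` and `USERNAME` compare equal.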

Validation

Every context validator was its own validator, and each one ran for ALL fields. This is simply a waste of time, so I collapsed the whole context validation into one model validator. The cached values are used here as well, which works quite nicely.
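The shape of a single-pass context check might look like the sketch below. It uses plain dictionaries instead of Pydantic, and `Mutability`, `FIELD_MUTABILITY`, `REQUIRED_FIELDS` and `validate_in_context` are local stand-ins invented for illustration, so treat it as an outline of the idea rather than the code in base.py.

```python
from enum import Enum


class Context(Enum):
    DEFAULT = "default"
    RESOURCE_CREATION_REQUEST = "resource_creation_request"


class Mutability(Enum):
    READ_WRITE = "readWrite"
    READ_ONLY = "readOnly"


# Hypothetical per-class cache, built once when the model class is created.
FIELD_MUTABILITY = {
    "id": Mutability.READ_ONLY,
    "user_name": Mutability.READ_WRITE,
}
REQUIRED_FIELDS = frozenset({"user_name"})


def validate_in_context(values: dict, context: Context) -> dict:
    """Apply every context rule in one pass instead of one validator per rule."""
    if context is Context.DEFAULT:
        return dict(values)  # no context-specific rules to apply

    result = {}
    for name, value in values.items():
        # In this sketch, read-only values sent by a client are ignored.
        if FIELD_MUTABILITY.get(name) is Mutability.READ_ONLY:
            continue
        result[name] = value

    missing = REQUIRED_FIELDS.difference(result)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return result
```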

Serialisation

Same story for serialisation: I collapsed the whole serialisation process into a single model serialiser. I also removed model_serializer_exclude_none completely, as we want Pydantic to exclude the None fields for us. The new serialiser only checks deletion for the specific fields that could be None after Pydantic's exclusion. As mentioned before, this is where caching really comes into play.
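As a rough illustration of why precomputed field sets help here, the sketch below drops never-returned fields by iterating over a cached set rather than over every field. The dict comprehension only stands in for Pydantic's own None exclusion, and `Returned`, `FIELD_RETURNED`, `NEVER_RETURNED` and `serialize_response` are hypothetical names.

```python
from enum import Enum


class Returned(Enum):
    ALWAYS = "always"
    DEFAULT = "default"
    NEVER = "never"


# Hypothetical static metadata for a model's fields.
FIELD_RETURNED = {
    "id": Returned.ALWAYS,
    "user_name": Returned.DEFAULT,
    "password": Returned.NEVER,
}

# Computed once, so serialisation only touches the fields that matter.
NEVER_RETURNED = frozenset(
    name for name, returned in FIELD_RETURNED.items() if returned is Returned.NEVER
)


def serialize_response(values: dict) -> dict:
    # Stand-in for Pydantic excluding None values before the serialiser runs.
    dumped = {name: value for name, value in values.items() if value is not None}
    for name in NEVER_RETURNED:
        dumped.pop(name, None)
    return dumped
```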

Fixes/Misc

One fix concerns checking the replace constraints of extensions. Previously extensions were skipped by this check, so I added tests and fixed the code. Now, calling replace() recursively checks both complex attributes and extensions.
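The recursive check can be pictured with the following sketch over nested dictionaries, where an extension is simply another nested mapping. It assumes the constraint being enforced is "an immutable value that is already set must not change"; `IMMUTABLE_FIELDS`, `badge_id` and `check_replace` are invented for the example and do not come from scim2-models.

```python
# Hypothetical set of immutable field names, for illustration only.
IMMUTABLE_FIELDS = frozenset({"badge_id"})


def check_replace(original: dict, replacement: dict, path: str = "") -> list:
    """Return the dotted paths of immutable fields whose value would change."""
    errors = []
    for name, new_value in replacement.items():
        old_value = original.get(name)
        here = f"{path}.{name}" if path else name
        if isinstance(new_value, dict) and isinstance(old_value, dict):
            # Complex attributes and extensions are checked the same way.
            errors.extend(check_replace(old_value, new_value, here))
        elif name in IMMUTABLE_FIELDS and old_value is not None and new_value != old_value:
            errors.append(here)
    return errors
```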

Overall I refactored and simplified the whole of base.py. There are still improvements left, mainly caching the values from get_field_annotation, get_field_root_type and get_field_multiplicity, as those are completely static metadata and calling these functions is surprisingly expensive. We could also cache the immutable fields, always-returned fields, never-returned fields, etc., so that during validation/serialisation we never have to loop through all the fields, only the ones that actually matter. However, I didn't include these in this PR, as there is already a lot to review.
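One possible way to cache such static metadata is to memoise the lookup on the `(model, field name)` pair with `functools.cache`. The sketch below is a simplified stand-in written for this example; `field_root_type` and the toy `User`/`Email` classes are assumptions and do not reproduce the real `get_field_root_type`.

```python
from functools import cache
from typing import Optional, Union, get_args, get_origin, get_type_hints


class Email:
    value: Optional[str] = None


class User:
    user_name: Optional[str] = None
    emails: Optional[list[Email]] = None


@cache
def field_root_type(model: type, field_name: str) -> type:
    """Strip Optional[...] and list[...] wrappers from a field annotation."""
    annotation = get_type_hints(model)[field_name]
    while get_origin(annotation) in (Union, list):
        # Take the first argument that is not NoneType and keep unwrapping.
        annotation = next(
            arg for arg in get_args(annotation) if arg is not type(None)
        )
    return annotation
```

Because classes and strings are hashable, repeated calls such as `field_root_type(User, "emails")` are answered from the cache after the first one.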

Script used for performance checking

import json

from pyinstrument import Profiler
from scim2_models import Context, User

# Number of serialise/validate round trips to profile.
REPETITIONS = 10000


def main():
    # Load the full User representation from RFC 7643 §8.2 and parse it once.
    with open("rfc7643-8.2-user-full.json") as user_file:
        user_dict = json.load(user_file)
    scim_user = User.model_validate(user_dict)

    scim2_profiler = Profiler()
    scim2_profiler.start()

    # Select the context to profile by swapping the commented lines.
    context = Context.DEFAULT
    # context = Context.RESOURCE_CREATION_REQUEST
    # context = Context.RESOURCE_QUERY_RESPONSE
    for _ in range(REPETITIONS):
        # One round trip: dump the model, then validate the dumped payload.
        User.model_validate(
            scim_user.model_dump(scim_ctx=context),
            scim_ctx=context,
        )

    scim2_profiler.stop()
    # Write the profile as an HTML report.
    scim2_profiler.write_html("scim2-models.html")


if __name__ == "__main__":
    main()

Results

DEFAULT

Speedup: ~4x

[Before and after profiler screenshots]

CREATION REQUEST

Speedup: ~2x

[Before and after profiler screenshots]

QUERY RESPONSE

Speedup: ~2x

[Before and after profiler screenshots]

azmeuk · 2026-05-22T21:00:18Z

Hello. Thank you for your contributions. I am quite busy currently but I will try to review your patches in the coming weeks.

NaqGuug added 13 commits May 12, 2026 19:38

perf: Cache common values for BaseModel

ca41993

Updated attribute urns get/set

perf: Simplify normalize_attribute_names

fc1c21f

Cache normalized names with lru_cache

perf: Collapse all context validators to one

1dd169e

perf: Collapse serialization to single one

13fae66

perf: Move model_dump to BaseModel

1849598

This allows us to delete the dict comprehension from `scim_serializer` while preserving Pydantic's None exclusion

fix: Full test coverage

f6a2451

Mainly removed unused checks

fix: Pass formatting

a648833

Simplified `_set_complex_attribute_urns` even more

tests: Serialization and validation tests

767c924

Test model serialization and validation with extensions

fix: Exclude extension if all fields are None

24ca797

fix: Check extension replace constraints

2fabee8

Added `extensions` to lookup table. In `_apply_replace_constraints` we just loop through complex attributes and extensions for deep replace check

refactor: Use extension map in scim_serializer

cc17651

chore: Pass code style

e934f82

docs: Updated changelog

d215968

NaqGuug wants to merge 13 commits into python-scim:main from NaqGuug:perf/validation-and-serialization-performance
