Skip to content

Optimize package metadata query in simple page retrieval #1242

@dkliban

Description

@dkliban

Problem

The package metadata query in the PyPI Simple API retrieve method is a significant contributor to database load in production. During peak load on 2026-05-20 at 16:00 UTC, 4 concurrent instances of this query pattern consumed 2.73 avg active sessions on the RDS instance (db.m7g.4xlarge, 16 vCPUs). Total DB load reached 21.9 avg active sessions, exceeding the CPU count.

Source

pulp_python/app/pypi/views.py lines 376-386:

packages = local_packages.annotate(
    repo_added_time=Subquery(repo_added_subquery)
).values(
    "filename",
    "sha256",
    "metadata_sha256",
    "requires_python",
    "size",
    "repo_added_time",
    "version",
)

SQL pattern observed (from RDS Performance Insights)

SELECT "python_pythonpackagecontent"."filename" AS "filename",
       "python_pythonpackagecontent"."sha256" AS "sha256",
       "python_pythonpackagecontent"."metadata_sha256" AS "metadata_sha256",
       "python_pythonpackagecontent"."requires_python" AS "requires_python",
       "python_pythonpackagecontent"."size" AS "size",
       "core_content"."pulp_created" AS "pulp_created",
       "python_pythonpackagecontent"."version" AS "version"
FROM "python_pythonpackagecontent"
INNER JOIN "core_content" ON (...)

The wait event breakdown during the spike was primarily CPU (17.9 avg active sessions), indicating the queries are compute-bound rather than I/O-bound.

Suggestions

  • Add database indexes on commonly filtered/joined columns
  • Consider caching results for frequently-accessed packages
  • Evaluate whether the Subquery annotation for repo_added_time can be optimized (e.g., via a JOIN instead of a correlated subquery)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions