Preprocess is serialized + re-runs full table on every intercept (slow under load) #41

@aural-psynapse

Description

When multiple queries hit the same table at once, each one kicks off a full-table preprocess, and a process-wide lock (_preprocess.py_preprocess_after_insert_lock, run_preprocess polls every 2s) forces those preprocesses to run one at a time on that table.

So concurrent preprocesses on the same table queue single-file. Measured ~2s for one query vs ~30s for the 5th in a concurrent batch on the same table.
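To illustrate the queuing effect, here is a minimal standalone model, not the project's code. `_global_lock`, `fake_preprocess`, and `run_batch` are hypothetical names, the work is a plain `time.sleep`, and the 2s polling in run_preprocess is not modeled, so it only shows the linear growth in latency, not the exact numbers above.

```python
import threading
import time

# Hypothetical stand-in for the process-wide lock described in this issue.
_global_lock = threading.Lock()


def fake_preprocess(work_seconds: float) -> None:
    """Simulate a full-table preprocess that holds the global lock."""
    with _global_lock:
        time.sleep(work_seconds)


def run_batch(n_queries: int, work_seconds: float) -> list[float]:
    """Start n_queries at once; return per-query latencies, sorted ascending."""
    latencies: list[float] = []
    record_lock = threading.Lock()
    start = time.monotonic()

    def worker() -> None:
        fake_preprocess(work_seconds)
        elapsed = time.monotonic() - start
        with record_lock:
            latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(n_queries)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(latencies)


if __name__ == "__main__":
    # Each query waits for every query ahead of it in the queue.
    for i, latency in enumerate(run_batch(5, 0.05), start=1):
        print(f"query {i}: {latency:.2f}s")
```

In this model the k-th query to finish takes at least k times the single-query cost, because all of the work ahead of it runs under the same lock.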

Fix ideas: don't re-preprocess the whole table on every intercept, and/or allow concurrent preprocess on the same table instead of serializing.
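One possible shape for the first idea is to coalesce concurrent requests per table, so queries that arrive while a preprocess is already running share its result instead of queuing their own full run. This is only a sketch under assumptions: `PreprocessCoalescer` and the `preprocess` callable are hypothetical and do not exist in the codebase, and it assumes a preprocess result is safe to share between callers.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class PreprocessCoalescer:
    """Hypothetical helper: at most one preprocess in flight per table."""

    def __init__(self, preprocess: Callable[[str], object]) -> None:
        self._preprocess = preprocess          # assumed signature: table name -> result
        self._lock = threading.Lock()          # guards _in_flight only, never held during work
        self._in_flight: Dict[str, Future] = {}

    def run(self, table: str) -> object:
        with self._lock:
            fut = self._in_flight.get(table)
            leader = fut is None
            if leader:
                fut = Future()
                self._in_flight[table] = fut

        if not leader:
            # A preprocess for this table is already running; wait for it.
            return fut.result()

        try:
            fut.set_result(self._preprocess(table))
        except BaseException as exc:
            # Followers see the same failure as the leader.
            fut.set_exception(exc)
        finally:
            with self._lock:
                self._in_flight.pop(table, None)
        return fut.result()
```

Different tables never block each other here, since the lock only protects the dictionary. The trade-off is that a caller joining mid-run gets a result that may not reflect rows inserted after that run started, so whether this is acceptable depends on what the intercept needs from the preprocess.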
