Dedupe individual tracks during import based on stable identifier #6723

Open
tommyschnabel wants to merge 5 commits into
beetbox:master from
tommyschnabel:duplicates_import_actions

@tommyschnabel

Contributor

Description

This PR adds two config options under import, duplicate_track_resolution (bool) and duplicate_track_action (skip, remove, keep/merge, or ask), to enable deduplication of individual tracks during import.

The recommended config for this is:

import:
    duplicate_track_resolution: yes
    duplicate_action: ask          # whole-album duplicates
    duplicate_track_action: skip   # per-track duplicates: fold new tracks into the existing album
    duplicate_keys:
        item: mb_trackid           # match on a stable id (recommended when autotagging)

With the above config, any track that has already been imported is dropped, and if every track of an album has been imported before, the album import itself is skipped. If only part of an album is deduped, the remaining new tracks are folded into the existing album. This makes repeated beet import runs idempotent.

Copied from the new docs, here's what each option does:

  • skip drops the duplicate tracks and adds the remaining new tracks to the
    existing album they belong to, instead of importing them as a separate album.
    Use this to complete a partially-imported album. If every track is already
    present, the whole album is skipped. (If the matching tracks do not all belong
    to a single album, the new tracks are imported as their own album.)
  • remove removes the matching old items from the library before importing.
  • keep (and merge) import the album unchanged.
  • ask prompts you to choose one of the above.
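
The actions listed above can be sketched as a small decision function. This is an illustrative sketch only, not the code in this PR: `Track`, `Resolution`, and `resolve_track_duplicates` are hypothetical names, matching is assumed to be on `mb_trackid`, and `ask` is assumed to have been turned into one of the other actions by a prompt beforehand.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Track:
    """Hypothetical stand-in for a library item."""
    mb_trackid: str
    title: str
    album_id: Optional[int] = None  # None for tracks not (yet) in an album


@dataclass
class Resolution:
    """Hypothetical result: what to import, remove, and where to fold."""
    to_import: List[Track]
    to_remove: List[Track] = field(default_factory=list)
    fold_into_album: Optional[int] = None
    skip_album: bool = False


def resolve_track_duplicates(incoming, library, action):
    # Index the library by the stable identifier used for matching.
    by_id = {t.mb_trackid: t for t in library}
    matches = [by_id[t.mb_trackid] for t in incoming if t.mb_trackid in by_id]

    if not matches or action in ("keep", "merge"):
        # Nothing to dedupe, or the user wants everything imported unchanged.
        return Resolution(to_import=list(incoming))
    if action == "remove":
        # Import everything; the matching old items are removed first.
        return Resolution(to_import=list(incoming), to_remove=matches)
    if action == "skip":
        new = [t for t in incoming if t.mb_trackid not in by_id]
        if not new:
            # Every track is already present: skip the whole album.
            return Resolution(to_import=[], skip_album=True)
        # Fold only when all matches resolve to one existing album.
        album_ids = {m.album_id for m in matches}
        target = album_ids.pop() if len(album_ids) == 1 else None
        return Resolution(to_import=new, fold_into_album=target)
    raise ValueError(f"unresolved action: {action!r}")


library = [Track("a", "One", album_id=1), Track("b", "Two", album_id=1)]
partial = resolve_track_duplicates(
    [Track("a", "One"), Track("c", "Three")], library, "skip"
)
# partial.to_import holds only track "c"; partial.fold_into_album is 1
```

Under these assumptions, re-running the same import after track "c" has been folded in would hit the `skip_album` branch, which is the idempotency described in the config notes.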

Duplicates are detected using the duplicate_keys config option, so set its item key to something stable like mb_trackid.
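
To illustrate why a stable key matters, here is a minimal sketch that compares two ways of keying the same recording. The `duplicate_key` helper, the dict-based items, and the `id-123` value are hypothetical and only for illustration; they are not beets APIs or real identifiers.

```python
def duplicate_key(item: dict, keys: list) -> tuple:
    """Build a comparison key from the named fields of an item (hypothetical helper)."""
    return tuple(item.get(k) for k in keys)


# The same recording, once as stored in the library and once re-imported
# after its title tag was edited. Values are made up for illustration.
in_library = {"artist": "Artist", "title": "Intro", "mb_trackid": "id-123"}
reimported = {"artist": "Artist", "title": "Intro (Remastered)", "mb_trackid": "id-123"}

# Keyed on tag text, the retitled copy looks like a new track.
same_by_tags = duplicate_key(in_library, ["artist", "title"]) == duplicate_key(
    reimported, ["artist", "title"]
)

# Keyed on the stable identifier, it is recognised as a duplicate.
same_by_id = duplicate_key(in_library, ["mb_trackid"]) == duplicate_key(
    reimported, ["mb_trackid"]
)
```

Here `same_by_tags` is `False` and `same_by_id` is `True`, which is why keying on free-text tags can miss re-imports that an identifier-based key catches.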

In the previous PR, @snejus suggested building this on top of the album deduplication that already exists. This implementation is kept separate from the album dedupe logic because the two use cases seemed incompatible.

The album dedupe skips, merges, or removes based on whether an album has already been imported, but the use case I wanted needed a deeper, track-based mechanism. Bolting this onto the album dedupe produced some really weird results, and I wasn't able to reach parity with my previous implementation using the available config options. Keeping it separate is the more flexible approach.

To Do

  • Documentation. (If you've added a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
  • Tests. (Very much encouraged but not strictly required.)

tommyschnabel and others added 3 commits June 8, 2026 15:38
Move the import-time duplicate-track handling out of the `duplicates`
plugin and into importer core, next to the existing duplicate machinery.

Add the `import.duplicate_track_resolution` option (off by default). When
enabled, album imports check each track against the library using
`import.duplicate_keys.item` and resolve matches via the existing
`import.duplicate_action`:

- `skip` drops the already-imported tracks and imports the rest of the
  album (a fully-duplicate album is skipped);
- `remove` removes the matching old library items;
- `keep`/`merge` import everything unchanged;
- `ask` prompts the session.

The check runs before the autotag lookup so dropped tracks are excluded
from the match. The `duplicates` plugin is left untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Match track duplicates against all library items (including album members,
not just singletons), so re-importing catches tracks already present in an
existing album.

When per-track resolution prunes some tracks of an album, suppress the
album-level duplicate check so the remaining new tracks are still imported
instead of the whole (partial) album being skipped.

Add a dedicated `duplicate_track_action` option (inheriting `duplicate_action`
when unset) with a new `fold` action that adds the remaining new tracks to the
existing album they belong to, completing a partially-imported album.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Collapse the separate `fold` track duplicate action into `skip`: skipping
per-track duplicates now folds the remaining new tracks into the existing
album they belong to (falling back to a new album when the matches do not
resolve to a single album). Removes the `fold` action, prompt option, and
`FOLD` resolution.

Show the per-track "Skipping duplicate track" message and the whole-album
"Skipping album, all tracks are duplicates" message in the warning color.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@tommyschnabel tommyschnabel requested a review from a team as a code owner June 8, 2026 21:31
@codecov

codecov Bot commented Jun 8, 2026

Codecov Report

❌ Patch coverage is 75.53191% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.63%. Comparing base (6f8d55d) to head (b79ac79).
✅ All tests successful. No failed tests found.

Files with missing lines                Patch %   Lines
beets/ui/commands/import_/session.py    6.66%     14 Missing ⚠️
beets/importer/tasks.py                 81.48%    2 Missing and 3 partials ⚠️
beets/importer/stages.py                92.00%    2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6723      +/-   ##
==========================================
- Coverage   75.63%   75.63%   -0.01%     
==========================================
  Files         162      162              
  Lines       20820    20903      +83     
  Branches     3298     3322      +24     
==========================================
+ Hits        15747    15809      +62     
- Misses       4292     4310      +18     
- Partials      781      784       +3     
Files with missing lines                Coverage Δ
beets/importer/session.py               94.40% <100.00%> (+0.07%) ⬆️
beets/importer/stages.py                89.44% <92.00%> (+0.69%) ⬆️
beets/importer/tasks.py                 90.70% <81.48%> (-0.26%) ⬇️
beets/ui/commands/import_/session.py    55.14% <6.66%> (-3.19%) ⬇️