Fix missing type changes for equal-comparing list items (#605) #607

Open
Sanjays2402 wants to merge 1 commit into qlustered:dev from Sanjays2402:fix/list-item-type-change

@Sanjays2402

Fixes #605

The bug

DeepDiff([2], [2.0]) reports no difference at all, even though the scalar and dict equivalents both correctly report an int→float type change:

>>> from deepdiff import DeepDiff
>>> DeepDiff(2, 2.0)
{'type_changes': {'root': {'old_type': int, 'new_type': float, 'old_value': 2, 'new_value': 2.0}}}
>>> DeepDiff({'a': 2}, {'a': 2.0})
{'type_changes': {"root['a']": {'old_type': int, 'new_type': float, ...}}}
>>> DeepDiff([2], [2.0])      # ← inconsistent
{}

The same silent miss affects any equal-but-differently-typed pair inside a list, and changes buried among unchanged items:

>>> from decimal import Decimal
>>> DeepDiff([1, 2, 3], [1, 2.0, 3])   # {}  (expected type_change at root[1])
>>> DeepDiff([1], [Decimal(1)])        # {}
>>> DeepDiff([True], [1])              # {}

Root cause

For ordered iterables whose items are all basic-hashable, _diff_iterable_in_order diffs the two sequences with difflib.SequenceMatcher. Because 2 == 2.0 (and hash(2) == hash(2.0)), SequenceMatcher classifies the pair as an 'equal' opcode:

>>> import difflib
>>> difflib.SequenceMatcher(None, [2], [2.0], autojunk=False).get_opcodes()
[('equal', 0, 1, 0, 1)]

The 'equal' branch in _diff_ordered_iterable_by_difflib recorded the opcode and continued — the items were never handed to _diff(), which is the only place the type-change check get_type(t1) != get_type(t2) lives. So the difference vanished. Scalars and dict values go through _diff() directly, which is why they behave correctly.
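The equality-plus-hash coincidence described above can be checked with the standard library alone. The sketch below is illustrative only: `opcode_tags` is a hypothetical helper, not part of DeepDiff, and it shows that each of the problem pairs from this PR collapses into a single 'equal' opcode.

```python
import difflib
from decimal import Decimal


def opcode_tags(old, new):
    """Return just the opcode tags SequenceMatcher produces for two lists."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [tag for tag, *_ in matcher.get_opcodes()]


# Numeric values that compare equal also hash equal, across types.
assert 2 == 2.0 and hash(2) == hash(2.0)
assert 1 == Decimal(1) and hash(1) == hash(Decimal(1))
assert True == 1 and hash(True) == hash(1)

# Every pair below is reported as one 'equal' run, so type information is lost.
for old, new in [([2], [2.0]), ([1], [Decimal(1)]), ([True], [1]), ([1, 2, 3], [1, 2.0, 3])]:
    print(old, new, opcode_tags(old, new))
```

Because SequenceMatcher relies only on `==` and `hash()`, it has no way to tell these items apart, which is why the type check has to happen after the opcodes are produced.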

This is the ordered-diff complement of the ignore_order=True int-vs-float fix already noted in the v9.0.0 changelog.

The fix

In the 'equal' opcode branch, recurse into _diff() for any aligned pair whose types differ:

if tag == 'equal':
    opcodes_with_values.append(Opcode(...))
    # Items in an 'equal' run compare equal, but may still differ in type.
    for index in range(t1_to_index - t1_from_index):
        x = level.t1[t1_from_index + index]
        y = level.t2[t2_from_index + index]
        if get_type(x) != get_type(y):
            # Hand the pair to _diff() so the regular type-change check runs.
            change_level = level.branch_deeper(x, y, ...)
            self._diff(change_level, parents_ids, local_tree=local_tree)
    continue

Routing through _diff() reuses the existing type-change machinery, so every option is honored automatically — no duplicated logic.
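For readers who want to try the idea outside DeepDiff, here is a minimal standalone sketch of the same technique: walk the 'equal' opcodes and compare the types of each aligned pair. It is not the code in this PR. `type_changes_in_equal_runs` is a hypothetical name, and it uses the built-in `type()` where DeepDiff uses its own `get_type()` and reports through `_diff()`.

```python
import difflib
from decimal import Decimal


def type_changes_in_equal_runs(old, new):
    """Find aligned items that compare equal but have different types.

    Returns a list of (old_index, new_index, old_type, new_type) tuples.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            continue  # inserts, deletes and replaces are outside this sketch
        # An 'equal' run has the same length on both sides, so walk it pairwise.
        for offset in range(i2 - i1):
            x, y = old[i1 + offset], new[j1 + offset]
            if type(x) is not type(y):
                found.append((i1 + offset, j1 + offset, type(x), type(y)))
    return found


print(type_changes_in_equal_runs([1, 2, 3], [1, 2.0, 3]))
print(type_changes_in_equal_runs([1], [Decimal(1)]))
print(type_changes_in_equal_runs([2], [2]))
```

Unlike this sketch, the PR does not report the mismatch itself. It delegates to `_diff()`, which is what lets options such as ignore_numeric_type_changes=True keep suppressing the report.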

Before / after

| Input | Before | After |
| --- | --- | --- |
| DeepDiff([2], [2.0]) | {} | type_changes root[0]: int→float |
| DeepDiff([1,2,3], [1,2.0,3]) | {} | type_changes root[1]: int→float |
| DeepDiff([1], [Decimal(1)]) | {} | type_changes root[0]: int→Decimal |
| DeepDiff([2], [2.0], ignore_numeric_type_changes=True) | {} | {} (unchanged, correctly suppressed) |
| DeepDiff([2], [2]) | {} | {} (unchanged, no over-reporting) |

Tests

Added three regression tests in tests/test_diff_text.py (repo style):

  • test_item_type_change_in_list: [2] vs [2.0] reports the change and asserts it matches the scalar result
  • test_item_type_change_in_list_among_unchanged_items: [1,2,3] vs [1,2.0,3]
  • test_item_type_change_in_list_ignored_when_requested: ignore_numeric_type_changes=True still yields {}

Proof they guard the bug: with the source change stashed, the two positive tests FAIL; restored, all three PASS.

Full suite unchanged from baseline: 1251 passed, 44 skipped (the only failures locally are missing-optional-dep / macOS setrlimit env artifacts, identical with and without this change).

DeepDiff([2], [2.0]) returned {} even though the scalar equivalent
DeepDiff(2, 2.0) and the dict equivalent DeepDiff({'a': 2}, {'a': 2.0})
both correctly report an int -> float type_change.

Root cause: for ordered iterables whose items are all basic-hashable,
DeepDiff diffs the two sequences with difflib.SequenceMatcher. Because
2 == 2.0 (and hash(2) == hash(2.0)), SequenceMatcher classifies the pair
as an 'equal' opcode, so the elements are never handed to _diff() -- the
only place the type-change check (get_type(t1) != get_type(t2)) lives.
The whole difference silently disappeared. The same happened for other
equal-but-differently-typed pairs such as [1] vs [Decimal(1)] and
[True] vs [1], and for changes buried among unchanged items, e.g.
[1, 2, 3] vs [1, 2.0, 3].

Fix: in the 'equal' opcode branch of _diff_ordered_iterable_by_difflib,
recurse into _diff() for any aligned pair whose types differ. Routing
through _diff() reuses the existing type-change logic, so all options are
honored automatically -- ignore_numeric_type_changes=True and
ignore_type_in_groups still suppress the report, and genuinely identical
items still produce nothing.

Added regression tests (fail before this change, pass after):
- [2] vs [2.0] reports the int -> float type_change and matches the
  scalar result
- the change is detected among surrounding unchanged items
- ignore_numeric_type_changes=True still yields {}

Full test suite: 1251 passed, unchanged from baseline.

Fixes qlustered#605