Better support for descriptors at dataclass creation #144749

@MrXerios

Feature or enhancement

Proposal:

Hello, I propose a change to the way dataclass works with fields defined using a descriptor class, particularly the way a default value is assigned.

What is the issue

According to the docs:

To determine whether a field contains a default value, @dataclass will call the descriptor’s __get__() method using its class access form: descriptor.__get__(obj=None, type=cls). If the descriptor returns a value in this case, it will be used as the field’s default. On the other hand, if the descriptor raises AttributeError in this situation, no default value will be provided for the field.
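
To make the documented rule concrete, here is a minimal sketch. The descriptor classes `WithDefault` and `WithoutDefault` and the `Example` dataclass are made up for illustration; only `dataclass`, `fields` and `MISSING` come from the standard library.

```python
from dataclasses import MISSING, dataclass, fields


class WithDefault:
    """Hypothetical descriptor that returns a plain value on class access."""

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return 100  # class access: dataclass records this as the default
        return getattr(instance, self._name)

    def __set__(self, instance, value):
        setattr(instance, self._name, int(value))


class WithoutDefault(WithDefault):
    """Same descriptor, but class access raises AttributeError."""

    def __get__(self, instance, owner=None):
        if instance is None:
            raise AttributeError("no default")  # dataclass: field is required
        return getattr(instance, self._name)


@dataclass
class Example:
    required: int = WithoutDefault()
    optional: int = WithDefault()


# Map each field name to the default that @dataclass recorded for it.
defaults = {f.name: f.default for f in fields(Example)}
```

With these definitions, `defaults["required"]` should be `MISSING` and `defaults["optional"]` should be `100`, so `Example(1)` works while `Example()` raises a `TypeError` for the missing argument.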

This behaviour is a problem for descriptors that need to be referenced on the class itself. A common pattern in that case is to write __get__ so that it returns the descriptor itself whenever it is not accessed through an instance of the class:

class Descriptor:
    ...
    def __get__(self, instance, instance_type=None):
        if instance is None:
            return self
        ...
    ...

In that case, when the dataclass decorator checks for a default value, __get__ returns self, which means that:

  1. The field has a default value, when it probably shouldn't,
  2. The default value is a descriptor instance, which is almost certainly wrong.
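
Both points can be observed with `dataclasses.fields`. The following is a small sketch; the `Tracked` descriptor and the `Point` dataclass are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, fields


class Tracked:
    """Hypothetical descriptor using the 'return self on class access' pattern."""

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class access returns the descriptor itself
        return getattr(instance, self._name)

    def __set__(self, instance, value):
        setattr(instance, self._name, value)


@dataclass
class Point:
    x: int = Tracked()


(field_x,) = fields(Point)

# The recorded default is the descriptor object stored in the class body.
default_is_descriptor = field_x.default is Point.__dict__["x"]

# Because a default exists, Point() is accepted and the descriptor instance
# ends up stored as the value of x.
leaked = Point()
```

Here `default_is_descriptor` should be true, and `leaked.x` should be the `Tracked` instance rather than an `int`.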

Please note that the built-in property is such a descriptor, so this is not an edge case. Here is an example using it:

from dataclasses import dataclass
from operator import attrgetter

@dataclass
class A:
    a: int = property(attrgetter('_a'))

    @a.setter
    def a(self, val):
        self._a = val

Then:

>>> A(1) # expected behaviour
A(a=1)

>>> A() # unexpected behaviour, but no exception is raised
A(a=<property object at 0x0000027C27023BA0>)

What could be changed?

I propose adding a check for whether __get__ returns the descriptor instance itself, so that this instance is not used as a default value. However, I am not well versed in the dataclass creation mechanisms, so I am not aware of the consequences this could have.
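
As a rough illustration of the kind of check I have in mind, here is a sketch written outside of the dataclasses module. The helper `class_level_default` and the `Demo` class are hypothetical, and this is not how dataclasses is currently implemented; it only shows the identity test.

```python
from dataclasses import MISSING
from operator import attrgetter


def class_level_default(cls, name):
    """Hypothetical helper: look up a default, ignoring self-returning descriptors."""
    value = getattr(cls, name, MISSING)
    raw = cls.__dict__.get(name, MISSING)
    # If class access hands back the very object stored in the class body and
    # that object is a descriptor, treat the field as having no default.
    if value is not MISSING and value is raw and hasattr(type(raw), "__get__"):
        return MISSING
    return value


class Demo:
    plain = 3
    prop = property(attrgetter("_prop"))


plain_default = class_level_default(Demo, "plain")  # ordinary value is kept
prop_default = class_level_default(Demo, "prop")    # property is ignored
```

One consequence worth noting: plain Python functions are also descriptors that return themselves on class access, so a check this broad would also discard a function object used as a default. A real change would probably need to decide how to handle that case.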

Is this a breaking change?

Technically, yes. I can't think of any application that relies on this behaviour, and I don't know how to check whether any project on GitHub uses it, but that doesn't change the fact that it breaks backward compatibility.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Labels: stdlib, topic-dataclasses, type-feature
