Why can't dataclasses have mutable defaults in their class attributes declaration?

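This seems like something that may have been asked before, but an hour or so of searching yielded no results. "Passing default list argument to dataclasses" looked promising, but it is not quite what I'm looking for.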

Here is the problem: trying to assign a mutable value to a class attribute raises an error:

from dataclasses import dataclass

@dataclass
class Foo:
    bar: list = []

# ValueError: mutable default <class 'list'> for field bar is not allowed: use default_factory

From the error message I gather that I should use the following instead:

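from dataclasses import dataclass, field

@dataclass
class Foo:
    # default_factory is called for each new instance, so every Foo gets its own list
    bar: list = field(default_factory=list)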

But why are mutable defaults not allowed? Is it to enforce avoidance of the mutable default argument problem?


It looks like my question is answered quite clearly in the docs (which, as shmee mentioned, are derived from PEP 557):

Python stores default member variable values in class attributes. Consider this example, not using dataclasses:

class C:
    x = []
    def add(self, element):
        self.x.append(element)

o1 = C()
o2 = C()
o1.add(1)
o2.add(2)
assert o1.x == [1, 2]
assert o1.x is o2.x

Note that the two instances of class C share the same class variable x, as expected.
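
The same sharing shows up with ordinary function defaults, which is the "mutable default argument" problem the question alludes to. A minimal sketch, where the function name `append_to` is made up for illustration:

```python
def append_to(element, target=[]):
    # The default list is created once, when the def statement runs,
    # and is reused by every call that omits `target`.
    target.append(element)
    return target

first = append_to(1)
second = append_to(2)

# Both calls returned the very same list object.
assert first is second
assert first == [1, 2]
```

Passing an explicit list, as in `append_to(3, [])`, sidesteps the shared default, which is why the problem only affects callers that rely on the default value.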

Using dataclasses, if this code was valid:

@dataclass
class D:
    x: list = []      # This code raises ValueError
    def add(self, element):
        self.x.append(element)

it would generate code similar to:

class D:
    x = []
    def __init__(self, x=x):
        # x defaults to the single list stored on the class
        self.x = x
    def add(self, element):
        self.x.append(element)

This has the same issue as the original example using class C. That is, two instances of class D that do not specify a value for x when creating a class instance will share the same copy of x. Because dataclasses just use normal Python class creation they also share this behavior. There is no general way for Data Classes to detect this condition. Instead, dataclasses will raise a ValueError if it detects a default parameter of type list, dict, or set. This is a partial solution, but it does protect against many common errors.
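
As a rough illustration of the points above, the sketch below checks three things: a list default is rejected, `default_factory` gives each instance its own list, and a mutable default of a user-defined type is not caught, which is why the check is only a partial solution. The class names `Shared`, `Safe`, `Bag`, and `Holder` are made up for this example.

```python
from dataclasses import dataclass, field

# 1. A list default is rejected as soon as the decorator processes the class.
try:
    @dataclass
    class Shared:
        x: list = []
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None

# 2. default_factory is called once per instance, so nothing is shared.
@dataclass
class Safe:
    x: list = field(default_factory=list)

a, b = Safe(), Safe()
a.x.append(1)

# 3. Hypothetical user-defined mutable type: it is not a list, dict, or set
#    (and it is hashable), so the check does not flag it.
class Bag:
    def __init__(self):
        self.items = []

@dataclass
class Holder:
    bag: Bag = Bag()

h1, h2 = Holder(), Holder()
h1.bag.items.append("oops")

assert rejected is not None       # the list default raised ValueError
assert a.x == [1] and b.x == []   # independent lists
assert h1.bag is h2.bag           # the Bag default is silently shared
```

For the third case, `field(default_factory=Bag)` would be the way to give every `Holder` its own `Bag`.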