TIL: How to enforce default values on Python dataclass fields when given None
A few days ago, I was reminded once again that Python type hints don't actually enforce type safety at runtime [1]; they act more as a form of documentation for human readers of your code.
This came up while using a dataclass with fields that are not Optional and have non-null default values:
from dataclasses import dataclass

@dataclass
class Dog:
    name: str = "Spot"
If I create some Dog instances with different constructors:
Dog() # Dog(name='Spot')
Dog("Rex") # Dog(name='Rex')
Dog(None) # Dog(name=None)
I can see that:
- the empty constructor sets the name to the default value
- the (<string>) constructor sets the name to <string>
- the (None) constructor sets the name to None
My problem was with the third one. If I wanted None to be a valid value, I would have defined the field as name: Optional[str] = None. What I want is a non-null field, but Python doesn't make that distinction.
NOTE: Languages like Scala with Option types [2] behave the way I want.
A String can't be set to None:
case class Dog(name: String = "Spot")
Dog().name // Spot: String
Dog("Rex").name // Rex: String
Dog(None).name // Compilation error, "Found: None.type, Required: String"
But an Option[String] can:
case class Dog(name: Option[String] = Some("Spot"))
Dog().name // Some(Spot): Option[String]
Dog(Some("Rex")).name // Some(Rex): Option[String]
Dog(None).name // None: Option[String]
I found a Stack Overflow answer by Jason to this exact problem:
def __post_init__(self):
    # Loop through the fields
    for field in fields(self):
        # If there is a default and the value of the field is none we can assign a value
        if not isinstance(field.default, dataclasses._MISSING_TYPE) and getattr(self, field.name) is None:
            setattr(self, field.name, field.default)
Adding a __post_init__() method to a dataclass lets you run custom logic right after the generated __init__() has set the fields. This implementation looks for fields with default values and assigns those defaults whenever the constructor is given None.
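To see the hook in isolation, here is a minimal sketch with the method placed directly on Dog. The age field is a hypothetical addition, there only to show that explicit values are left alone, and the check uses the public dataclasses.MISSING sentinel rather than the private _MISSING_TYPE from the answer above; the two checks should behave the same way here.

```python
import dataclasses
from dataclasses import dataclass, fields


@dataclass
class Dog:
    name: str = "Spot"
    age: int = 3  # hypothetical second field, for illustration only

    def __post_init__(self):
        # Called automatically once the generated __init__ has set every field.
        for field in fields(self):
            # field.default is the MISSING sentinel when no default was declared.
            has_default = field.default is not dataclasses.MISSING
            if has_default and getattr(self, field.name) is None:
                setattr(self, field.name, field.default)


print(Dog(None, 7))  # Dog(name='Spot', age=7)
```

The None passed for name is swapped for "Spot", while the explicit age of 7 is kept.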
But instead of adding the method to Dog, and copying it into every dataclass where I want non-null fields, I added it to a new ReplaceNone dataclass that I can extend:
from dataclasses import _MISSING_TYPE, dataclass, fields

@dataclass
class ReplaceNone:
    def __post_init__(self):
        # Loop through the fields
        for field in fields(self):
            # If there is a default and the value of the field is none we can assign a value
            if not isinstance(field.default, _MISSING_TYPE) and getattr(self, field.name) is None:
                setattr(self, field.name, field.default)

@dataclass
class Dog(ReplaceNone):
    name: str = "Spot"
Now if I create instances of the updated Dog dataclass:
Dog() # Dog(name='Spot')
Dog("Rex") # Dog(name='Rex')
Dog(None) # Dog(name='Spot')
I can see the third case behaving as intended — the None constructor argument is ignored in favor of the default value.
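One caveat worth sketching: the check only looks at field.default, so a field with no default keeps its None, and a field declared with default_factory is skipped as well. The variant below is my own assumption about how this could be extended, not part of the Stack Overflow answer. It uses the public MISSING sentinel and also falls back to default_factory; the owner and tricks fields are hypothetical.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import List, Optional


@dataclass
class ReplaceNone:
    def __post_init__(self):
        for f in fields(self):
            if getattr(self, f.name) is not None:
                continue  # an explicit non-None value always wins
            if f.default is not MISSING:
                setattr(self, f.name, f.default)
            elif f.default_factory is not MISSING:
                # Assumed extension: build a fresh value from the factory.
                setattr(self, f.name, f.default_factory())
            # No default of either kind: nothing to fall back on, None stays.


@dataclass
class Dog(ReplaceNone):
    owner: Optional[str]  # hypothetical field with no default
    name: str = "Spot"
    tricks: List[str] = field(default_factory=list)  # hypothetical field


print(Dog(None, None, None))  # Dog(owner=None, name='Spot', tricks=[])
```

Because ReplaceNone declares no fields of its own, subclasses are still free to put fields without defaults first.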