Skip to content

Data model

apischema handle every classes/types you need.

By the way, it's done in an additive way, meaning that it doesn't affect your types.

PEP 585

With Python 3.9 and PEP 585, typing is substantially shaken up; all container types of typing module are now deprecated.

apischema fully support 3.9 and PEP 585, as shown in the different examples. However, typing containers can still be used, especially/necessarily when using an older version.

Dataclasses

Because the library aims to bring the minimum boilerplate, it's build on the top of standard library. Dataclasses are thus the core structure of the data model.

Dataclasses bring the possibility of field customization, with more than just a default value. In addition to the common parameters of dataclasses.field, customization is done with metadata parameter; metadata can also be passed using PEP 593 typing.Annotated.

With some teasing of features presented later:

from dataclasses import dataclass, field
from typing import Annotated

from apischema import alias, schema
from apischema.metadata import required


@dataclass
class Foo:
    bar: int = field(
        default=0,
        metadata=alias("foo_bar") | schema(title="foo! bar!", min=0, max=42) | required,
    )
    baz: Annotated[
        int, alias("foo_baz"), schema(title="foo! baz!", min=0, max=32), required
    ] = 0
    # pipe `|` operator can also be used in Annotated

Note

Field's metadata are just an ordinary dict; apischema provides some functions to enrich these metadata with it's own keys (alias("foo_bar) is roughly equivalent to `{"_apischema_alias": "foo_bar"}) and use them when the time comes, but metadata are not reserved to apischema and other keys can be added.

Because PEP 584 is painfully missing before Python 3.9, apischema metadata use their own subclass of dict just to add | operator for convenience in all Python versions.

Dataclasses __post_init__ and field(init=False) are fully supported. Implication of this feature usage is documented in the relative sections.

Warning

Before 3.8, InitVar is doing type erasure, that's why it's not possible for apischema to retrieve type information of init variables. To fix this behavior, a field metadata init_var can be used to put back the type of the field (init_var also accepts stringified type annotations).

Dataclass-like types (attrs/SQLAlchemy/etc.) can also get support with a few lines of code, see next section

Standard library types

apischema handle natively most of the types provided by the standard library. They are sorted in the following categories:

Primitive

str, int, float, bool, None, subclasses of them

They correspond to JSON primitive types.

Collection

  • collection.abc.Collection (typing.Collection)
  • collection.abc.Sequence (typing.Sequence)
  • tuple (typing.Tuple)
  • collection.abc.MutableSequence (typing.MutableSequence)
  • list (typing.List)
  • collection.abc.Set (typing.AbstractSet)
  • collection.abc.MutableSet (typing.MutableSet)
  • frozenset (typing.FrozenSet)
  • set (typing.Set)

They correspond to JSON array and are serialized to list.

Some of them are abstract; deserialization will instantiate a concrete child class. For example collection.abc.Sequence will be instantiated with tuple while collection.MutableSequence will be instantiated with list.

Mapping

  • collection.abc.Mapping (typing.Mapping)
  • collection.abc.MutableMapping (typing.MutableMapping)
  • dict (typing.Dict)

They correpond to JSON object and are serialized to dict.

Enumeration

enum.Enum subclasses, typing.Literal

For Enum, this is the value and not the attribute name that is serialized

Typing facilities

  • typing.Optional/typing.Union (Optional[T] is strictly equivalent to Union[T, None])

: Deserialization select the first matching alternative (see below how to skip some union member)

  • tuple (typing.Tuple)

: Can be used as collection as well as true tuple, like tuple[str, int]

  • typing.NewType

: Serialized according to its base type

  • typing.NamedTuple

: Handled as an object type, roughly like a dataclass; fields metadata can be passed using Annotated

  • typing.TypedDict

: Hanlded as an object type, but with a dictionary shape; fields metadata can be passed using Annotated

  • typing.Any

: Untouched by deserialization

Other standard library types

  • bytes

: with str (de)serialization using base64 encoding

  • datetime.datetime
  • datetime.date
  • datetime.time

: Supported only in 3.7+ with fromisoformat/isoformat

  • Decimal

: With float (de)serialization

  • ipaddress.IPv4Address
  • ipaddress.IPv4Interface
  • ipaddress.IPv4Network
  • ipaddress.IPv6Address
  • ipaddress.IPv6Interface
  • ipaddress.IPv6Network
  • pathlib.Path
  • re.Pattern (typing.Pattern)
  • uuid.UUID

: With str (de)serialization

Generic

typing.Generic can be used out of the box like in the following example:

from dataclasses import dataclass
from typing import Generic, TypeVar

from pytest import raises

from apischema import ValidationError, deserialize

T = TypeVar("T")


@dataclass
class Box(Generic[T]):
    content: T


assert deserialize(Box[str], {"content": "void"}) == Box("void")
with raises(ValidationError):
    deserialize(Box[str], {"content": 42})

Recursive types, string annotations and PEP 563

Warning

To resolve annotations, apischema uses typing.get_type_hints; this doesn't work really well when used on objects defined outside of global scope.

Warning (minor)

Currently, PEP 585 can have surprising behavior when used outside the box, see bpo-41370

null vs undefined

Contrary to Javascript, Python doesn't have an undefined equivalent (if we consider None to be null equivalent). But it can be useful to distinguish (especially when thinkinn about HTTP PATCH method) between a null field and an undefined/absent field.

That's why apischema provides an Undefined constant (a single instance of UndefinedType class) which can be used as a default value everywhere where this distinction is needed. In fact, default values are used when field are absent, thus a default Undefined will mark the field as absent.

Dataclass/NamedTuple fields are ignored by serialization when Undefined.

from dataclasses import dataclass
from typing import Union

from apischema import Undefined, UndefinedType, deserialize, serialize
from apischema.json_schema import deserialization_schema


@dataclass
class Foo:
    bar: Union[int, UndefinedType] = Undefined
    baz: Union[int, UndefinedType, None] = Undefined


assert deserialize(Foo, {"bar": 0, "baz": None}) == Foo(0, None)
assert deserialize(Foo, {}) == Foo(Undefined, Undefined)
assert serialize(Foo, Foo(Undefined, 42)) == {"baz": 42}
# Foo.bar and Foo.baz are not required
assert deserialization_schema(Foo) == {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "type": "object",
    "properties": {"bar": {"type": "integer"}, "baz": {"type": ["integer", "null"]}},
    "additionalProperties": False,
}

Note

UndefinedType must only be used inside an Union, as it has no sense as a standalone type. By the way, no suitable name was found to shorten Union[T, UndefinedType] but propositions are welcomed.

Note

Undefined is a falsy constant, i.e. bool(Undefined) is False.

Annotated - PEP 593

PEP 593 is fully supported; annotations stranger to apischema are simply ignored.

Skip Union member

Sometimes, one of the Union members has not to be taken in account during validation; it can just be here to match the type of the default value of the field. This member can be marked as to be skipped with PEP 593 Annotated and apischema.skip.Skip

from dataclasses import dataclass
from typing import Annotated, Union

from pytest import raises

from apischema import ValidationError, deserialize, serialize
from apischema.skip import Skip


@dataclass
class Foo:
    bar: Union[int, Annotated[None, Skip]] = None


with raises(ValidationError) as err:
    deserialize(Foo, {"bar": None})
assert serialize(err.value) == [
    {"loc": ["bar"], "err": ["expected type integer, found null"]}
]

Optional vs. NotNull

Optional type is not always appropriate, because it allows deserialized value to be null, but sometimes, you just want None as a default value for unset fields, not an authorized one.

That's why apischema defines a NotNull type; in fact, NotNull = Union[T, Annotated[None, Skip]].

from dataclasses import dataclass

from pytest import raises

from apischema import ValidationError, deserialize, serialize
from apischema.skip import NotNull


@dataclass
class Foo:
    # NotNull is exactly like Optional for type checkers,
    # it's only interpreted differently by apischema
    bar: NotNull[int] = None


with raises(ValidationError) as err:
    deserialize(Foo, {"bar": None})
assert serialize(err.value) == [
    {"loc": ["bar"], "err": ["expected type integer, found null"]}
]

Note

You can also use Undefined, but it can be more convenient to directly manipulate an Optional field, especially in the rest of the code unrelated to (de)serialization.

Custom types

apischema can support almost all of your custom types in a few lines of code, using conversion feature. However, it also provides a simple and direct way to support dataclass-like types, as presented below.

Otherwise, when apischema encounters a type that it doesn't support, apischema.Unsupported exception will be raised.

Dataclass-like types, aka object types

Internally, apischema handle standard object types — dataclasses, named tuple and typed dictionary — the same way by mapping them to a set of apischema.objects.ObjectField, which has the following definition:

@dataclass(frozen=True)
class ObjectField:
    name: str  # field's name
    type: Any  # field's type
    required: bool = True  # if the field is required
    metadata: Mapping[str, Any] = field(default_factory=dict)  # field's metadata 
    default: InitVar[Any] = ...  # field's default value
    default_factory: Optional[Callable[[], Any]] = None  # field's default factory
    kind: FieldKind = FieldKind.NORMAL  # NORMAL/READ_ONLY/WRITE_ONLY

Thus, support of dataclass-like types (attrs, SQLAlchemy traditional mappers, etc.) can be achieved by mapping the concerned class to its own list of ObjectFields; this is done using apischema.objects.set_object_fields.

from apischema import deserialize, serialize
from apischema.json_schema import deserialization_schema
from apischema.objects import ObjectField, set_object_fields


class Foo:
    def __init__(self, bar):
        self.bar = bar


set_object_fields(Foo, [ObjectField("bar", int)])
# Fields can also be passed in a factory
set_object_fields(Foo, lambda: [ObjectField("bar", int)])

foo = deserialize(Foo, {"bar": 0})
assert isinstance(foo, Foo) and foo.bar == 0
assert serialize(Foo, Foo(0)) == {"bar": 0}
assert deserialization_schema(Foo) == {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "type": "object",
    "properties": {"bar": {"type": "integer"}},
    "required": ["bar"],
    "additionalProperties": False,
}

Another way to set object fields is to directly modify apischema default behavior, using apischema.settings.default_object_fields.

from collections.abc import Sequence
from typing import Optional
from apischema import settings
from apischema.objects import ObjectField

previous_default_object_fields = settings.default_object_field


def default_object_fields(cls) -> Optional[Sequence[ObjectField]]:
    return [...] if ... else previous_default_object_fields(cls)


settings.default_object_fields = default_object_fields

Note

Almost every default behavior of apischema can be customized using apischema.settings.

Examples of SQLAlchemy support and attrs support illustrate both methods (which could also be combined).

Skip field

Dataclass fields can be excluded from apischema processing by using apischema.metadata.skip in the field metadata

from dataclasses import dataclass, field

from apischema.json_schema import deserialization_schema
from apischema.metadata import skip


@dataclass
class Foo:
    bar: int
    baz: str = field(default="baz", metadata=skip)


assert deserialization_schema(Foo) == {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "type": "object",
    "properties": {"bar": {"type": "integer"}},
    "required": ["bar"],
    "additionalProperties": False,
}   

Composition over inheritance - composed dataclasses flattening

Dataclass fields which are themselves dataclass can be "flattened" into the owning one by using flattened metadata. Then, when the class will be (de)serialized, "flattened" fields will be (de)serialized at the same level than the owning class.

from dataclasses import dataclass, field
from typing import Union

from apischema import Undefined, UndefinedType, alias, deserialize, serialize
from apischema.fields import with_fields_set
from apischema.json_schema import deserialization_schema
from apischema.metadata import flattened


@dataclass
class JsonSchema:
    title: Union[str, UndefinedType] = Undefined
    description: Union[str, UndefinedType] = Undefined
    format: Union[str, UndefinedType] = Undefined
    ...


@with_fields_set
@dataclass
class RootJsonSchema:
    schema: Union[str, UndefinedType] = field(
        default=Undefined, metadata=alias("$schema")
    )
    defs: list[JsonSchema] = field(default_factory=list, metadata=alias("$defs"))
    # This field schema is flattened inside the owning one
    json_schema: JsonSchema = field(default=JsonSchema(), metadata=flattened)


data = {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "title": "flattened example",
}
root_schema = RootJsonSchema(
    schema="http://json-schema.org/draft/2019-09/schema#",
    json_schema=JsonSchema(title="flattened example"),
)
assert deserialize(RootJsonSchema, data) == root_schema
assert serialize(RootJsonSchema, root_schema) == data
assert deserialization_schema(RootJsonSchema) == {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "$defs": {
        "JsonSchema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "description": {"type": "string"},
                "format": {"type": "string"},
            },
            "additionalProperties": False,
        }
    },
    "type": "object",
    # It results in allOf + unevaluatedProperties=False
    "allOf": [
        # RootJsonSchema (without JsonSchema)
        {
            "type": "object",
            "properties": {
                "$schema": {"type": "string"},
                "$defs": {
                    "type": "array",
                    "items": {"$ref": "#/$defs/JsonSchema"},
                    "default": [],
                },
            },
            "additionalProperties": False,
        },
        # JonsSchema
        {"$ref": "#/$defs/JsonSchema"},
    ],
    "unevaluatedProperties": False,
}

Note

This feature use JSON schema draft 2019-09 unevaluatedProperties keyword. However, this keyword is removed when JSON schema is converted in a version that doesn't support it, like OpenAPI 3.0.

This feature is very convenient for building model by composing smaller components. If some kind of reuse could also be achieved with inheritance, it can be less practical when it comes to use it in code, because there is no easy way to build an inherited class when you have an instance of the super class ; you have to copy all the fields by hand. On the other hand, using composition (of flattened fields), it's easy to instantiate the class when the smaller component is just a field of it.

FAQ

Why Iterable is not handled with other collection type?

Iterable could be handled (actually, it was at the beginning), however, this doesn't really make sense from a data point of view. Iterable are computation objects, they can be infinite, etc. They don't correspond to a serialized data; Collection is way more appropriate in this context.

What happens if I override dataclass __init__?

apischema always assumes that dataclass __init__ can be called with all its fields as kwargs parameters. If that's no more the case after a modification of __init__ (what means if an exception is thrown when the constructor is called because of bad parameters), apischema treats then the class as not supported.