
Conversions – (de)serialization customization

apischema covers the majority of standard data types, but that's rarely enough, which is why it lets you add support for your own classes and for the libraries you use.

Actually, apischema itself uses this conversion feature to provide basic support for standard library data types like UUID/datetime/etc. (see std_types.py).

ORM support can easily be achieved with this feature (see SQLAlchemy example).

In fact, you can even add support for competitor libraries like Pydantic (see the Pydantic compatibility example).

Principle - apischema conversions

An apischema conversion is composed of a source type, let's call it Source, a target type Target and a converter function with signature (Source) -> Target.

When a class (actually, a non-builtin class, so not int/list/etc.) is deserialized, apischema checks whether a conversion is registered with this type as its target. If one is found, the source type of the conversion is deserialized, then the converter is applied to get an object of the expected type. Serialization works the same way, but inverted: look for a conversion with the type as source, apply the converter, and get the target type.

Conversions are also handled in schema generation: for a deserialization schema, source schema is merged to target schema, while target schema is merged to source schema for a serialization schema.

Register a conversion

Conversions are registered using apischema.deserializer/apischema.serializer for deserialization/serialization respectively.

When used as a function decorator, the Source/Target types are extracted directly from the conversion function signature.

serializer can be called on methods/properties, in which case the Source type is inferred to be the owning type.

from dataclasses import dataclass

from apischema import deserialize, schema, serialize
from apischema.conversions import deserializer, serializer
from apischema.json_schema import deserialization_schema, serialization_schema


@schema(pattern=r"^#[0-9a-fA-F]{6}$")
@dataclass
class RGB:
    red: int
    green: int
    blue: int

    @serializer
    @property
    def hexa(self) -> str:
        return f"#{self.red:02x}{self.green:02x}{self.blue:02x}"


# serializer can also be called with methods/properties outside of the class
# For example, `serializer(RGB.hexa)` would have the same effect as the decorator above


@deserializer
def from_hexa(hexa: str) -> RGB:
    return RGB(int(hexa[1:3], 16), int(hexa[3:5], 16), int(hexa[5:7], 16))


assert deserialize(RGB, "#000000") == RGB(0, 0, 0)
assert serialize(RGB, RGB(0, 0, 42)) == "#00002a"
assert (
    deserialization_schema(RGB)
    == serialization_schema(RGB)
    == {
        "$schema": "http://json-schema.org/draft/2020-12/schema#",
        "type": "string",
        "pattern": "^#[0-9a-fA-F]{6}$",
    }
)

Warning

(De)serializer methods cannot be used with typing.NamedTuple; this is because apischema relies on the __set_name__ magic method, which is not called on NamedTuple subclass fields.

Multiple deserializers

Sometimes, you want several ways to deserialize a type. While it's possible to register a deserializer with a Union parameter, it's not very practical. That's why apischema makes it possible to register several deserializers for the same type. They are handled as a Union source type (ordered by deserializer registration), with the right deserializer selected according to the matching alternative.

from dataclasses import dataclass

from apischema import deserialize, deserializer
from apischema.json_schema import deserialization_schema


@dataclass
class Expression:
    value: int


@deserializer
def evaluate_expression(expr: str) -> Expression:
    return Expression(int(eval(expr)))


# Could be shortened to deserializer(Expression), because the class is callable too
@deserializer
def expression_from_value(value: int) -> Expression:
    return Expression(value)


assert deserialization_schema(Expression) == {
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
    "type": ["string", "integer"],
}
assert deserialize(Expression, 0) == deserialize(Expression, "1 - 1") == Expression(0)

On the other hand, serializer registration overwrites the previous registration if any.

apischema.conversions.reset_deserializers/apischema.conversions.reset_serializers can be used to reset (de)serializers (even those of the standard types embedded in apischema).

Inheritance

All serializers are naturally inherited. In fact, with a conversion function (Source) -> Target, you can always pass a subtype of Source and get a Target in return.

Moreover, when serializer is a method/property, overriding this method/property in a subclass will override the inherited serializer.

from apischema import serialize, serializer


class Foo:
    pass


@serializer
def serialize_foo(foo: Foo) -> int:
    return 0


class Foo2(Foo):
    pass


# Serializer is inherited
assert serialize(Foo, Foo()) == serialize(Foo2, Foo2()) == 0


class Bar:
    @serializer
    def serialize(self) -> int:
        return 0


class Bar2(Bar):
    def serialize(self) -> int:
        return 1


# Serializer is inherited and overridden
assert serialize(Bar, Bar()) == 0 != serialize(Bar2, Bar2()) == 1

Note

Inheritance can also be toggled off in specific cases, like in the Class as union of its subclasses example.

On the other hand, deserializers cannot be inherited, because the same Source passed to a conversion function (Source) -> Target will always give the same Target (not ensured to be the desired subtype).

Note

Pseudo-inheritance could be achieved by registering a conversion (using for example a classmethod) for each subclass in the __init_subclass__ method (or a metaclass), or by using __subclasses__; see example.

Generic conversions

Generic conversions are supported out of the box.

from typing import Generic, TypeVar

import pytest

from apischema import ValidationError, deserialize, serialize
from apischema.conversions import deserializer, serializer
from apischema.json_schema import deserialization_schema, serialization_schema

T = TypeVar("T")


class Wrapper(Generic[T]):
    def __init__(self, wrapped: T):
        self.wrapped = wrapped

    @serializer
    def unwrap(self) -> T:
        return self.wrapped


# Wrapper constructor can be used as a function too (so deserializer could work as decorator)
deserializer(Wrapper)


assert deserialize(Wrapper[list[int]], [0, 1]).wrapped == [0, 1]
with pytest.raises(ValidationError):
    deserialize(Wrapper[int], "wrapped")
assert serialize(Wrapper[str], Wrapper("wrapped")) == "wrapped"
assert (
    deserialization_schema(Wrapper[int])
    == {"$schema": "http://json-schema.org/draft/2020-12/schema#", "type": "integer"}
    == serialization_schema(Wrapper[int])
)

However, you're not allowed to register a conversion of a specialized generic type, like Foo[int].

Conversion object

In the previous examples, conversions were registered using only converter functions. However, it can also be done by passing an apischema.conversions.Conversion instance. It allows specifying additional conversion metadata (see the next sections for examples) and making the converter's source/target explicit when annotations are not available.

from base64 import b64decode

from apischema import deserialize, deserializer
from apischema.conversions import Conversion

deserializer(Conversion(b64decode, source=str, target=bytes))
# Roughly equivalent to:
# def decode_bytes(source: str) -> bytes:
#     return b64decode(source)
# but saving a function call

assert deserialize(bytes, "Zm9v") == b"foo"

Dynamic conversions — select conversions at runtime

Whether or not a conversion is registered for a given type, conversions can also be provided at runtime, using the conversion parameter of deserialize/serialize/deserialization_schema/serialization_schema.

import os
import time
from dataclasses import dataclass
from datetime import datetime
from typing import Annotated

from apischema import deserialize, serialize
from apischema.metadata import conversion

# Set UTC timezone for example
os.environ["TZ"] = "UTC"
time.tzset()


def datetime_from_timestamp(timestamp: int) -> datetime:
    return datetime.fromtimestamp(timestamp)


date = datetime(2017, 9, 2)
assert deserialize(datetime, 1504310400, conversion=datetime_from_timestamp) == date


@dataclass
class Foo:
    bar: int
    baz: int

    def sum(self) -> int:
        return self.bar + self.baz

    @property
    def diff(self) -> int:
        return int(self.bar - self.baz)


assert serialize(Foo, Foo(0, 1)) == {"bar": 0, "baz": 1}
assert serialize(Foo, Foo(0, 1), conversion=Foo.sum) == 1
assert serialize(Foo, Foo(0, 1), conversion=Foo.diff) == -1
# conversions can be specified using Annotated
assert serialize(Annotated[Foo, conversion(serialization=Foo.sum)], Foo(0, 1)) == 1

Note

For definitions_schema, conversions can be attached to types by using tuples, for example definitions_schema(serialization=[(list[Foo], foo_to_bar)]).

The conversion parameter can also take a tuple of conversions, when you have a Union or a tuple, or when you want several deserializations for the same type.

Dynamic conversions are local

Dynamic conversions are discarded after having been applied (or after a class without a matching conversion has been encountered). For example, you can't directly apply a dynamic conversion to a dataclass field when calling serialize on an instance of that dataclass. The reasons for this design are detailed in the FAQ.

import os
import time
from dataclasses import dataclass
from datetime import datetime

from apischema import serialize

# Set UTC timezone for example
os.environ["TZ"] = "UTC"
time.tzset()


def to_timestamp(d: datetime) -> int:
    return int(d.timestamp())


@dataclass
class Foo:
    bar: datetime


# timestamp conversion is not applied on Foo field because it's discarded
# when encountering Foo
assert serialize(Foo, Foo(datetime(2019, 10, 13)), conversion=to_timestamp) == {
    "bar": "2019-10-13T00:00:00"
}

# timestamp conversion is applied on every member of list
assert serialize(list[datetime], [datetime(1970, 1, 1)], conversion=to_timestamp) == [0]

Note

Dynamic conversions are not discarded when the encountered type is a container (list, dict, Collection, etc., or Union) or a registered conversion from/to a container; the dynamic conversion can then apply to the container elements.

Dynamic conversions interact with type_name

Dynamic conversions are applied before looking for a ref registered with type_name.

from dataclasses import dataclass

from apischema import type_name
from apischema.json_schema import serialization_schema


@dataclass
class Foo:
    pass


@dataclass
class Bar:
    pass


def foo_to_bar(_: Foo) -> Bar:
    return Bar()


type_name("Bars")(list[Bar])

assert serialization_schema(list[Foo], conversion=foo_to_bar, all_refs=True) == {
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
    "$ref": "#/$defs/Bars",
    "$defs": {
        # Bars is present because `list[Foo]` is dynamically converted to `list[Bar]`
        "Bars": {"type": "array", "items": {"$ref": "#/$defs/Bar"}},
        "Bar": {"type": "object", "additionalProperties": False},
    },
}

Bypass registered conversion

Using apischema.identity as a dynamic conversion allows you to bypass a registered conversion, i.e. to (de)serialize the given type as it would be if no conversion were registered.

from dataclasses import dataclass

from apischema import identity, serialize, serializer
from apischema.conversions import Conversion


@dataclass
class RGB:
    red: int
    green: int
    blue: int

    @serializer
    @property
    def hexa(self) -> str:
        return f"#{self.red:02x}{self.green:02x}{self.blue:02x}"


assert serialize(RGB, RGB(0, 0, 0)) == "#000000"
# dynamic conversion used to bypass the registered one
assert serialize(RGB, RGB(0, 0, 0), conversion=identity) == {
    "red": 0,
    "green": 0,
    "blue": 0,
}
# Expanded bypass form
assert serialize(
    RGB, RGB(0, 0, 0), conversion=Conversion(identity, source=RGB, target=RGB)
) == {"red": 0, "green": 0, "blue": 0}

Note

For a more precise selection of the bypassed conversion, for a tuple or Union member for example, it's possible to pass the concerned class as both the source and the target of a Conversion with an identity converter, as shown in the example above.

Liskov substitution principle

LSP is taken into account when applying dynamic conversion: the serializer source can be a subclass of the actual class and the deserializer target can be a superclass of the actual class.

from dataclasses import dataclass

from apischema import deserialize, serialize


@dataclass
class Foo:
    field: int


@dataclass
class Bar(Foo):
    other: str


def foo_to_int(foo: Foo) -> int:
    return foo.field


def bar_from_int(i: int) -> Bar:
    return Bar(i, str(i))


assert serialize(Bar, Bar(0, ""), conversion=foo_to_int) == 0
assert deserialize(Foo, 0, conversion=bar_from_int) == Bar(0, "0")

Generic dynamic conversions

Generic dynamic conversions are supported out of the box. Also, contrary to registered conversions, partially specialized generics are allowed.

from collections.abc import Mapping, Sequence
from operator import itemgetter
from typing import TypeVar

from apischema import serialize
from apischema.json_schema import serialization_schema

T = TypeVar("T")
Priority = int


def sort_by_priority(values_with_priority: Mapping[T, Priority]) -> Sequence[T]:
    return [k for k, _ in sorted(values_with_priority.items(), key=itemgetter(1))]


assert serialize(
    dict[str, Priority], {"a": 1, "b": 0}, conversion=sort_by_priority
) == ["b", "a"]
assert serialization_schema(dict[str, Priority], conversion=sort_by_priority) == {
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
    "type": "array",
    "items": {"type": "string"},
}

Field conversions

It is possible to register a conversion for a particular dataclass field using conversion metadata.

import os
import time
from dataclasses import dataclass, field
from datetime import datetime

from apischema import deserialize, serialize
from apischema.conversions import Conversion
from apischema.metadata import conversion

# Set UTC timezone for example
os.environ["TZ"] = "UTC"
time.tzset()

from_timestamp = Conversion(datetime.fromtimestamp, source=int, target=datetime)


def to_timestamp(d: datetime) -> int:
    return int(d.timestamp())


@dataclass
class Foo:
    some_date: datetime = field(metadata=conversion(from_timestamp, to_timestamp))
    other_date: datetime


assert deserialize(Foo, {"some_date": 0, "other_date": "2019-10-13"}) == Foo(
    datetime(1970, 1, 1), datetime(2019, 10, 13)
)
assert serialize(Foo, Foo(datetime(1970, 1, 1), datetime(2019, 10, 13))) == {
    "some_date": 0,
    "other_date": "2019-10-13T00:00:00",
}

Note

It's possible to pass a conversion only for deserialization or only for serialization.

Serialized method conversions

Serialized methods can also have dedicated conversions for their return

import os
import time
from dataclasses import dataclass
from datetime import datetime

from apischema import serialize, serialized

# Set UTC timezone for example
os.environ["TZ"] = "UTC"
time.tzset()


def to_timestamp(d: datetime) -> int:
    return int(d.timestamp())


@dataclass
class Foo:
    @serialized(conversion=to_timestamp)
    def some_date(self) -> datetime:
        return datetime(1970, 1, 1)


assert serialize(Foo, Foo()) == {"some_date": 0}

Default conversions

As with almost every default behavior in apischema, default conversions can be configured using apischema.settings.deserialization.default_conversion/apischema.settings.serialization.default_conversion. The initial value of these settings is the function that retrieves the conversions registered with deserializer/serializer.

For example, attrs classes can be supported by customizing another default setting, settings.default_object_fields:

from typing import Sequence

import attrs

from apischema import deserialize, serialize, settings
from apischema.json_schema import deserialization_schema
from apischema.objects import ObjectField

prev_default_object_fields = settings.default_object_fields


def attrs_fields(cls: type) -> Sequence[ObjectField] | None:
    if hasattr(cls, "__attrs_attrs__"):
        return [
            ObjectField(
                a.name, a.type, required=a.default == attrs.NOTHING, default=a.default
            )
            for a in getattr(cls, "__attrs_attrs__")
        ]
    else:
        return prev_default_object_fields(cls)


settings.default_object_fields = attrs_fields


@attrs.define
class Foo:
    bar: int


assert deserialize(Foo, {"bar": 0}) == Foo(0)
assert serialize(Foo, Foo(0)) == {"bar": 0}
assert deserialization_schema(Foo) == {
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
    "type": "object",
    "properties": {"bar": {"type": "integer"}},
    "required": ["bar"],
    "additionalProperties": False,
}

apischema functions (deserialize/serialize/deserialization_schema/serialization_schema/definitions_schema) also have a default_conversion parameter to dynamically modify default conversions. See FAQ for the difference between conversion and default_conversion parameters.

Sub-conversions

Sub-conversions are dynamic conversions applied on the result of a conversion.

from dataclasses import dataclass
from typing import Generic, TypeVar

from apischema.conversions import Conversion
from apischema.json_schema import serialization_schema

T = TypeVar("T")


class Query(Generic[T]):
    ...


def query_to_list(q: Query[T]) -> list[T]:
    ...


def query_to_scalar(q: Query[T]) -> T | None:
    ...


@dataclass
class FooModel:
    bar: int


class Foo:
    def serialize(self) -> FooModel:
        ...


assert serialization_schema(
    Query[Foo], conversion=Conversion(query_to_list, sub_conversion=Foo.serialize)
) == {
    # We get an array of Foo
    "type": "array",
    "items": {
        "type": "object",
        "properties": {"bar": {"type": "integer"}},
        "required": ["bar"],
        "additionalProperties": False,
    },
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
}

Sub-conversions can also be used to bypass registered conversions or to define recursive conversions.

Lazy/recursive conversions

Conversions can be defined lazily, i.e. using a function returning Conversion (a single one, or a tuple of them); this function must be wrapped in an apischema.conversions.LazyConversion instance.

It allows creating recursive conversions, or using a conversion object which can be modified after its definition (for example, a conversion for a base class modified by __init_subclass__).

It is used by apischema itself for the generated JSON schema: the schema is indeed recursive data, and the different versions are handled by a conversion with a lazy recursive sub-conversion.

from dataclasses import dataclass

from apischema import serialize
from apischema.conversions import Conversion, LazyConversion


@dataclass
class Foo:
    elements: list["int | Foo"]


def foo_elements(foo: Foo) -> list[int | Foo]:
    return foo.elements


# Recursive conversion pattern
tmp = None
conversion = Conversion(foo_elements, sub_conversion=LazyConversion(lambda: tmp))
tmp = conversion

assert serialize(Foo, Foo([0, Foo([1])]), conversion=conversion) == [0, [1]]
# Without the recursive sub-conversion, it would have been:
assert serialize(Foo, Foo([0, Foo([1])]), conversion=foo_elements) == [
    0,
    {"elements": [1]},
]

Lazy registered conversions

Lazy conversions can also be registered, but the deserialization target/serialization source has to be passed too.

from dataclasses import dataclass

from apischema import deserialize, deserializer, serialize, serializer
from apischema.conversions import Conversion


@dataclass
class Foo:
    bar: int


deserializer(
    lazy=lambda: Conversion(lambda bar: Foo(bar), source=int, target=Foo), target=Foo
)
serializer(
    lazy=lambda: Conversion(lambda foo: foo.bar, source=Foo, target=int), source=Foo
)

assert deserialize(Foo, 0) == Foo(0)
assert serialize(Foo, Foo(0)) == 0

Conversion helpers

String conversions

A common pattern of conversion concerns classes that have a string constructor and a __str__ method, for example standard types uuid.UUID, pathlib.Path, or ipaddress.IPv4Address. Using apischema.conversions.as_str will register a string-deserializer from the constructor and a string-serializer from the __str__ method. ValueError raised by the constructor is caught and converted to ValidationError.

import bson
import pytest

from apischema import Unsupported, deserialize, serialize
from apischema.conversions import as_str

with pytest.raises(Unsupported):
    deserialize(bson.ObjectId, "0123456789ab0123456789ab")
with pytest.raises(Unsupported):
    serialize(bson.ObjectId, bson.ObjectId("0123456789ab0123456789ab"))

as_str(bson.ObjectId)

assert deserialize(bson.ObjectId, "0123456789ab0123456789ab") == bson.ObjectId(
    "0123456789ab0123456789ab"
)
assert (
    serialize(bson.ObjectId, bson.ObjectId("0123456789ab0123456789ab"))
    == "0123456789ab0123456789ab"
)

Note

Previously mentioned standard types are handled by apischema using as_str.

ValueError catching

Converters can be wrapped with apischema.conversions.catch_value_error in order to catch ValueError and reraise it as a ValidationError. It's notably used by as_str and other standard types.

Note

This wrapper is in fact inlined during deserialization, so it performs better than writing the try/except yourself.

Use Enum names

Enum subclasses are (de)serialized using their values. However, you may want to use enumeration names instead; that's why apischema provides apischema.conversions.as_names to decorate Enum subclasses.

from enum import Enum

from apischema import deserialize, serialize
from apischema.conversions import as_names
from apischema.json_schema import deserialization_schema, serialization_schema


@as_names
class MyEnum(Enum):
    FOO = object()
    BAR = object()


assert deserialize(MyEnum, "FOO") == MyEnum.FOO
assert serialize(MyEnum, MyEnum.FOO) == "FOO"
assert (
    deserialization_schema(MyEnum)
    == serialization_schema(MyEnum)
    == {
        "$schema": "http://json-schema.org/draft/2020-12/schema#",
        "type": "string",
        "enum": ["FOO", "BAR"],
    }
)

Class as union of its subclasses

Object deserialization — transform function into a dataclass deserializer

apischema.objects.object_deserialization can convert a function into a new function taking a unique parameter, a dataclass whose fields are mapped from the original function parameters.

It can be used for example to build a deserialization conversion from an alternative constructor.

from apischema import deserialize, deserializer, type_name
from apischema.json_schema import deserialization_schema
from apischema.objects import object_deserialization


def create_range(start: int, stop: int, step: int = 1) -> range:
    return range(start, stop, step)


range_conv = object_deserialization(create_range, type_name("Range"))
# Conversion can be registered
deserializer(range_conv)
assert deserialize(range, {"start": 0, "stop": 10}) == range(0, 10)
assert deserialization_schema(range) == {
    "$schema": "http://json-schema.org/draft/2020-12/schema#",
    "type": "object",
    "properties": {
        "start": {"type": "integer"},
        "stop": {"type": "integer"},
        "step": {"type": "integer", "default": 1},
    },
    "required": ["start", "stop"],
    "additionalProperties": False,
}

Note

Parameters metadata can be specified using typing.Annotated, or be passed with the parameters_metadata parameter, which is a mapping with parameter names as keys and the mapped metadata as values.

Object serialization — select only a subset of fields

apischema.objects.object_serialization can be used to serialize only a subset of an object's fields and methods.

from dataclasses import dataclass
from typing import Any

from apischema import alias, serialize, type_name
from apischema.json_schema import JsonSchemaVersion, definitions_schema
from apischema.objects import get_field, object_serialization


@dataclass
class Data:
    id: int
    content: str

    @property
    def size(self) -> int:
        return len(self.content)

    def get_details(self) -> Any:
        ...


# Serialization fields can be a str/field or a function/method/property
size_only = object_serialization(
    Data, [get_field(Data).id, Data.size], type_name("DataSize")
)
# ["id", Data.size] would also work


def complete_data():
    return [
        ...,  # shortcut to include all the fields
        Data.size,
        (Data.get_details, alias("details")),  # add/override metadata using tuple
    ]


# Serialization fields computation can be deferred in a function
# The serialization name will then be defaulted to the function name
complete = object_serialization(Data, complete_data)

data = Data(0, "data")
assert serialize(Data, data, conversion=size_only) == {"id": 0, "size": 4}
assert serialize(Data, data, conversion=complete) == {
    "id": 0,
    "content": "data",
    "size": 4,
    "details": None,  # because get_details returns None in this example
}


assert definitions_schema(
    serialization=[(Data, size_only), (Data, complete)],
    version=JsonSchemaVersion.OPEN_API_3_0,
) == {
    "DataSize": {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "size": {"type": "integer"}},
        "required": ["id", "size"],
        "additionalProperties": False,
    },
    "CompleteData": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "content": {"type": "string"},
            "size": {"type": "integer"},
            "details": {},
        },
        "required": ["id", "content", "size", "details"],
        "additionalProperties": False,
    },
}

FAQ

What's the difference between conversion and default_conversion parameters?

Dynamic conversions (the conversion parameter) exist to ensure consistency and reuse of the subschemas referenced (with a $ref) in the JSON/OpenAPI schema.

In fact, different global conversions (the default_conversion parameter) could give the same field different schemas depending on the context, so a class could not be referenced consistently. Because dynamic conversions are local, they cannot interfere with an object field's schema.

Schema generation uses the same default conversions for all definitions (which can have associated dynamic conversion).

The default_conversion parameter allows having different (de)serialization contexts, for example to map date to string between frontend and backend, and to timestamp between backend services.