JSON schema

JSON schema generation

A JSON schema can be generated from a data model. However, because of all the possible customizations, the schema can differ between deserialization and serialization. In common cases, deserialization_schema and serialization_schema give the same result.

from dataclasses import dataclass

from apischema.json_schema import deserialization_schema, serialization_schema

@dataclass
class Foo:
    bar: str

assert deserialization_schema(Foo) == serialization_schema(Foo)
assert deserialization_schema(Foo) == {
    "$schema": "",
    "additionalProperties": False,
    "properties": {"bar": {"type": "string"}},
    "required": ["bar"],
    "type": "object",
}

Field alias

Sometimes a dataclass field name clashes with a language keyword, and sometimes the property name is simply not convenient. Fortunately, a field can define an alias, which will be used in the schema and in deserialization/serialization.

from dataclasses import dataclass, field

from apischema import alias, deserialize, serialize
from apischema.json_schema import deserialization_schema

@dataclass
class Foo:
    class_: str = field(metadata=alias("class"))

assert deserialization_schema(Foo) == {
    "$schema": "",
    "additionalProperties": False,
    "properties": {"class": {"type": "string"}},
    "required": ["class"],
    "type": "object",
}
assert deserialize(Foo, {"class": "bar"}) == Foo("bar")
assert serialize(Foo, Foo("bar")) == {"class": "bar"}

Alias all fields

Field aliasing can also be done at class level by specifying an aliasing function. This aliaser is applied to the field alias if one is defined, otherwise to the field name; it is not applied if override=False is specified.

from dataclasses import dataclass, field
from typing import Any

from apischema import alias
from apischema.json_schema import deserialization_schema

@alias(lambda s: f"foo_{s}")
@dataclass
class Foo:
    field1: Any
    field2: Any = field(metadata=alias(override=False))
    field3: Any = field(metadata=alias("field03"))
    field4: Any = field(metadata=alias("field04", override=False))

assert deserialization_schema(Foo) == {
    "$schema": "",
    "additionalProperties": False,
    "properties": {"foo_field1": {}, "field2": {}, "foo_field03": {}, "field04": {}},
    "required": ["foo_field1", "field2", "foo_field03", "field04"],
    "type": "object",
}

Class-level aliasing can be used to define a camelCase API.
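As a rough illustration, an aliasing function of this kind might look like the sketch below. The helper name to_camel_case is hypothetical (it is not part of apischema); it is simply a str -> str function, the shape that the class-level alias decorator above accepts.

```python
def to_camel_case(name: str) -> str:
    """Convert a snake_case name to camelCase (hypothetical helper)."""
    head, *tail = name.split("_")
    # Keep the first word as-is and capitalize every following word.
    return head + "".join(word.capitalize() for word in tail)


print(to_camel_case("billing_address"))  # billingAddress
print(to_camel_case("id"))  # id
```

Under that assumption, decorating a class with alias(to_camel_case) would expose a field named billing_address as billingAddress.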

Dynamic aliasing and default aliaser

apischema operations deserialize/serialize/deserialization_schema/serialization_schema provide an aliaser parameter, which is applied to every field processed by the operation.

Similar to strictness configuration, this parameter has a default value controlled by apischema.settings.aliaser.

It can be used, for example, to make a whole application use camelCase. Actually, there is a shortcut for that:

from apischema import settings

settings.camel_case = True

Otherwise, it's used the same way as settings.coercer.


Dynamic aliaser ignores override=False

Schema annotations

Type annotations are not enough to express a complete schema, but apischema has a function for that; schema can be used both as a type decorator and as field metadata.

from dataclasses import dataclass, field
from typing import NewType

from apischema import schema
from apischema.json_schema import deserialization_schema

Tag = NewType("Tag", str)
schema(min_len=3, pattern=r"^\w*$", examples=["available", "EMEA"])(Tag)

@dataclass
class Resource:
    id: int
    tags: list[Tag] = field(
        default_factory=list,
        metadata=schema(
            description="regroup multiple resources", max_items=3, unique=True
        ),
    )

assert deserialization_schema(Resource) == {
    "$schema": "",
    "additionalProperties": False,
    "properties": {
        "id": {"type": "integer"},
        "tags": {
            "description": "regroup multiple resources",
            "items": {
                "examples": ["available", "EMEA"],
                "minLength": 3,
                "pattern": "^\\w*$",
                "type": "string",
            },
            "maxItems": 3,
            "type": "array",
            "uniqueItems": True,
            "default": [],
        },
    },
    "required": ["id"],
    "type": "object",
}


Schema are particularly useful with NewType. For example, if you use prefixed ids, you can use a NewType with a pattern schema to validate them, and benefit of more precise type checking.
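The idea behind prefixed ids can be sketched in plain Python, without apischema. Everything below is illustrative: UserId, the user_ prefix and parse_user_id are hypothetical names, and the hand-written check only stands in for the validation that a pattern schema attached to the NewType would provide.

```python
import re
from typing import NewType

# Hypothetical prefixed identifier: "user_" followed by digits.
UserId = NewType("UserId", str)
USER_ID_PATTERN = re.compile(r"^user_\d+$")


def parse_user_id(raw: str) -> UserId:
    """Return raw as a UserId, or raise ValueError if it is malformed."""
    if USER_ID_PATTERN.search(raw) is None:
        raise ValueError(f"{raw!r} does not match {USER_ID_PATTERN.pattern}")
    return UserId(raw)


def greet(user_id: UserId) -> str:
    # A static type checker rejects a plain str argument here.
    return f"hello {user_id}"


print(greet(parse_user_id("user_42")))  # hello user_42
```

The NewType costs nothing at runtime (a UserId is still a str), but it lets a type checker distinguish validated ids from arbitrary strings.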

The following keys are available (they are sometimes shortened compared to the JSON schema originals, for code concision and snake_case; a / means the keyword is the same as the key, or that there is no type restriction):

Key          JSON schema keyword    Type restriction
title        /                      /
description  /                      /
default      /                      /
examples     /                      /
min          minimum                int
max          maximum                int
exc_min      exclusiveMinimum       int
exc_max      exclusiveMaximum       int
mult_of      multipleOf             int
format       /                      str
media_type   contentMediaType       str
encoding     contentEncoding        str
min_len      minLength              str
max_len      maxLength              str
pattern      /                      str
min_items    minItems               list
max_items    maxItems               list
unique       uniqueItems            list
min_props    minProperties          dict
max_props    maxProperties          dict


In the case of a field schema, the field's default value is serialized (if possible) to add the default keyword to the schema.

Constraints validation

JSON schema constrains the data deserialized; these constraints are naturally used for validation.

from dataclasses import dataclass, field
from typing import NewType

import pytest

from apischema import ValidationError, deserialize, schema

Tag = NewType("Tag", str)
schema(min_len=3, pattern=r"^\w*$", examples=["available", "EMEA"])(Tag)

@dataclass
class Resource:
    id: int
    tags: list[Tag] = field(
        default_factory=list,
        metadata=schema(
            description="regroup multiple resources", max_items=3, unique=True
        ),
    )

with pytest.raises(ValidationError) as err:  # pytest checks that the exception is raised
    deserialize(
        Resource, {"id": 42, "tags": ["tag", "duplicate", "duplicate", "bad&", "_"]}
    )
assert err.value.errors == [
    {"loc": ["tags"], "err": "item count greater than 3 (maxItems)"},
    {"loc": ["tags"], "err": "duplicate items (uniqueItems)"},
    {"loc": ["tags", 3], "err": "not matching pattern ^\\w*$ (pattern)"},
    {"loc": ["tags", 4], "err": "string length lower than 3 (minLength)"},
]


Error messages are fully customizable.
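To make the link between schema keywords and validation concrete, here is a minimal plain-Python sketch of the two array constraints used above. It is not apischema's implementation: array_errors is a hypothetical helper, and its messages merely mirror the error strings of the example.

```python
from typing import Optional


def array_errors(
    items: list, max_items: Optional[int] = None, unique: bool = False
) -> list:
    """Collect constraint violations for a list of hashable items."""
    errors = []
    if max_items is not None and len(items) > max_items:
        errors.append(f"item count greater than {max_items} (maxItems)")
    # Comparing with the set size detects duplicates (items must be hashable).
    if unique and len(set(items)) != len(items):
        errors.append("duplicate items (uniqueItems)")
    return errors


tags = ["tag", "duplicate", "duplicate", "bad&", "_"]
for error in array_errors(tags, max_items=3, unique=True):
    print(error)
# item count greater than 3 (maxItems)
# duplicate items (uniqueItems)
```

Note that every violated constraint is reported, rather than stopping at the first one, which matches the shape of the error list in the example.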

Extra schema

schema has two other arguments, extra and override, which give finer control over the generated JSON schema. They can be used, for example, to build "strict" unions (using oneOf instead of anyOf).

from dataclasses import dataclass
from typing import Annotated, Any

from apischema import schema
from apischema.json_schema import deserialization_schema

# schema extra can be callable to modify the schema in place
def to_one_of(schema: dict[str, Any]):
    if "anyOf" in schema:
        schema["oneOf"] = schema.pop("anyOf")

OneOf = schema(extra=to_one_of)

# or extra can be a dictionary which will update the schema
@schema(
    extra={"$ref": "$defs/Foo"},
    override=True,  # override apischema generated schema, using only extra
)
@dataclass
class Foo:
    bar: int

# Use Annotated with OneOf to make a "strict" Union
assert deserialization_schema(Annotated[Foo | int, OneOf]) == {
    "$schema": "",
    "oneOf": [  # oneOf instead of anyOf
        {"$ref": "$defs/Foo"},
        {"type": "integer"},
    ],
}
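The difference between the two keywords is easy to miss, so here is a small plain-Python sketch of their semantics. The helpers (is_integer, is_number, any_of, one_of) are hypothetical stand-ins for subschemas and validators, not apischema functions.

```python
from typing import Any, Callable, Sequence

Check = Callable[[Any], bool]


def is_integer(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def any_of(value: Any, checks: Sequence[Check]) -> bool:
    """anyOf: at least one alternative accepts the value."""
    return any(check(value) for check in checks)


def one_of(value: Any, checks: Sequence[Check]) -> bool:
    """oneOf: exactly one alternative accepts the value."""
    return sum(1 for check in checks if check(value)) == 1


alternatives = [is_integer, is_number]
print(any_of(1, alternatives))  # True
print(one_of(1, alternatives))  # False: both alternatives match
print(one_of(1.5, alternatives))  # True: only is_number matches
```

A "strict" union is therefore one whose alternatives must not overlap for any accepted value.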

Base schema

apischema.settings.base_schema can be used to define a "base schema" for the different kinds of objects: types, object fields or (serialized) methods.

from dataclasses import dataclass, field
from typing import Any, Callable, get_origin

import docstring_parser

from apischema import schema, serialized, settings
from apischema.json_schema import serialization_schema
from apischema.schemas import Schema
from apischema.type_names import get_type_name

@dataclass
class Foo:
    """Foo class

    :var bar: bar attribute"""

    bar: str = field(metadata=schema(max_len=10))

    @serialized
    @property
    def baz(self) -> int:
        """baz method"""
        ...

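def type_base_schema(tp: Any) -> Schema | None:
    if not hasattr(tp, "__doc__"):
        return None
    return schema(
        title=get_type_name(tp).json_schema,
        description=docstring_parser.parse(tp.__doc__).short_description,
    )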

def field_base_schema(tp: Any, name: str, alias: str) -> Schema | None:
    title = alias.replace("_", " ").capitalize()
    tp = get_origin(tp) or tp  # tp can be generic
    for meta in docstring_parser.parse(tp.__doc__).meta:
        if meta.args == ["var", name]:
            return schema(title=title, description=meta.description)
    return schema(title=title)

def method_base_schema(tp: Any, method: Callable, alias: str) -> Schema | None:
    return schema(
        title=alias.replace("_", " ").capitalize(),
        description=docstring_parser.parse(method.__doc__).short_description,
    )

settings.base_schema.type = type_base_schema
settings.base_schema.field = field_base_schema
settings.base_schema.method = method_base_schema

assert serialization_schema(Foo) == {
    "$schema": "",
    "additionalProperties": False,
    "title": "Foo",
    "description": "Foo class",
    "properties": {
        "bar": {
            "description": "bar attribute",
            "title": "Bar",
            "type": "string",
            "maxLength": 10,
        },
        "baz": {"description": "baz method", "title": "Baz", "type": "integer"},
    },
    "required": ["bar", "baz"],
    "type": "object",
}

Base schema will be merged with schema defined at type/field/method level.

Required field with default value

By default, a dataclass/namedtuple field will be tagged required if it doesn't have a default value.

However, you may want a field to have a default value, for convenience in your code, while still making the field required. Think of a schema model where a version is fixed but required, for example JSON-RPC with "jsonrpc": "2.0". This is done with the required field metadata.

from dataclasses import dataclass, field

import pytest

from apischema import ValidationError, deserialize
from apischema.metadata import required

@dataclass
class Foo:
    bar: int | None = field(default=None, metadata=required)

with pytest.raises(ValidationError) as err:
    deserialize(Foo, {})
assert err.value.errors == [{"loc": ["bar"], "err": "missing property"}]

Additional properties / pattern properties

With Mapping

Schema of a Mapping/dict type is naturally translated to "additionalProperties": <schema of the value type>.

However, when the schema of the key has a pattern, it gives "patternProperties": {<key pattern>: <schema of the value type>} instead.

With dataclass

additionalProperties/patternProperties can be added to dataclasses by using fields annotated with properties metadata. Properties not mapped to regular fields are deserialized into these fields; they must have a Mapping type, or be deserializable from a Mapping, because they are instantiated with a mapping.

from collections.abc import Mapping
from dataclasses import dataclass, field
from typing import Annotated

from apischema import deserialize, properties, schema
from apischema.json_schema import deserialization_schema

@dataclass
class Config:
    active: bool = True
    server_options: Mapping[str, bool] = field(
        default_factory=dict, metadata=properties(pattern=r"^server_")
    )
    client_options: Mapping[
        Annotated[str, schema(pattern=r"^client_")], bool  # noqa: F722
    ] = field(default_factory=dict, metadata=properties(...))
    options: Mapping[str, bool] = field(default_factory=dict, metadata=properties)

assert deserialize(
    Config,
    {"use_lightsaber": True, "server_auto_restart": False, "client_timeout": False},
) == Config(
    True,
    {"server_auto_restart": False},
    {"client_timeout": False},
    {"use_lightsaber": True},
)
assert deserialization_schema(Config) == {
    "$schema": "",
    "type": "object",
    "properties": {"active": {"type": "boolean", "default": True}},
    "additionalProperties": {"type": "boolean"},
    "patternProperties": {
        "^server_": {"type": "boolean"},
        "^client_": {"type": "boolean"},
    },
}


Of course, a dataclass can only have a single properties field without a pattern, because it makes no sense to have several additionalProperties.
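The way incoming keys are distributed between regular fields, pattern fields and the catch-all field can be sketched in plain Python. This is only an approximation of the behavior described above, not apischema's implementation; route_properties and its parameters are hypothetical names.

```python
import re
from typing import Any, Mapping, Sequence


def route_properties(
    data: Mapping[str, Any], known_fields: set, patterns: Sequence[str]
) -> tuple:
    """Split data into per-pattern mappings and remaining properties."""
    by_pattern: dict = {pattern: {} for pattern in patterns}
    additional: dict = {}
    for key, value in data.items():
        if key in known_fields:
            continue  # handled by a regular field
        matched = [pattern for pattern in patterns if re.search(pattern, key)]
        for pattern in matched:
            by_pattern[pattern][key] = value
        if not matched:
            additional[key] = value  # falls under additionalProperties
    return by_pattern, additional


data = {"use_lightsaber": True, "server_auto_restart": False, "client_timeout": False}
by_pattern, additional = route_properties(data, {"active"}, ["^server_", "^client_"])
print(by_pattern)
# {'^server_': {'server_auto_restart': False}, '^client_': {'client_timeout': False}}
print(additional)
# {'use_lightsaber': True}
```

Only keys that match no pattern reach the catch-all mapping, which is why a second pattern-less field would have nothing left to receive.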

Property dependencies

apischema supports property dependencies for dataclasses through a class member. Dependencies are also used in validation.

from dataclasses import dataclass, field

import pytest

from apischema import ValidationError, dependent_required, deserialize
from apischema.json_schema import deserialization_schema
from apischema.skip import NotNull

@dataclass
class Billing:
    name: str
    # Fields used in dependencies MUST be declared with `field`
    credit_card: NotNull[int] = field(default=None)
    billing_address: NotNull[str] = field(default=None)

    dependencies = dependent_required({credit_card: [billing_address]})

# it can also be done outside the class with
# dependent_required({"credit_card": ["billing_address"]}, owner=Billing)

assert deserialization_schema(Billing) == {
    "$schema": "",
    "additionalProperties": False,
    "dependentRequired": {"credit_card": ["billing_address"]},
    "properties": {
        "name": {"type": "string"},
        "credit_card": {"type": "integer"},
        "billing_address": {"type": "string"},
    },
    "required": ["name"],
    "type": "object",
}

with pytest.raises(ValidationError) as err:
    deserialize(Billing, {"name": "Anonymous", "credit_card": 1234_5678_9012_3456})
assert err.value.errors == [
    {
        "loc": ["billing_address"],
        "err": "missing property (required by ['credit_card'])",
    }
]

Because bidirectional dependencies are a common idiom, apischema provides a shortcut notation: it is possible to write dependent_required([credit_card, billing_address]).
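What the shortcut stands for can be sketched in plain Python. The sketch assumes that a list means "each listed property requires all the others" (clear for two properties, an assumption beyond that); expand_group and missing_properties are hypothetical helpers, not apischema functions.

```python
from typing import Any, Mapping, Sequence


def expand_group(group: Sequence[str]) -> dict:
    """Turn [a, b] into {a: [b], b: [a]}: each name requires the others."""
    return {name: [other for other in group if other != name] for name in group}


def missing_properties(
    data: Mapping[str, Any], dependent_required: Mapping[str, Sequence[str]]
) -> list:
    """List properties required by a present property but absent from data."""
    missing: list = []
    for name, required in dependent_required.items():
        if name not in data:
            continue  # the dependency only applies when the property is present
        for other in required:
            if other not in data and other not in missing:
                missing.append(other)
    return missing


dependencies = expand_group(["credit_card", "billing_address"])
print(dependencies)
# {'credit_card': ['billing_address'], 'billing_address': ['credit_card']}
print(missing_properties({"name": "Anonymous", "credit_card": 1234}, dependencies))
# ['billing_address']
```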

JSON schema reference

For complex schemas with type reuse, it's convenient to extract the definitions of schema components in order to reuse them (it's even mandatory for recursive types); JSON schema uses JSON pointers in "$ref" to refer to the definitions. apischema handles this feature natively.

from dataclasses import dataclass
from typing import Optional

from apischema.json_schema import deserialization_schema

@dataclass
class Node:
    value: int
    child: Optional["Node"] = None

assert deserialization_schema(Node) == {
    "$schema": "",
    "$ref": "#/$defs/Node",
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "value": {"type": "integer"},
                "child": {
                    "anyOf": [{"$ref": "#/$defs/Node"}, {"type": "null"}],
                    "default": None,
                },
            },
            "required": ["value"],
            "additionalProperties": False,
        }
    },
}

Use reference only for reused types

apischema can control the use of references through the boolean all_refs parameter of deserialization_schema/serialization_schema:

  • all_refs=True -> all types with a reference will be put in the definitions and referenced with $ref;
  • all_refs=False -> only types which are reused in the schema are put in definitions

all_refs default value depends on the JSON schema version: it's False for JSON schema drafts but True for OpenAPI.

from dataclasses import dataclass

from apischema.json_schema import deserialization_schema

@dataclass
class Bar:
    baz: str

@dataclass
class Foo:
    bar1: Bar
    bar2: Bar

assert deserialization_schema(Foo, all_refs=False) == {
    "$schema": "",
    "$defs": {
        "Bar": {
            "additionalProperties": False,
            "properties": {"baz": {"type": "string"}},
            "required": ["baz"],
            "type": "object",
        }
    },
    "additionalProperties": False,
    "properties": {"bar1": {"$ref": "#/$defs/Bar"}, "bar2": {"$ref": "#/$defs/Bar"}},
    "required": ["bar1", "bar2"],
    "type": "object",
}
assert deserialization_schema(Foo, all_refs=True) == {
    "$schema": "",
    "$defs": {
        "Bar": {
            "additionalProperties": False,
            "properties": {"baz": {"type": "string"}},
            "required": ["baz"],
            "type": "object",
        },
        "Foo": {
            "additionalProperties": False,
            "properties": {
                "bar1": {"$ref": "#/$defs/Bar"},
                "bar2": {"$ref": "#/$defs/Bar"},
            },
            "required": ["bar1", "bar2"],
            "type": "object",
        },
    },
    "$ref": "#/$defs/Foo",
}

Set reference name

In the previous examples, types were referenced using their name. This is indeed the default behavior for every class/NewType (except the primitives int/str/bool/float).

It's possible to override the default reference name using apischema.type_name; passing None instead of a string will remove the reference, making the type unable to be referenced as a separate definition in the schema.

from dataclasses import dataclass
from typing import Annotated

from apischema import type_name
from apischema.json_schema import deserialization_schema

# Type name can be added as a decorator
@type_name("Resource")
@dataclass
class BaseResource:
    id: int
    # or using typing.Annotated
    tags: Annotated[set[str], type_name("ResourceTags")]

assert deserialization_schema(BaseResource, all_refs=True) == {
    "$schema": "",
    "$defs": {
        "Resource": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "tags": {"$ref": "#/$defs/ResourceTags"},
            },
            "required": ["id", "tags"],
            "additionalProperties": False,
        },
        "ResourceTags": {
            "type": "array",
            "items": {"type": "string"},
            "uniqueItems": True,
        },
    },
    "$ref": "#/$defs/Resource",
}


Builtin collections are interchangeable when a type_name is registered. For example, if a name is registered for list[Foo], this name will also be used for Sequence[Foo] or Collection[Foo].

Generic aliases can have a type name, but they need to be specialized; Foo[T, int] cannot have a type name but Foo[str, int] can. However, generic classes can get a dynamic type name depending on their generic arguments, by passing a name factory to type_name:

from dataclasses import dataclass, field
from typing import Generic, TypeVar

from apischema import type_name
from apischema.json_schema import deserialization_schema
from apischema.metadata import flatten

T = TypeVar("T")

# Type name factory takes the type and its arguments as (positional) parameters
@type_name(lambda tp, arg: f"{arg.__name__}Resource")
@dataclass
class Resource(Generic[T]):
    id: int
    content: T = field(metadata=flatten)

@dataclass
class Foo:
    bar: str

assert deserialization_schema(Resource[Foo], all_refs=True) == {
    "$schema": "",
    "$ref": "#/$defs/FooResource",
    "$defs": {
        "FooResource": {
            "allOf": [
                {
                    "type": "object",
                    "properties": {"id": {"type": "integer"}},
                    "required": ["id"],
                    "additionalProperties": False,
                },
                {"$ref": "#/$defs/Foo"},
            ],
            "unevaluatedProperties": False,
        },
        "Foo": {
            "type": "object",
            "properties": {"bar": {"type": "string"}},
            "required": ["bar"],
            "additionalProperties": False,
        },
    },
}

The default behavior can also be customized using apischema.settings.default_type_name.

Reference factory

In JSON schema, $ref looks like #/$defs/Foo, not just Foo. In fact, schema generation uses the ref given by type_name/default_type_name and passes it to a ref_factory function (a parameter of the schema generation functions), which converts it to its final form. Each JSON schema version comes with its default ref_factory: for draft 2020-12, it prefixes the ref with #/$defs/, while for OpenAPI it prefixes it with #/components/schemas/.

from dataclasses import dataclass

from apischema.json_schema import deserialization_schema

@dataclass
class Foo:
    bar: int

def ref_factory(ref: str) -> str:
    return f"{ref}.json#"

assert deserialization_schema(Foo, all_refs=True, ref_factory=ref_factory) == {
    "$schema": "",
    "$ref": "",
}


When ref_factory is passed as an argument, definitions are not added to the generated schema. That's because ref_factory would surely change the location of the definitions, so there would be no point in adding them at a wrong location. These definitions can of course be generated separately with definitions_schema.
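For comparison, the prefixing conventions described above can be written as three small functions. The function names are hypothetical; only the prefixes themselves come from the text, and the one-file-per-type layout is an illustrative assumption.

```python
def defs_ref(name: str) -> str:
    """Draft 2020-12 style: definitions live under "$defs"."""
    return f"#/$defs/{name}"


def openapi_ref(name: str) -> str:
    """OpenAPI style: definitions live under "components/schemas"."""
    return f"#/components/schemas/{name}"


def external_file_ref(name: str) -> str:
    """One file per type (hypothetical layout, for illustration)."""
    return f"{name}.json#"


for factory in (defs_ref, openapi_ref, external_file_ref):
    print(factory("Foo"))
# #/$defs/Foo
# #/components/schemas/Foo
# Foo.json#
```

Only the first two locations exist inside the generated document, which is why a custom factory goes together with definitions generated and stored separately.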

Definitions schema

Definitions schemas can also be extracted using apischema.json_schema.definitions_schema. It takes two lists of types (or tuples of type + dynamic conversion), deserialization and serialization, and returns a dictionary of all referenced schemas.


This is especially useful when it comes to OpenAPI schema to generate the components section.
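As a sketch of that use, the dictionary of definitions can be placed under components/schemas of an OpenAPI document. The code below is plain Python and makes several assumptions: build_openapi_document is a hypothetical helper, the title and version are placeholders, and the definitions are a hand-written stand-in for a generated dictionary.

```python
import json
from typing import Any


def build_openapi_document(definitions: dict) -> dict:
    """Wrap schema definitions into a minimal OpenAPI 3.1 document."""
    return {
        "openapi": "3.1.0",
        "info": {"title": "Example API", "version": "1.0.0"},  # placeholders
        "paths": {},
        "components": {"schemas": definitions},
    }


# Hand-written stand-in for a generated definitions dictionary.
definitions: dict = {
    "Foo": {
        "type": "object",
        "properties": {"bar": {"$ref": "#/components/schemas/Bar"}},
        "required": ["bar"],
        "additionalProperties": False,
    },
    "Bar": {
        "type": "object",
        "properties": {"baz": {"type": "integer", "default": 0}},
        "additionalProperties": False,
    },
}

document = build_openapi_document(definitions)
print(json.dumps(document["components"], indent=2))
```

The references must already use the #/components/schemas/ prefix for the document to be consistent, which is what generating the definitions with an OpenAPI version provides.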

from dataclasses import dataclass

from apischema.json_schema import definitions_schema

@dataclass
class Bar:
    baz: int = 0

@dataclass
class Foo:
    bar: Bar

assert definitions_schema(deserialization=[list[Foo]], all_refs=True) == {
    "Foo": {
        "type": "object",
        "properties": {"bar": {"$ref": "#/$defs/Bar"}},
        "required": ["bar"],
        "additionalProperties": False,
    },
    "Bar": {
        "type": "object",
        "properties": {"baz": {"type": "integer", "default": 0}},
        "additionalProperties": False,
    },
}

JSON schema / OpenAPI version

JSON schema has several versions, and OpenAPI is treated as a JSON schema version. Although apischema natively uses the latest one, draft 2020-12, it is possible to specify the schema version used for the generation.

from dataclasses import dataclass
from typing import Literal

from apischema.json_schema import (
    JsonSchemaVersion,
    definitions_schema,
    deserialization_schema,
)

@dataclass
class Bar:
    baz: int | None
    constant: Literal[0] = 0

@dataclass
class Foo:
    bar: Bar

assert deserialization_schema(Foo, all_refs=True) == {
    "$schema": "",
    "$ref": "#/$defs/Foo",
    "$defs": {
        "Foo": {
            "type": "object",
            "properties": {"bar": {"$ref": "#/$defs/Bar"}},
            "required": ["bar"],
            "additionalProperties": False,
        },
        "Bar": {
            "type": "object",
            "properties": {
                "baz": {"type": ["integer", "null"]},
                "constant": {"type": "integer", "const": 0, "default": 0},
            },
            "required": ["baz"],
            "additionalProperties": False,
        },
    },
}
assert deserialization_schema(
    Foo, all_refs=True, version=JsonSchemaVersion.DRAFT_7
) == {
    "$schema": "",
    # $ref is isolated in allOf + draft 7 prefix
    "allOf": [{"$ref": "#/definitions/Foo"}],
    "definitions": {  # not "$defs"
        "Foo": {
            "type": "object",
            "properties": {"bar": {"$ref": "#/definitions/Bar"}},
            "required": ["bar"],
            "additionalProperties": False,
        },
        "Bar": {
            "type": "object",
            "properties": {
                "baz": {"type": ["integer", "null"]},
                "constant": {"type": "integer", "const": 0, "default": 0},
            },
            "required": ["baz"],
            "additionalProperties": False,
        },
    },
}
assert deserialization_schema(Foo, version=JsonSchemaVersion.OPEN_API_3_1) == {
    # No definitions for OpenAPI, use definitions_schema for it
    "$ref": "#/components/schemas/Foo"  # OpenAPI prefix
}
assert definitions_schema(
    deserialization=[Foo], version=JsonSchemaVersion.OPEN_API_3_1
) == {
    "Foo": {
        "type": "object",
        "properties": {"bar": {"$ref": "#/components/schemas/Bar"}},
        "required": ["bar"],
        "additionalProperties": False,
    },
    "Bar": {
        "type": "object",
        "properties": {
            "baz": {"type": ["integer", "null"]},
            "constant": {"type": "integer", "const": 0, "default": 0},
        },
        "required": ["baz"],
        "additionalProperties": False,
    },
}
assert definitions_schema(
    deserialization=[Foo], version=JsonSchemaVersion.OPEN_API_3_0
) == {
    "Foo": {
        "type": "object",
        "properties": {"bar": {"$ref": "#/components/schemas/Bar"}},
        "required": ["bar"],
        "additionalProperties": False,
    },
    "Bar": {
        "type": "object",
        # "nullable" instead of "type": "null"
        "properties": {
            "baz": {"type": "integer", "nullable": True},
            "constant": {"type": "integer", "enum": [0], "default": 0},
        },
        "required": ["baz"],
        "additionalProperties": False,
    },
}

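The two differences visible in the OpenAPI 3.0 output (nullable instead of a "null" type, and a single-value enum instead of const) can be expressed as a small rewrite of a property schema. This is a plain-Python sketch that only handles flat schemas; to_openapi_3_0 is a hypothetical helper, not how apischema generates its output.

```python
from typing import Any


def to_openapi_3_0(schema: dict) -> dict:
    """Rewrite a flat property schema using OpenAPI 3.0 conventions."""
    result = dict(schema)  # shallow copy, the input is left untouched
    types = result.get("type")
    if isinstance(types, list) and "null" in types:
        remaining = [tp for tp in types if tp != "null"]
        if len(remaining) == 1:
            # OpenAPI 3.0 has no "null" type: use "nullable" instead.
            result["type"] = remaining[0]
            result["nullable"] = True
    if "const" in result:
        # "const" is not available in OpenAPI 3.0: use a single-value enum.
        result["enum"] = [result.pop("const")]
    return result


print(to_openapi_3_0({"type": ["integer", "null"]}))
# {'type': 'integer', 'nullable': True}
print(to_openapi_3_0({"type": "integer", "const": 0, "default": 0}))
# {'type': 'integer', 'default': 0, 'enum': [0]}
```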
OpenAPI Discriminator

OpenAPI defines a discriminator object which can be used to shortcut deserialization of union of object types.

apischema provides two different ways to declare a discriminator:

  • as Annotated metadata of a union;

    from dataclasses import dataclass
    from typing import Annotated, Union

    import pytest

    from apischema import ValidationError, deserialize, discriminator, serialize
    from apischema.json_schema import deserialization_schema

    @dataclass
    class Cat:
        pass

    @dataclass
    class Dog:
        pass

    @dataclass
    class Lizard:
        pass

    Pet = Annotated[Union[Cat, Dog, Lizard], discriminator("type", {"dog": Dog})]

    assert deserialize(Pet, {"type": "dog"}) == Dog()
    assert deserialize(Pet, {"type": "Cat"}) == Cat()
    assert serialize(Pet, Dog()) == {"type": "dog"}
    with pytest.raises(ValidationError) as err:
        deserialize(Pet, {"type": "not a pet"})
    assert err.value.errors == [
        {"loc": ["type"], "err": "not one of ['dog', 'Cat', 'Lizard'] (oneOf)"}
    ]
    assert deserialization_schema(Pet) == {
        "oneOf": [
            {"$ref": "#/$defs/Cat"},
            {"$ref": "#/$defs/Dog"},
            {"$ref": "#/$defs/Lizard"},
        ],
        "discriminator": {"propertyName": "type", "mapping": {"dog": "#/$defs/Dog"}},
        "$defs": {
            "Dog": {"type": "object", "additionalProperties": False},
            "Cat": {"type": "object", "additionalProperties": False},
            "Lizard": {"type": "object", "additionalProperties": False},
        },
        "$schema": "",
    }

  • as a decorator of base class.

    from dataclasses import dataclass
    from typing import Union

    from apischema import deserialize, discriminator, serialize
    from apischema.json_schema import deserialization_schema

    @discriminator("type")
    class Pet:
        pass

    @dataclass
    class Cat(Pet):
        pass

    @dataclass
    class Dog(Pet):
        pass

    data = {"type": "Dog"}
    assert deserialize(Pet, data) == deserialize(Union[Cat, Dog], data) == Dog()
    assert serialize(Pet, Dog()) == serialize(Union[Cat, Dog], Dog()) == data
    assert deserialization_schema(Union[Cat, Dog]) == {
        "$schema": "",
        "anyOf": [{"$ref": "#/$defs/Cat"}, {"$ref": "#/$defs/Dog"}],
        "$defs": {
            "Pet": {
                "type": "object",
                "required": ["type"],
                "properties": {"type": {"type": "string"}},
                "discriminator": {"propertyName": "type"},
            },
            "Cat": {"allOf": [{"$ref": "#/$defs/Pet"}, {"type": "object"}]},
            "Dog": {"allOf": [{"$ref": "#/$defs/Pet"}, {"type": "object"}]},
        },
    }


Using a discriminator doesn't require a dedicated field (except for TypedDict).

Performance of union deserialization can be improved using discriminator.
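The reason a discriminator can speed things up is that the target type is found with one dictionary lookup instead of trying each alternative in turn. The sketch below shows that idea in plain Python; select_alternative and pet_mapping are hypothetical names, and this is not apischema's implementation.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class Cat:
    pass


@dataclass
class Dog:
    pass


def select_alternative(
    data: Mapping[str, Any], property_name: str, mapping: Mapping[str, type]
) -> type:
    """Pick the target class from the discriminator property in one lookup."""
    tag = data.get(property_name)
    if not isinstance(tag, str) or tag not in mapping:
        raise ValueError(f"not one of {list(mapping)}")
    return mapping[tag]


pet_mapping = {"Cat": Cat, "dog": Dog}
print(select_alternative({"type": "dog"}, "type", pet_mapping).__name__)  # Dog
```

Without the discriminator, a deserializer would typically have to attempt each alternative of the union until one succeeds.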

readOnly / writeOnly

Dataclasses InitVar and field(init=False) fields will be flagged respectively with "writeOnly": true and "readOnly": true in the generated schema.

In the definitions schema, if a type appears in both deserialization and serialization, its properties are merged, and the resulting schema then contains both readOnly and writeOnly properties. The required keyword, however, is not merged, because it cannot be (validation would break if a non-init field were required); the deserialization required is kept, as it matters more because it can be used in validation. The OpenAPI 3.0 semantics that allowed the merge were dropped in 3.1, so supporting them was not judged useful.