Python metaclasses
Ionel Cristian Mărieș has a great explanation of Python metaclasses. He thinks it's the best around (at least at the time he wrote it 5 years ago), and I mostly agree, but it doesn't quite scratch my itch for an intuition of metaclasses. After some study and experimentation, I want to share my own.
Here are some brief takeaways that I explain in this post:
- A class definition is a statement that constructs a class object and binds it to a variable in the local scope.
- A class object is a callable. Calling a class object typically returns an instance of that class.
- Every class object is an instance of its metaclass. The type of a class object is its metaclass.
- A class body is a block of statements, just like a function body. At the end of that block, the variables in the local scope become attributes of the class object.
- Metaclasses let us change the way a class definition constructs a class object. We can even make a class definition construct a value that is not a class object.
- Metaclasses let us change how instances of a class are constructed. We can even make a call of a class object return a value that is not an instance of that class.
Class definitions are assignment statements
python, the program, is an interpreter that follows a path through its own code and modifies its own internal state as it reads your code. The content of your code affects the path that it takes and the modifications that it makes. Consider a simple assignment in your code:
a = 1
This assignment instructs python, the program, to compute the value of the expression on the right-hand side of the assignment operator (=), and then add or update a binding, in the current scope, associating the identifier a with that value.
Similarly, a class definition in Python, the language, is a set of instructions for python, the program. A class definition instructs python to compute a class object and then add or update a binding, in the current scope, associating the name of the class with that class object. In other words, a class definition in Python is a statement[1], and an assignment statement at that.
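A small sketch of that claim, using made-up names (Greeter, Greeter2, Alias): the class statement binds a name the same way an assignment does, so the name can be aliased and rebound like any other variable. The three-argument call to type is shown only as a rough equivalent of the simplest class statement.

```python
# A class statement computes a class object and binds it to a name.
class Greeter:
    greeting = "hello"

# Roughly the same effect, written as an explicit assignment.
Greeter2 = type("Greeter2", (), {"greeting": "hello"})

# The binding behaves like any other variable binding.
Alias = Greeter   # a second name for the same class object
Greeter = 42      # rebinding the original name does not touch the class

print(Alias.greeting, Alias.__name__, Greeter)
```

The class object keeps the name it was defined with (Alias.__name__ is still "Greeter") even after the variable Greeter points at something else.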
Class definition syntax
To discuss how a class definition statement is executed, it helps to have names for its different parts. Consider the example class definition below. Most class definitions I read or write in the wild don't explicitly exercise all of these parts, but a metaclass author needs to understand them all.
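The example listing can be reconstructed from the values quoted later in this post (Example, base1, base2, Meta, kw1, kw2, attr1, attr2, method, and the docstring). The definitions of base1, base2, and Meta below are placeholders I am assuming so that the sketch runs on its own; they are not part of the example itself.

```python
# Placeholder base classes, assumed for illustration.
class base1:
    pass

class base2:
    pass

# Placeholder metaclass, assumed for illustration. It swallows the extra
# keyword arguments (kw1, kw2) so that type.__new__ does not reject them.
class Meta(type):
    def __new__(metacls, name, bases, namespace, **kwargs):
        return super().__new__(metacls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)

# The example class definition: header on the first line, body below it.
class Example(base1, base2, metaclass=Meta, kw1=1, kw2=2):
    """An example class."""
    attr1: str   # variable annotation without an assignment
    attr2 = 0    # assignment

    def method(self):  # a function definition is a form of assignment
        pass
```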
The first line is the class header. It has four parts:
- A class name.
- Zero or more base classes. If unspecified, the default is (object,).
- An optional metaclass. To keep it simple, we'll say the default is type.[2]
- Zero or more additional keyword arguments. If unspecified, the default is an empty mapping. Note that metaclass is a special keyword argument not included in this mapping.
The rest of the lines, indented one level, form the class body. It is a block of statements, any statements, but a few statements are treated specially:
- An optional docstring. For a string literal to be a docstring, it must be the first statement in the body.
- Zero or more assignments or variable annotations. In this context, function and class definitions are forms of assignments: they assign a function object or class object to their function name or class name, respectively.
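To make "any statements" concrete, here is a sketch with a hypothetical Config class whose body contains a conditional and a loop. Every name still bound when the body finishes, including the loop variable, ends up as a class attribute.

```python
class Config:
    """Settings computed while the class body runs."""
    debug = False

    # Ordinary control flow is allowed in a class body.
    if debug:
        level = "verbose"
    else:
        level = "quiet"

    total = 0
    for n in (1, 2, 3):
        total += n

    def describe(self):
        return f"{self.level}, total={self.total}"

print(Config.level, Config.total, Config.n)
```

The leftover loop variable n becoming Config.n is a side effect worth knowing about: nothing in the class body is private to the body.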
Calling a callable
There's one more thing we need to talk about before getting into metaclasses.
When you call a "callable" in Python, the expression x(...) is evaluated as if it were x.__call__(...). The interpreter looks up the __call__ method in the class hierarchy of type(x) and, like any other method, calls it with the object (x) as the first argument. In other words, x(...) is equivalent to type(x).__call__(x, ...).
If x is a function object, then type(x) is a built-in function class with a __call__ method that does what you expect.
If x is a class object, then calling x(...) is how we instantiate an object of type x. It works just like calling any other callable, by calling type(x).__call__(x, ...). In that expression, type(x) is the metaclass of x, because every class object is an instance of its metaclass.
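The equivalence can be checked directly. The sketch below uses made-up names (greet, Point, Plain) and also shows why the lookup is described as happening on type(x): a __call__ attribute stored on the instance alone does not make the object callable.

```python
def greet(name):
    return f"hello, {name}"

# Calling a function object goes through the __call__ of its type.
assert greet("world") == type(greet).__call__(greet, "world")

class Point:
    def __init__(self, x):
        self.x = x

# Point is an instance of type, so Point(3) goes through type.__call__.
assert type(Point) is type
p = type(Point).__call__(Point, 3)
assert isinstance(p, Point) and p.x == 3

class Plain:
    pass

obj = Plain()
obj.__call__ = lambda: "never reached"  # set on the instance, not on type(obj)
try:
    obj()
    instance_attribute_ignored = False
except TypeError:
    # type(obj) has no __call__, so the call fails despite the attribute.
    instance_attribute_ignored = True

print(instance_attribute_ignored)
```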
Metaclasses hook into class definition execution
The official Python tutorial (which I hadn't read before writing this post) explains that a class definition is a statement and gives a shallow summary of how it is executed. Every class has a metaclass, and the metaclass plays a key role in the execution of the class definition. In other words, metaclasses enable you to change the way a class definition is executed, even whether the value it computes is a class object at all![3]
Let's walk through the execution of a class definition, recalling the example above for class Example that specifies metaclass=Meta. There are three phases to executing a class definition: before, during, and after the class body.
Before the class body: prepare the namespace
- After parsing the class header, the interpreter calls Meta.__prepare__(name, bases, **kwargs) where name='Example', bases=(base1, base2), and kwargs={'kw1': 1, 'kw2': 2}. That call must return a mapping called the namespace. The default implementation returns an empty dict, but a metaclass can return a different type as long as it has __getitem__ and __setitem__.
- The interpreter calls namespace.__setitem__('__module__', __name__). Remember that __name__ is a str, the name of the current module.
- The interpreter calls namespace.__setitem__('__qualname__', 'Example').
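One way to see what __prepare__ receives is to record its arguments. This is a sketch under assumptions: the prepare_calls list is a name I made up for the demonstration, and base1 and base2 are empty stand-ins for whatever the real bases would be.

```python
class Meta(type):
    prepare_calls = []  # hypothetical log, for demonstration only

    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        Meta.prepare_calls.append((name, bases, kwargs))
        return {}  # same as the default: an empty dict

    # Swallow the extra keyword arguments so type.__new__ does not see them.
    def __new__(metacls, name, bases, namespace, **kwargs):
        return super().__new__(metacls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)

class base1:
    pass

class base2:
    pass

class Example(base1, base2, metaclass=Meta, kw1=1, kw2=2):
    pass

print(Meta.prepare_calls)
```

Note that metaclass=Meta does not show up in kwargs; only kw1 and kw2 do.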
During the class body: fill the namespace
- If there is at least one variable annotation in the body, then the interpreter calls namespace.__setitem__('__annotations__', {}). That value is an empty dict. It will be filled with annotations keyed by variable name by the time the interpreter exits the class body. In our example, the final state of the annotations mapping will be {'attr1': str}.
- If the class has a docstring, then the interpreter calls namespace.__setitem__('__doc__', docstring).
- The interpreter executes every statement in the class body as if it were any other block, e.g. a function body. Assignments, including function and class definitions, create and update bindings in the local scope, and for the class body that scope is represented by the namespace. Thus, for each assignment in the class body, the interpreter calls namespace.__setitem__(name, value). Our example creates bindings in the namespace for attr2 and method. attr1 has a variable annotation, but not an assignment, so it never adds a binding to the namespace.
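The __setitem__ calls described in the last two sections can be observed by having __prepare__ return a dict subclass that records them. This is a sketch: RecordingDict and the assigned attribute are names I made up, and the exact set of bookkeeping keys the interpreter writes (beyond __module__ and __qualname__) varies between Python versions, so the output is printed rather than promised.

```python
class RecordingDict(dict):
    """A dict that remembers the keys passed to __setitem__, in order."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        return RecordingDict()

    def __new__(metacls, name, bases, namespace, **kwargs):
        # Hand type.__new__ a plain dict, then keep the log on the class.
        cls = super().__new__(metacls, name, bases, dict(namespace))
        cls.assigned = tuple(namespace.assigned)
        return cls


class Example(metaclass=Meta):
    """An example class."""
    attr1: str
    attr2 = 0

    def method(self):
        pass


print(Example.assigned)
```

In the recorded order, __module__ and __qualname__ come first, attr2 and method appear in source order, and attr1 never appears because it was annotated but not assigned.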
After the class body: construct the class object
- The interpreter calls Meta(name, bases, namespace, **kwargs). That is, it tries to construct an instance of class Meta, which would be a class object with metaclass Meta. name, bases, and kwargs are the same as when they were passed to Meta.__prepare__. In our example, namespace looks like this:

      {
          '__module__': '__main__',
          '__qualname__': 'Example',
          '__annotations__': {
              'attr1': str
          },
          '__doc__': 'An example class.',
          'attr2': 0,
          'method': <function Example.method at 0x12345678>
      }

  Remember that this call of Meta translates to type(Meta).__call__(Meta, name, bases, namespace, **kwargs) where type(Meta) is the metaclass of Meta. In practice, most metaclasses will have type as their metaclass because it is the default metaclass. If that is the case for Meta, then this is a call of type.__call__, which has an implementation that resembles this:

      def __call__(metacls, name, bases, namespace, **kwargs):
          cls = metacls.__new__(metacls, name, bases, namespace, **kwargs)
          if isinstance(cls, metacls):
              metacls.__init__(cls, name, bases, namespace, **kwargs)
          return cls

  Note the calls to metacls.__new__ and metacls.__init__. If the metaclass does not override these methods, then they must be found on a superclass. In practice, most metaclasses will have type as their base class[4] because it is the prototypical metaclass. Thus, the default implementations of these methods come from type:

  - type.__new__ constructs a class object (call it cls) and sets its attributes:
    - cls.__name__ comes from the second parameter, name.
    - cls.__module__ comes from namespace['__module__']. If that key is missing, then the default is the module of the calling scope.
    - cls.__qualname__ comes from namespace['__qualname__']. If that key is missing, then the default is cls.__name__.
  - type.__init__ does nothing.
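The defaults that type.__new__ applies are easy to poke at by calling type with three arguments, which takes the same route as the end of a class definition. The class names C and D and the values "somewhere" and "Outer.D" are made up for this sketch.

```python
# An empty namespace: type.__new__ falls back to its defaults.
C = type("C", (), {})
print(C.__name__, C.__qualname__, C.__module__)

# A namespace that supplies the two keys: type.__new__ uses them as given.
D = type("D", (), {"__module__": "somewhere", "__qualname__": "Outer.D"})
print(D.__name__, D.__qualname__, D.__module__)
```

For C, the qualified name falls back to the plain name. For D, both values come straight from the namespace, which is exactly what the interpreter relies on when it writes __module__ and __qualname__ before running the class body.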
Note the implications here:
- The __call__ method of a metaclass is used to construct instances of classes defined with that metaclass. If a metaclass inherits from the default metaclass type without overriding __call__, then it will inherit the implementation from type that I shared above. Well, not quite that implementation, but if we change some variable names in that listing, we can see how it generalizes:

      def __call__(cls, *args, **kwargs):
          obj = cls.__new__(cls, *args, **kwargs)
          if isinstance(obj, cls):
              cls.__init__(obj, *args, **kwargs)
          return obj

- If you want a class definition to build a value that is not a class object, you can do that by overriding the __new__ method of the metaclass for that definition, making it return whatever you want.[5]
- If you want to change what it means to instantiate a class defined with your metaclass, you can do that by overriding the __call__ method of your metaclass. You can even change it to return a value that is not an instance of the given class at all.
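The last two implications can be sketched with a few toy metaclasses. All names here (KeysOnly, Names, Cached, Settings, Describe, Pair) are invented for illustration, and none of this is a recommendation to write code this way.

```python
class KeysOnly(type):
    """Metaclass whose __new__ returns a list instead of a class object."""

    def __new__(metacls, name, bases, namespace, **kwargs):
        return sorted(key for key in namespace if not key.startswith("__"))


class Names(metaclass=KeysOnly):
    beta = 1
    alpha = 2

# The class statement still binds the name, but to whatever __new__ returned.
print(Names)


class Cached(type):
    """Metaclass whose __call__ hands back one shared instance per class."""

    def __call__(cls, *args, **kwargs):
        if "_instance" not in cls.__dict__:
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Settings(metaclass=Cached):
    def __init__(self):
        self.values = {}


class Describe(type):
    """Metaclass whose __call__ never constructs an instance at all."""

    def __call__(cls, *args, **kwargs):
        return (cls.__name__, args)


class Pair(metaclass=Describe):
    pass


print(Settings() is Settings(), Pair(1, 2))
```

Because the list returned by KeysOnly.__new__ is not an instance of KeysOnly, the __init__ step is skipped, matching the isinstance check in the listing above.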
A metaclass template
Here's an example metaclass that behaves like the default metaclass, type, while spelling out every hook explicitly instead of silently inheriting it. It still has to derive from type, because type.__new__ refuses to build a class object for a metaclass that is not a subclass of type. I find this helps me understand, concisely, the capabilities and responsibilities of metaclasses. In practice, I always derive my metaclasses from type, and override only those methods whose behaviors I need to change.

    class Meta(type):
        @classmethod
        def __prepare__(metacls, name, bases, **kwargs):
            assert issubclass(metacls, Meta)
            return {}

        def __new__(metacls, name, bases, namespace, **kwargs):
            """Construct a class object for a class whose metaclass is Meta."""
            assert issubclass(metacls, Meta)
            # Extra keyword arguments are not forwarded to type.__new__ here.
            cls = type.__new__(metacls, name, bases, namespace)
            return cls

        def __init__(cls, name, bases, namespace, **kwargs):
            assert isinstance(cls, Meta)

        def __call__(cls, *args, **kwargs):
            """Construct an instance of a class whose metaclass is Meta."""
            assert isinstance(cls, Meta)
            obj = cls.__new__(cls, *args, **kwargs)
            if isinstance(obj, cls):
                cls.__init__(obj, *args, **kwargs)
            return obj
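To watch the four hooks fire in order, a metaclass can log each call and then defer to type. This is a sketch with invented names (Traced, Widget, calls); it derives from type and uses super() so that the default behavior is unchanged.

```python
calls = []  # hypothetical log of hook names, in the order they run


class Traced(type):
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        calls.append("__prepare__")
        return super().__prepare__(name, bases, **kwargs)

    def __new__(metacls, name, bases, namespace, **kwargs):
        calls.append("__new__")
        return super().__new__(metacls, name, bases, namespace, **kwargs)

    def __init__(cls, name, bases, namespace, **kwargs):
        calls.append("__init__")
        super().__init__(name, bases, namespace, **kwargs)

    def __call__(cls, *args, **kwargs):
        calls.append("__call__")
        return super().__call__(*args, **kwargs)


class Widget(metaclass=Traced):
    pass

after_definition = list(calls)  # hooks that ran for the class statement
widget = Widget()               # instantiation goes through Traced.__call__

print(after_definition, calls)
```

The class statement triggers __prepare__, __new__, and __init__; Traced.__call__ only runs later, when Widget is instantiated, because defining Widget calls Traced itself, and that call is handled by the __call__ of Traced's own metaclass, type.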
Footnotes
[1] Compare to class definitions in compiled languages like C++ or Java, where they are declarations.
[2] The default metaclass is the most-derived class among the metaclasses of the base classes, and an ambiguity raises a TypeError. The default base class object has the metaclass type. Ionel has the full explanation.
[3] Metaclasses don't let you change the effect of binding the class name to the computed value, though. Nothing can change that.
[4] Yes, most metaclasses in practice have type as both their base class and their metaclass.
[5] Or you can override the __call__ method of the metaclass's metaclass, but I don't recommend it. That will surprise even metaclass authors.