Improve Serialization Performance in Django Rest Framework

This article was originally published at on June 8, 2019. Do yourself a favor and read this article with proper syntax highlighting.

When a developer chooses Python, Django, or Django Rest Framework, it’s usually not because of blazing fast performance. Python has always been the “comfortable” choice, the language you choose when you care more about ergonomics than about shaving a few microseconds off some process.

There is nothing wrong with ergonomics. Most projects don’t really need that microsecond performance boost, but they do need to ship quality code fast.

None of this means performance is not important. As this story taught us, major performance boosts can be gained with just a little attention and a few small changes.

Model Serializer Performance

A while back we noticed very poor performance from one of our main API endpoints. The endpoint fetched data from a very large table, so we naturally assumed that the problem must be in the database.

When we noticed that even small data sets get poor performance, we started looking into other parts of the app. This journey eventually led us to Django Rest Framework (DRF) serializers.

In the benchmark we use Python 3.7, Django 2.1.1 and Django Rest Framework 3.9.4.

Simple Function

Serializers are used for transforming data into objects, and objects into data. This is a simple task, so let’s write a function that accepts a User instance and returns a dict:

from typing import Dict, Any

from django.contrib.auth.models import User

def serialize_user(user: User) -> Dict[str, Any]:
    return {
        'id': user.id,
        'last_login': user.last_login.isoformat() if user.last_login is not None else None,
        'is_superuser': user.is_superuser,
        'username': user.username,
        'first_name': user.first_name,
        'last_name': user.last_name,
        'email': user.email,
        'is_staff': user.is_staff,
        'is_active': user.is_active,
        'date_joined': user.date_joined.isoformat(),
    }

Create a user to use in the benchmark:

>>> from django.contrib.auth.models import User
>>> u = User.objects.create_user(
...     username='hakib',
...     first_name='haki',
...     last_name='benita',
...     email='',
... )

For our benchmark we are using cProfile. To eliminate external influences such as the database, we fetch a user in advance and serialize it 5,000 times:

>>> import cProfile
>>> cProfile.run('for i in range(5000): serialize_user(u)', sort='tottime')
15003 function calls in 0.034 seconds

Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
5000 0.020 0.000 0.021 0.000 {method 'isoformat' of 'datetime.datetime' objects}
5000 0.010 0.000 0.030 0.000
1 0.003 0.003 0.034 0.034 <string>:1(<module>)
5000 0.001 0.000 0.001 0.000
1 0.000 0.000 0.034 0.034 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

The simple function took 0.034 seconds to serialize a User object 5,000 times.
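cProfile instruments every function call, so the absolute numbers it reports include profiling overhead. As a cross-check, a wall-clock measurement with timeit can be sketched without Django at all. The `user` object below is a hypothetical stand-in for a real User instance, and the field list is trimmed for brevity:

```python
import timeit
from datetime import datetime, timezone
from types import SimpleNamespace

# Hypothetical stand-in for a Django User instance (assumption, not a real model).
user = SimpleNamespace(
    username='hakib',
    is_active=True,
    last_login=None,
    date_joined=datetime(2019, 6, 8, tzinfo=timezone.utc),
)


def serialize_user(user):
    """Trimmed-down version of the plain serialization function."""
    return {
        'username': user.username,
        'is_active': user.is_active,
        'last_login': user.last_login.isoformat() if user.last_login is not None else None,
        'date_joined': user.date_joined.isoformat(),
    }


# Take the best of 5 runs of 5,000 calls each to reduce noise from the machine.
best = min(timeit.repeat(lambda: serialize_user(user), number=5000, repeat=5))
print(f'{best:.4f} seconds per 5,000 calls, {best / 5000 * 1e6:.2f} microseconds per call')
```

The timings will differ from the profiler's output, but the relative ordering between serialization approaches should be comparable.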


ModelSerializer

Django Rest Framework (DRF) comes with a few utility classes, namely the ModelSerializer.

A ModelSerializer for the built-in User model might look like this:

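from django.contrib.auth.models import User
from rest_framework import serializers

class UserModelSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        # Same fields as the other serializers in this benchmark.
        fields = [
            'id', 'last_login', 'is_superuser', 'username', 'first_name',
            'last_name', 'email', 'is_staff', 'is_active', 'date_joined',
        ]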

Running the same benchmark as before:

>>> cProfile.run('for i in range(5000): UserModelSerializer(u).data', sort='tottime')
18845053 function calls (18735053 primitive calls) in 12.818 seconds

Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
85000 2.162 0.000 4.706 0.000
7955000 1.565 0.000 1.565 0.000 {built-in method builtins.hasattr}
1080000 0.701 0.000 0.701 0.000
50000 0.594 0.000 4.886 0.000
1140000 0.563 0.000 0.581 0.000 {built-in method builtins.getattr}
55000 0.489 0.000 0.634 0.000
1240000 0.389 0.000 0.389 0.000 {built-in method builtins.setattr}
5000 0.342 0.000 11.773 0.002
20000 0.338 0.000 0.446 0.000 {built-in method builtins.__build_class__}
210000 0.333 0.000 0.792 0.000
75000 0.312 0.000 2.285 0.000
20000 0.248 0.000 4.817 0.000
1300000 0.230 0.000 0.264 0.000 {built-in method builtins.isinstance}
50000 0.224 0.000 5.311 0.000

It took DRF 12.8 seconds to serialize a user 5,000 times, or 2.56ms to serialize just a single user. That is 377 times slower than the plain function.

We can see that a significant amount of time is spent in django.utils.functional. ModelSerializer uses the lazy function from this module to evaluate validations. Django also uses lazy for verbose names and so on, which DRF evaluates as well. This function seems to be weighing down the serializer.

Read Only ModelSerializer

Field validations are added by ModelSerializer only for writable fields. To measure the effect of validation, we create a ModelSerializer and mark all fields as read only:

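from django.contrib.auth.models import User
from rest_framework import serializers

class UserReadOnlyModelSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        # Same fields as the other serializers in this benchmark.
        fields = [
            'id', 'last_login', 'is_superuser', 'username', 'first_name',
            'last_name', 'email', 'is_staff', 'is_active', 'date_joined',
        ]
        read_only_fields = fields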

When all fields are read only, the serializer cannot be used to create new instances.

Let’s run our benchmark with the read only serializer:

>>> cProfile.run('for i in range(5000): UserReadOnlyModelSerializer(u).data', sort='tottime')
14540060 function calls (14450060 primitive calls) in 7.407 seconds

Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
6090000 0.809 0.000 0.809 0.000 {built-in method builtins.hasattr}
65000 0.725 0.000 1.516 0.000
50000 0.561 0.000 4.182 0.000
55000 0.435 0.000 0.558 0.000
840000 0.330 0.000 0.346 0.000 {built-in method builtins.getattr}
210000 0.294 0.000 0.688 0.000
5000 0.282 0.000 6.510 0.001
75000 0.220 0.000 1.989 0.000
1305000 0.200 0.000 0.228 0.000 {built-in method builtins.isinstance}
50000 0.182 0.000 4.531 0.000
50000 0.145 0.000 0.259 0.000
55000 0.133 0.000 0.696 0.000
50000 0.127 0.000 2.377 0.000
210000 0.119 0.000 0.145 0.000

Only 7.4 seconds. A 40% improvement compared to the writable ModelSerializer.

In the benchmark’s output we can see that a lot of time is still being spent in code related to the inner workings of the ModelSerializer. In the serialization and initialization process, the ModelSerializer uses a lot of metadata to construct and validate the serializer fields, and it comes at a cost.

“Regular” Serializer

In the next benchmark, we wanted to measure exactly how much the ModelSerializer "costs" us. Let's create a "regular" Serializer for the User model:

from rest_framework import serializers

class UserSerializer(serializers.Serializer):
id = serializers.IntegerField()
last_login = serializers.DateTimeField()
is_superuser = serializers.BooleanField()
username = serializers.CharField()
first_name = serializers.CharField()
last_name = serializers.CharField()
email = serializers.EmailField()
is_staff = serializers.BooleanField()
is_active = serializers.BooleanField()
date_joined = serializers.DateTimeField()

Running the same benchmark using the “regular” serializer:

>>> cProfile.run('for i in range(5000): UserSerializer(u).data', sort='tottime')
3110007 function calls (3010007 primitive calls) in 2.101 seconds

Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
55000 0.329 0.000 0.430 0.000
105000/5000 0.188 0.000 1.247 0.000
50000 0.145 0.000 0.863 0.000
20000 0.093 0.000 0.320 0.000
310000 0.092 0.000 0.092 0.000 {built-in method builtins.getattr}
50000 0.087 0.000 0.125 0.000
5000 0.072 0.000 1.934 0.000
55000 0.055 0.000 0.066 0.000
5000 0.053 0.000 1.204 0.000
235000 0.052 0.000 0.052 0.000 {method 'update' of 'dict' objects}
50000 0.048 0.000 0.097 0.000
260000 0.048 0.000 0.075 0.000 {built-in method builtins.isinstance}
25000 0.047 0.000 0.051 0.000
55000 0.042 0.000 0.057 0.000
50000 0.041 0.000 0.197 0.000
5000 0.037 0.000 1.459 0.000

Here is the leap we were waiting for!

The “regular” serializer took only 2.1 seconds. That’s about 72% faster than the read only ModelSerializer, and a whopping 84% faster than the writable ModelSerializer.

At this point it becomes obvious that the ModelSerializer does not come cheap!

Read Only “regular” Serializer

In the writable ModelSerializer a lot of time was spent on validations. We were able to make it faster by marking all fields as read only. The "regular" serializer does not define any validation, so marking fields as read only is not expected to be faster. Let's make sure:

from rest_framework import serializers

class UserReadOnlySerializer(serializers.Serializer):
id = serializers.IntegerField(read_only=True)
last_login = serializers.DateTimeField(read_only=True)
is_superuser = serializers.BooleanField(read_only=True)
username = serializers.CharField(read_only=True)
first_name = serializers.CharField(read_only=True)
last_name = serializers.CharField(read_only=True)
email = serializers.EmailField(read_only=True)
is_staff = serializers.BooleanField(read_only=True)
is_active = serializers.BooleanField(read_only=True)
date_joined = serializers.DateTimeField(read_only=True)

And running the benchmark for a user instance:

>>> cProfile.run('for i in range(5000): UserReadOnlySerializer(u).data', sort='tottime')
3360009 function calls (3210009 primitive calls) in 2.254 seconds

Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
55000 0.329 0.000 0.433 0.000
155000/5000 0.241 0.000 1.385 0.000
50000 0.161 0.000 1.000 0.000
310000 0.095 0.000 0.095 0.000 {built-in method builtins.getattr}
20000 0.088 0.000 0.319 0.000
50000 0.087 0.000 0.129 0.000
5000 0.073 0.000 2.086 0.000
55000 0.055 0.000 0.067 0.000
5000 0.054 0.000 1.342 0.000
235000 0.053 0.000 0.053 0.000 {method 'update' of 'dict' objects}
25000 0.052 0.000 0.057 0.000
260000 0.049 0.000 0.076 0.000 {built-in method builtins.isinstance}

As expected, marking the fields as read only didn’t make a significant difference compared to the “regular” serializer. This reaffirms that, in the ModelSerializer, the time was spent on validations derived from the model’s field definitions.

Results Summary

Here is a summary of the results so far:

Serializer                  | Seconds
UserModelSerializer         | 12.818
UserReadOnlyModelSerializer | 7.407
UserSerializer              | 2.101
UserReadOnlySerializer      | 2.254
serialize_user              | 0.034
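To put these totals in perspective, the per-call time and the slowdown relative to the plain function can be derived directly from the table. The short script below is only arithmetic over the numbers above:

```python
# Timings in seconds for 5,000 serializations, copied from the summary table.
timings = {
    'UserModelSerializer': 12.818,
    'UserReadOnlyModelSerializer': 7.407,
    'UserSerializer': 2.101,
    'UserReadOnlySerializer': 2.254,
    'serialize_user': 0.034,
}
ITERATIONS = 5000
baseline = timings['serialize_user']

# Milliseconds spent on a single serialization.
per_call_ms = {name: seconds / ITERATIONS * 1000 for name, seconds in timings.items()}
# How many times slower each approach is than the plain function.
slowdown = {name: seconds / baseline for name, seconds in timings.items()}

for name in timings:
    print(f'{name:<28} {per_call_ms[name]:7.4f} ms per call  {slowdown[name]:6.1f}x')
```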

Prior Work

A lot of articles were written about serialization performance in Python. As expected, most articles focus on improving DB access using techniques like select_related and prefetch_related. While both are valid ways to improve the overall response time of an API request, they don't address the serialization itself. I suspect this is because nobody expects serialization to be slow.

Other articles that do focus solely on serialization usually avoid fixing DRF, and instead motivate new serialization frameworks such as marshmallow and serpy. There is even a site devoted to comparing serialization formats in Python. To save you a click, DRF always comes last.

In late 2013, Tom Christie, the creator of Django Rest Framework, wrote an article discussing some of DRF’s drawbacks. In his benchmarks, serialization accounted for 12% of the total time spent on processing a single request. In the summary, Tom recommends not always resorting to serialization:

4. You don’t always need to use serializers.

For performance critical views you might consider dropping the serializers entirely and simply use .values() in your database queries.

As we’ll see in a bit, this is solid advice.

Why is This Happening?

In the first benchmark using ModelSerializer we saw a significant amount of time being spent in django.utils.functional, and more specifically in the function lazy.

Fixing Django’s lazy

The function lazy is used internally by Django for many things such as verbose names, templates etc. The source describes lazy as follows:

Encapsulate a function call and act as a proxy for methods that are called on the result of that function. The function is not evaluated until one of the methods on the result is called.

The lazy function does its magic by creating a proxy of the result class. To create the proxy, lazy iterates over all attributes and functions of the result class (and its super-classes), and creates a wrapper class which evaluates the function only when its result is actually used.

For large result classes, it can take some time to create the proxy. So, to speed things up, lazy caches the proxy. But as it turns out, a small oversight in the code completely broke the cache mechanism, making the lazy function very very slow.
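To make the cost of building a proxy more concrete, here is a simplified sketch of the idea in plain Python. It is not Django's implementation: `make_lazy`, `build_proxy_class` and the `cache_proxy` flag are hypothetical names used only for illustration. The sketch wraps every public method of the result class, and either rebuilds the proxy class on every call or builds it once and reuses it:

```python
import timeit


def _delegate(name):
    """Build a method that evaluates the wrapped call, then forwards to the result."""
    def method(self, *args, **kwargs):
        return getattr(self._evaluate(), name)(*args, **kwargs)
    return method


def build_proxy_class(result_class):
    """Walk result_class and its bases, wrapping every public callable."""
    namespace = {}
    for klass in result_class.mro():
        for name, attr in vars(klass).items():
            if callable(attr) and not name.startswith('_') and name not in namespace:
                namespace[name] = _delegate(name)

    def __init__(self, func, args, kwargs):
        self._func, self._args, self._kwargs = func, args, kwargs

    def _evaluate(self):
        # The wrapped function only runs when a proxied method is called.
        return self._func(*self._args, **self._kwargs)

    namespace.update(__init__=__init__, _evaluate=_evaluate)
    return type('Proxy', (), namespace)


def make_lazy(func, result_class, cache_proxy):
    """Return a lazy version of func; cache_proxy toggles reuse of the proxy class."""
    cached = build_proxy_class(result_class) if cache_proxy else None

    def wrapper(*args, **kwargs):
        proxy_class = cached if cached is not None else build_proxy_class(result_class)
        return proxy_class(func, args, kwargs)

    return wrapper


uncached_upper = make_lazy(str.upper, str, cache_proxy=False)
cached_upper = make_lazy(str.upper, str, cache_proxy=True)

# Both variants behave the same: str.upper runs only when .title() is called.
print(uncached_upper('hello').title(), cached_upper('hello').title())

for label, fn in [('uncached', uncached_upper), ('cached', cached_upper)]:
    seconds = timeit.timeit(lambda: fn('hello').title(), number=2000)
    print(f'{label}: {seconds:.4f} seconds for 2,000 calls')
```

In this sketch the uncached variant pays for the walk over all of str's methods on every call, which is the kind of repeated work a proxy cache is meant to avoid.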

To get a sense of just how slow lazy is without proper caching, let's use a simple function which returns a str (the result class), such as upper. We choose str because it has a lot of methods, so it should take a while to set up a proxy for it.

To establish a baseline, we benchmark using str.upper directly, without lazy:

>>> import cProfile
>>> from django.utils.functional import lazy
>>> upper = str.upper
>>> cProfile.run('''for i in range(50000): upper('hello') + ""''', sort='cumtime')

50003 function calls in 0.034 seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.034 0.034 {built-in method builtins.exec}
1 0.024 0.024 0.034 0.034 <string>:1(<module>)
50000 0.011 0.000 0.011 0.000 {method 'upper' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

Now for the scary part, the exact same function but this time wrapped with lazy:

>>> lazy_upper = lazy(upper, str)
>>> cProfile.run('''for i in range(50000): lazy_upper('hello') + ""''', sort='cumtime')

4900111 function calls in 1.139 seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.139 1.139 {built-in method builtins.exec}
1 0.037 0.037 1.139 1.139 <string>:1(<module>)
50000 0.018 0.000 1.071 0.000
50000 0.028 0.000 1.053 0.000
50000 0.500 0.000 1.025 0.000
4600000 0.519 0.000 0.519 0.000 {built-in method builtins.hasattr}
50000 0.024 0.000 0.031 0.000
50000 0.006 0.000 0.006 0.000 {method 'mro' of 'type' objects}
50000 0.006 0.000 0.006 0.000 {built-in method builtins.getattr}
54 0.000 0.000 0.000 0.000 {built-in method builtins.setattr}
54 0.000 0.000 0.000 0.000
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

No mistake! Using lazy it took 1.139 seconds to turn 50,000 strings uppercase. The same exact function used directly took only 0.034 seconds. That is 33.5 times faster.

This was obviously an oversight. The developers were clearly aware of the importance of caching the proxy. A PR was issued, and merged shortly after (diff here). Once released, this patch is supposed to make Django’s overall performance a bit better.

Fixing Django Rest Framework

DRF uses lazy for validations and field verbose names. When all of these lazy evaluations are put together, you get a noticeable slowdown.

With a minor change on the DRF side, the fix to lazy in Django would have solved this issue for DRF as well. Nonetheless, a separate fix was made to DRF to replace lazy with something more efficient.

To see the effect of the changes, install the latest of both Django and DRF:

(venv) $ pip install git+
(venv) $ pip install git+

After applying both patches, we ran the same benchmark again. These are the results side by side:

Serializer                  | Before | After | % Change
UserModelSerializer         | 12.818 | 5.674 | -55%
UserReadOnlyModelSerializer | 7.407  | 5.323 | -28%
UserSerializer              | 2.101  | 2.146 | +2%
UserReadOnlySerializer      | 2.254  | 2.125 | -5%
serialize_user              | 0.034  | 0.034 | 0%

To sum up the results of the changes to both Django and DRF:

  • Serialization time for writable ModelSerializer was cut by half.
  • Serialization time for a read only ModelSerializer was cut by almost a third.
  • As expected, there is no noticeable difference in the other serialization methods.


Our takeaways from this experiment were:

Upgrade DRF and Django once these patches make their way into a formal release.

Both PRs were merged but not yet released.

In performance critical endpoints, use a “regular” serializer, or none at all.

We had several places where clients were fetching large amounts of data using an API. The API was used only for reading data from the server, so we decided to not use a Serializer at all, and inline the serialization instead.

Serializer fields that are not used for writing or validation should be read only.

As we’ve seen in the benchmarks, the way validations are implemented makes them expensive. Marking fields as read only eliminates this unnecessary cost.
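As an illustration of the second takeaway, inlining serialization for a read only endpoint might look roughly like the sketch below. It assumes the rows are already plain dicts, shaped like what a .values() query would produce. The `rows` data and the `encode_extra` helper are hypothetical:

```python
import json
from datetime import datetime, timezone

# Hypothetical rows, shaped like the dicts a .values() query would yield.
rows = [
    {'id': 1, 'username': 'hakib', 'date_joined': datetime(2019, 6, 8, tzinfo=timezone.utc)},
    {'id': 2, 'username': 'guest', 'date_joined': datetime(2019, 6, 9, tzinfo=timezone.utc)},
]


def encode_extra(value):
    """Fallback for types the json module does not handle natively."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f'Cannot serialize {type(value).__name__}')


# No serializer classes involved: the dicts go straight to JSON.
payload = json.dumps(rows, default=encode_extra)
print(payload)
```

The trade-off is that field names and formats are now maintained by hand, so this probably only pays off for endpoints where serialization time is a measured bottleneck.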

Bonus: Forcing Good Habits

To make sure developers don’t forget to set read only fields, we added a Django check to make sure all ModelSerializers set read_only_fields:

# common/

import django.core.checks

# Register the check so that the check command runs it.
@django.core.checks.register
def check_serializers(app_configs, **kwargs):
    import inspect
    from rest_framework.serializers import ModelSerializer
    import conf.urls  # noqa, force import of all serializers.

    for serializer in ModelSerializer.__subclasses__():

        # Skip third-party apps.
        path = inspect.getfile(serializer)
        if path.find('site-packages') > -1:
            continue

        # Nothing to report when read_only_fields is already set.
        if hasattr(serializer.Meta, 'read_only_fields'):
            continue

        yield django.core.checks.Warning(
            'ModelSerializer must define read_only_fields.',
            hint='Set read_only_fields in ModelSerializer.Meta',
            obj=serializer,
            id='H300',
        )

With this check in place, when a developer adds a serializer she must also set read_only_fields. If the serializer is writable, read_only_fields can be set to an empty tuple. If a developer forgets to set read_only_fields, she gets the following warning:

$ python manage.py check
System check identified some issues:

<class 'serializers.UserSerializer'>: (H300) ModelSerializer must define read_only_fields.
HINT: Set read_only_fields in ModelSerializer.Meta

System check identified 1 issue (4 silenced).
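One detail worth keeping in mind with this kind of check: `__subclasses__()` returns only the immediate subclasses of a class, so a serializer that inherits from another serializer is not visited by a single call. Whether that matters depends on how serializers are structured in a project. A recursive walk can be sketched as follows, where `BaseSerializer` is a hypothetical stand-in for ModelSerializer:

```python
class BaseSerializer:
    """Hypothetical stand-in for ModelSerializer."""


class UserSerializer(BaseSerializer):
    pass


class AdminUserSerializer(UserSerializer):
    pass


def iter_subclasses(cls):
    """Yield every direct and indirect subclass of cls exactly once."""
    seen = set()
    stack = list(cls.__subclasses__())
    while stack:
        sub = stack.pop()
        if sub in seen:
            continue
        seen.add(sub)
        yield sub
        stack.extend(sub.__subclasses__())


direct = BaseSerializer.__subclasses__()
everything = list(iter_subclasses(BaseSerializer))

print(direct)      # only UserSerializer
print(everything)  # UserSerializer and AdminUserSerializer
```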

We use Django checks a lot to make sure nothing falls through the cracks. You can find many other useful checks in this article about how we use the Django system check framework.





Full Stack Developer, Team Leader, Independent. More from me at

Haki Benita
