Scaling Django Admin Date Hierarchy
How we got to have the cake and eat it too!
For a better reading experience, check out this article on my website.
TL;DR
We published a package called django-admin-lightweight-date-hierarchy which overrides Django Admin date_hierarchy template tag and eliminates all database queries from it. For the implementation details and the shocking performance analysis read on.
If you are not familiar with Django Admin date_hierarchy you should — it’s great. Set the date_hierarchy
attribute of a ModelAdmin
to a DateField
:
from django.contrib import admin
from .models import Sale@admin.register(Sale)
class SaleAdmin(admin.ModelAdmin):
date_hierarchy = 'created'
And you get a nice drill-down menu at the top of the admin change list:
When selecting a year, Django will filter the data to the selected year, and present a list of months for which there is data in that year.
When selecting a month, Django will apply the filter and present the list of days for which there is data in that month.
date_hierarchy behind the scenes
To produce a list of dates for which there is data, Django has to perform a query. For example, to produce the list of years, the first level in the hierarchy, Django will execute the following query:
SELECT DISTINCT
django_datetime_trunc('year', "sales_sale"."created", 'UTC') AS "datetimefield"
FROM
"sales_sale"
WHERE
"sales_sale"."created" IS NOT NULL
ORDER BY
"datetimefield" ASC;
When a year is selected, Django will execute a query to produce the next level in the hierarchy — months:
SELECT DISTINCT
django_datetime_trunc('month', "sales_sale"."created", 'UTC') AS "datetimefield"
FROM
"sales_sale"
WHERE
"sales_sale"."created" BETWEEN '2017–01–01 00:00:00' AND '2017–12–31 23:59:59.999999'
AND "sales_sale"."created" BETWEEN '2017–01–01 00:00:00' AND '2017–12–31 23:59:59.999999'
AND "sales_sale"."created" IS NOT NULL
ORDER BY
"datetimefield" ASC;
How expensive is it?
The date hierarchy is a great feature but it comes at a price.
To illustrate the problem we created a simple Sale
model with two fields — id and created, and populated it with ~1,000,000 rows.
Using django-admin-toolbar we can see how long it takes Django to produce the date hierarchy:
WOW! The page took a boggling ~8s to load, out of which 7.6 seconds are spent producing the date hierarchy!
Just for comparison, the exact same page without date_hierarchy:
17ms. That’s about 99.8% better.
A possible solution
Looking at the chart above it’s clear that we have a problem. The date hierarchy is weighing our page making it nearly unusable.
Django need to execute a query because it only wants to show dates for which there is data. In our case, we have sales every day. Once we make this assumption we no longer have to query the data to produce a list of dates — we can just show them all!
The idea we came up with is:
- if the user selected a month we show all of the days in the month.
- If the user selected a year we show all of the months in the year.
- If the user selected nothing we need to make an additional assumption — in our case we decided to show +-3 years from the current year. This is a compromise we were willing to make for the sake of performance and usability.
Now that we have the general idea let’s dive into the implementation.
Implementation
Looking at the output above we can see that the queries originate from a template tag called {% date_hierarchy cl %}
. The argument cl
is the ChangeList
created by the ModelAdmin
.
The implementation for the date_hierarchy
template tag can be found at django/contrib/admin/templatetags/admin_list.py. The interesting part is where the queries are executed.
Let’s take a look at how Django produces a list of months for a given year:
…
year_field = ‘%s__year’ % field_name
…
year_lookup = cl.params.get(year_field)def link(filters):
return cl.get_query_string(filters, [field_generic])…elif year_lookup:
months = cl.queryset.filter(**{year_field: year_lookup})
months = getattr(months, ‘dates’)(field_name, ‘month’)
return {
‘show’: True,
‘back’: {
‘link’: link({}),
‘title’: _(‘All dates’),
},
‘choices’: [{
‘link’: link({
year_field: year_lookup,
month_field: month.month,
}),
‘title’: capfirst(formats.date_format(month, ‘YEAR_MONTH_FORMAT’))
} for month in months]
}
…
Our date_hierarchy
is set to the created
. If we drill-down on year 2017 we get the following URL:
http://localhost:8000/admin/sales/sale/?created__year=2017
In the template tag year_field
is created__year
and year_lookup
is 2017. The generated query applies the filter on created
and fetches a list of months ther is data for in year 2017 to the variable months
.
Let’s replace this bit and populate months
with a list of all the months in the year instead:
# months = cl.queryset.filter(**{year_field: year_lookup})
# months = getattr(months, ‘dates’)(field_name, ‘month’)# All months of selected year.
months = (
datetime.date(int(year_lookup), month, 1)
for month in range(1, 13)
)
After the change:
Awesome! No queries.
And the list view:
Our little change worked! All the months are displayed and no queries are executed by the date hierarchy.
Let’s do the same for days:
# days = cl.queryset.filter(**{year_field: year_lookup, month_field: month_lookup})
# days = getattr(days, dates_or_datetimes)(field_name, ‘day’)# All days of month.days_in_month = calendar.monthrange(int(year_lookup), int(month_lookup))[1]
first_day_of_month = datetime.date(int(year_lookup), int(month_lookup), 1)days = (
first_day_of_month + datetime.timedelta(days=i)
for i in range(days_in_month)
)
We use the calendar
module to find out how many days there are in a given month.
Let’s handle the years. Remember, we fetch +-3 years from today:
# years = getattr(cl.queryset, dates_or_datetimes)(field_name, ‘year’)# Three years in each direction.
today = get_today()
years = (
datetime.date(y, 1, 1)
for y in range(today.year — 3, today.year + 3 + 1)
)
Integration
Up until now we fiddled with Django’s source but we can’t really do that. First we override the template tag to make Django use our implementation.
Let’s copy the function and register a template with the same name:
# app/templatetags/admin_list.pyfrom django.contrib.admin.templatetags.admin_list import register@register.inclusion_tag(‘admin/date_hierarchy.html’)
def date_hierarchy(cl):
…(original implementation)…
Register the library in our app:
# settings.pyTEMPLATES = [{
…
‘OPTIONS’: {
…
‘libraries’: {
…
‘admin’: ‘app.templatetags.admin_list’,
},
},
}]
Now Django uses our template tag instead of his.
The problem with producing the date hierarchy is really an issue only for very large tables. We don’t want to disable the existing behavior — we want to enable it only for very large tables.
Let’s add an attribute on the ModelAdmin
to turn the default drill-down behavior on and off:
@admin.register(Sale)
class SaleAdmin(admin.ModelAdmin):
date_hierarchy = ‘created’
date_hierarchy_drilldown = False
When date_hierarchy_drilldown is set to False our new template tag will not execute queries. Otherwise, we preserve the original behavior.
To implement this we add the following at the start of our implementation of the date_hierarchy template tag:
date_hierarchy_drilldown = getattr(
cl.model_admin,
‘date_hierarchy_drilldown’,
True
)
Now we can re-enable the default behavior when date_hierarchy_drilldown=True
.
For example, producing a list of months:
…if date_hierarchy_drilldown:
months = cl.queryset.filter(**{year_field: year_lookup})
months = getattr(months, dates_or_datetimes)(field_name, ‘month’)else:
# All months of selected year.
months = (
datetime.date(int(year_lookup), month, 1)
for month in range(1, 13)
)…
This is it! We’ve successfully scaled date hierarchy to handle millions of rows with a little compromise on UX.
Package
We found this approach useful in several projects so we decided to publish it as a package:
Using it is as simple as
Install it
pip install django-admin-lightweight-date-hierarchy
Add it to your INSTALLED_APPS:
INSTALLED_APPS = (
…
'django_admin_lightweight_date_hierarchy',
…
)
Set date_hierarchy_drilldown to False on any ModelAdmin with date_hierarchy to prevent the default drill-down behavior:
@admin.register(MyModel)
class MyModelAdmin(admin.ModelAdmin):
date_hierarchy = 'created'
date_hierarchy_drilldown = False
To squeeze some more juice out of Django Admin check out this post as well:
Cheers!