Basic Performance Optimization in Django by Ryley Sill Medium
Disclaimer
The sections below are high-level explanations meant to expose you to ...

Getting Started - example data model
from django.db import models


class Library(models.Model):
    name = models.CharField(max_length=200, default='')
    address = models.CharField(max_length=200, default='')


class Author(models.Model):
    name = models.CharField(max_length=200, default='')


class Book(models.Model):
    library = models.ForeignKey(
        Library,
        on_delete=models.CASCADE,
        related_name='books',
    )
    author = models.ForeignKey(
        Author,
        on_delete=models.CASCADE,
        related_name='books'
    )
    title = models.CharField(max_length=200, default='')
    address = models.CharField(max_length=200, default='')

    def get_page_count(self):
        return self.pages.count()


class Page(models.Model):
    book = models.ForeignKey(
        Book,
        on_delete=models.CASCADE,
        related_name='pages',
    )
    text = models.TextField(null=True, blank=True)
    page_number = models.IntegerField()
Use line_profiler
If we don’t know why our code is slow, it’s going to be difficult to figure out how to optimize it. line_profiler is a cool Python module that tells us how much time it takes to execute each line in a function. Before you get started, install the package with pip install line_profiler.
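def get_books_by_library_id():
    libraries = Library.objects.all()
    result = {}

    # One extra query per library to fetch its books
    for library in libraries:
        result[library.id] = list(library.books.all())

    return result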

def books_by_library_id_view(request):
    books_by_library_id = get_books_by_library_id()
    ...
    return HttpResponse(response)

def books_by_library_id_view(request):
    from IPython import embed; embed()
    books_by_library_id = get_books_by_library_id()
    ...
    return HttpResponse(response)

your_function_name
Turn on SQL logging
When we’re really digging into a function to see where the bottlenecks are, it can be useful to turn on SQL logging.
# settings.py
LOGGING = {
    'version': 1,
    'filters': {
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        }
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        }
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        }
    }
}
Avoid queries in loops
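def get_books_by_library_id():
    libraries = Library.objects.all()
    result = {}

    # One extra query per library to fetch its books
    for library in libraries:
        result[library.id] = list(library.books.all())

    return result
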
Back to our terminally ill function get_books_by_library_id. The problem with it is that it runs a separate SQL query for every library, inside the loop.

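from collections import defaultdict

def get_books_by_library_id_one_query():
    books = Book.objects.all()
    result = defaultdict(list)

    # A single query; the grouping happens in memory
    for book in books:
        result[book.library_id].append(book)

    return result
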
Boom. Now you’re only making one SQL
query regardless of how many libraries
exist in your database.
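
To see the difference in query counts without a Django project, here is a minimal sketch that uses Python’s built-in sqlite3 module instead of the ORM. The table layout, the sample rows, and the helper names (books_by_library_loop, books_by_library_one_query) are assumptions made for illustration; set_trace_callback simply records every SQL statement the connection executes.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, loosely mirroring Library and Book.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE library (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, library_id INTEGER, title TEXT);
""")
conn.executemany(
    "INSERT INTO library (id, name) VALUES (?, ?)",
    [(i, f"Library {i}") for i in range(1, 4)],
)
conn.executemany(
    "INSERT INTO book (library_id, title) VALUES (?, ?)",
    [(1 + i % 3, f"Book {i}") for i in range(9)],
)
conn.commit()

# Record every statement executed from here on.
statements = []
conn.set_trace_callback(statements.append)


def books_by_library_loop():
    """One query for the libraries, then one more query per library."""
    result = {}
    libraries = conn.execute("SELECT id FROM library ORDER BY id").fetchall()
    for (library_id,) in libraries:
        rows = conn.execute(
            "SELECT title FROM book WHERE library_id = ? ORDER BY id",
            (library_id,),
        ).fetchall()
        result[library_id] = [title for (title,) in rows]
    return result


def books_by_library_one_query():
    """A single query; the grouping happens in Python."""
    result = defaultdict(list)
    rows = conn.execute("SELECT library_id, title FROM book ORDER BY id").fetchall()
    for library_id, title in rows:
        result[library_id].append(title)
    return result


statements.clear()
looped = books_by_library_loop()
loop_query_count = len(statements)

statements.clear()
single = books_by_library_one_query()
single_query_count = len(statements)

print(loop_query_count, single_query_count)
```

With three libraries the loop version issues four statements and the single-query version issues one; the gap grows with every library you add.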
In [12]: timeit(get_books_by_library_id, number=10)
Out[12]: 6.598360636999132

In [13]: timeit(get_books_by_library_id_one_query, number=10)
Out[13]: 0.677092163998168

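def get_books_by_author():
    books = Book.objects.all()
    result = defaultdict(list)

    for book in books:
        # Accessing book.author triggers a separate query for each book
        title_and_author = '{} by {}'.format(
            book.title,
            book.author.name
        )
        result[book.library_id].append(title_and_author)

    return result
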
Here’s the problem: each time you access book.author, you’re making a query equivalent to Author.objects.get(id=book.author_id).

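def get_books_by_author_select_related():
    books = Book.objects.all().select_related('author')
    result = defaultdict(list)

    for book in books:
        # book.author was loaded by the same query, so no extra lookup here
        title_and_author = '{} by {}'.format(
            book.title,
            book.author.name
        )
        result[book.library_id].append(title_and_author)

    return result
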
In [12]: timeit(get_books_by_author, number=10)
Out[12]: 41.363460485998075

In [13]: timeit(get_books_by_author_select_related, number=10)
Out[13]: 1.2787263889913447

prefetch_related()
prefetch_related is similar to select_related in that it prevents unnecessary SQL queries. Instead of fetching the primary and related objects in one go, prefetch_related makes separate queries for each relationship and “joins” the results together in Python. The downside of this approach is that it requires multiple round trips to the database.

Author.objects.filter(name__startswith='R').prefetch_related('books')

How it works: first a request is fired off which runs the primary query Author.objects.filter(name__startswith='R'). Then something equivalent to Book.objects.filter(author_id__in=PKS_OF_AUTHORS_FROM_FIRST_REQUEST) is executed, and both of the responses are merged together into an Author queryset that has each author’s books cached in memory. So you end up with a similar result as with select_related, but you arrive there through different means.
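
The two-step process described above can be sketched with plain sqlite3, outside of Django. This is only an illustration of the idea, not what the ORM literally runs: the tables, the sample rows, and the function name authors_with_books are all made up for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, loosely mirroring Author and Book.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author (id, name) VALUES (1, 'Rita'), (2, 'Sam'), (3, 'Ravi');
    INSERT INTO book (author_id, title) VALUES
        (1, 'Alpha'), (1, 'Beta'), (2, 'Gamma'), (3, 'Delta');
""")


def authors_with_books(prefix):
    # Query 1: the primary query for the authors themselves.
    authors = conn.execute(
        "SELECT id, name FROM author WHERE name LIKE ? ORDER BY id",
        (prefix + "%",),
    ).fetchall()
    author_ids = [author_id for author_id, _ in authors]
    if not author_ids:
        return []

    # Query 2: every book whose author_id is in the set from query 1.
    placeholders = ", ".join("?" for _ in author_ids)
    books = conn.execute(
        f"SELECT author_id, title FROM book "
        f"WHERE author_id IN ({placeholders}) ORDER BY id",
        author_ids,
    ).fetchall()

    # The "join" happens in Python: group the books by author id.
    books_by_author = defaultdict(list)
    for author_id, title in books:
        books_by_author[author_id].append(title)

    return [
        {"name": name, "books": books_by_author.get(author_id, [])}
        for author_id, name in authors
    ]


result = authors_with_books("R")
print(result)
```

However many authors match, this takes two round trips: one for the authors and one for all of their books.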
values() and values_list()
The time it takes to serialize SQL responses into Python scales with both the number of rows and columns being returned. In the function below, we’re serializing all of the fields defined on the book and author even though we only need the author’s name and the book’s library_id and title. We’re also initializing a Django model instance for no reason, since we’re not doing anything special with it (like calling model methods).

def get_books_by_author_select_related():
    books = Book.objects.all().select_related('author')
    result = defaultdict(list)

    for book in books:
        title_and_author = '{} by {}'.format(
            book.title,
            book.author.name
        )
        result[book.library_id].append(title_and_author)

    return result

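# Function name assumed by analogy with the values_list() variant
def get_books_by_author_select_related_values():
    books = (
        Book.objects
        .all()
        .select_related('author')
        .values('title', 'library_id', 'author__name')
    )
    result = defaultdict(list)

    for book in books.iterator():
        title_and_author = '{} by {}'.format(
            book['title'],
            book['author__name']
        )
        result[book['library_id']].append(title_and_author)

    return result
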
def get_books_by_author_select_related_values_list():
    books = (
        Book.objects
        .all()
        .select_related('author')
        .values_list('title', 'library_id', 'author__name')
    )
    result = defaultdict(list)

    for book in books.iterator():
        title_and_author = '{} by {}'.format(
            book[0],
            book[2]
        )
        result[book[1]].append(title_and_author)

    return result

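# returns a list of model instances
def get_book_instances():
    return list(
        Book.objects
        .all()
        .select_related('author')
    )

# returns a list of dictionaries with every field of each book
def get_book_dictionaries():
    return list(
        Book.objects
        .all()
        .select_related('author')
        .values()
    )

# returns a list of dictionaries with the title of each book
def get_book_dictionaries_title_only():
    return list(
        Book.objects
        .all()
        .select_related('author')
        .values('title')
    )
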
In [64]: timeit(get_book_instances, number=100)
Out[64]: 12.904168864974054

In [65]: timeit(get_book_dictionaries, number=100)
Out[65]: 2.049193776998436

In [66]: timeit(get_book_dictionaries_title_only, number=100)
Out[66]: 1.4734381759772077
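
If the shapes returned by values() and values_list() are hard to picture, the following sketch shows the same idea at the SQL level with the standard sqlite3 module: select only the columns you need and keep the rows as lightweight dictionaries or tuples. The schema, the sample row, and the variable names are assumptions for illustration, and the author__name alias is only there to echo the Django lookup syntax.

```python
import sqlite3

# Hypothetical schema and data, loosely mirroring Book and Author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        library_id INTEGER,
        author_id INTEGER,
        title TEXT,
        address TEXT
    );
    INSERT INTO author (id, name) VALUES (1, 'Rita');
    INSERT INTO book (library_id, author_id, title, address)
    VALUES (7, 1, 'Alpha', 'Shelf 3');
""")

# Only the three columns we need are selected; the rest never leave the database.
QUERY = """
    SELECT book.title AS title,
           book.library_id AS library_id,
           author.name AS author__name
    FROM book
    JOIN author ON author.id = book.author_id
    ORDER BY book.id
"""

# Plain tuples, comparable to values_list(): fields are read by position.
as_tuples = conn.execute(QUERY).fetchall()

# Plain dictionaries, comparable to values(): fields are read by name.
cursor = conn.execute(QUERY)
columns = [description[0] for description in cursor.description]
as_dicts = [dict(zip(columns, row)) for row in cursor.fetchall()]

print(as_tuples)
print(as_dicts)
```

Tuples are the leanest option but force you to remember positions; dictionaries cost slightly more and keep the code readable.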
bulk_create()
This one is pretty simple. If we’re going to
create more than one object at a time, use
bulk_create instead of creating the
objects in a loop. As the name suggests,
bulk_create will insert a list of objects
into the database using one query,
regardless of how many objects we’re
inserting.
from there.
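
As a rough illustration of why a bulk insert helps, here is a sketch using sqlite3 directly rather than Django’s bulk_create. It compares one INSERT statement per row with a single multi-row INSERT. The page table, the row data, and the helper names are assumptions for the example, and the ORM may batch large inserts differently depending on the database backend.

```python
import sqlite3

# isolation_level=None keeps the connection in autocommit mode, so no
# implicit BEGIN statements are mixed into the trace.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE page (id INTEGER PRIMARY KEY, book_id INTEGER, page_number INTEGER)"
)

statements = []
conn.set_trace_callback(statements.append)

# Hypothetical data: 50 pages that all belong to book 1.
pages = [(1, number) for number in range(1, 51)]


def count_inserts():
    """Number of recorded statements that are INSERTs."""
    return sum(1 for sql in statements if sql.lstrip().upper().startswith("INSERT"))


def insert_in_loop(rows):
    # One statement per row.
    for row in rows:
        conn.execute("INSERT INTO page (book_id, page_number) VALUES (?, ?)", row)


def insert_in_bulk(rows):
    # One statement carrying every row.
    placeholders = ", ".join("(?, ?)" for _ in rows)
    flat_values = [value for row in rows for value in row]
    conn.execute(
        f"INSERT INTO page (book_id, page_number) VALUES {placeholders}",
        flat_values,
    )


statements.clear()
insert_in_loop(pages)
loop_insert_count = count_inserts()

statements.clear()
insert_in_bulk(pages)
bulk_insert_count = count_inserts()

total_rows = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
print(loop_insert_count, bulk_insert_count, total_rows)
```

Both approaches store the same rows; the bulk version just gets there with far fewer statements.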
SQL is (generally) faster than Python
Let’s imagine you wanted a function that
returns the total number of pages for each
library. Using what we’ve learned above
you might end up with something like
this:
def get_page_count_by_library_id():
    result = defaultdict(int)
    books = Book.objects.all().prefetch_related('pages')

    for book in books:
        result[book.library_id] += book.get_page_count()

    return result

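from django.db.models import Count

def get_page_count_by_library_id_using_annotation():
    result = {}
    libraries = (
        Library.objects
        .all()
        # Count the pages reached through each library's books
        .annotate(page_count=Count('books__pages'))
        .values_list('id', 'page_count')
    )

    for library_id, page_count in libraries:
        result[library_id] = page_count

    return result
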
In [66]: timeit(get_page_count_by_library_id, number=10)
Out[66]: 158.0743614450039

In [67]: timeit(get_page_count_by_library_id_using_annotation, number=10)
Out[67]: 1.3725216790044215

This example is pretty simple, but there’s a lot you can do with annotations. If you’re running into performance issues while doing math on a large queryset, consider writing an annotation to offload that work to the database.
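
To make the “let the database do the math” point concrete, here is a small sketch in plain sqlite3 (not the Django ORM). It counts pages per library twice: once by pulling rows into Python, and once with GROUP BY. The tables, the sample rows, and both function names are assumptions for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, loosely mirroring Book and Page.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, library_id INTEGER);
    CREATE TABLE page (id INTEGER PRIMARY KEY, book_id INTEGER);
    INSERT INTO book (id, library_id) VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO page (book_id) VALUES (1), (1), (2), (3), (3), (3);
""")


def page_count_in_python():
    """Fetch one row per page and add them up in Python."""
    result = defaultdict(int)
    rows = conn.execute(
        "SELECT book.library_id FROM page JOIN book ON book.id = page.book_id"
    ).fetchall()
    for (library_id,) in rows:
        result[library_id] += 1
    return dict(result)


def page_count_in_sql():
    """Let the database aggregate; only one row per library comes back."""
    rows = conn.execute("""
        SELECT book.library_id, COUNT(page.id)
        FROM book
        LEFT JOIN page ON page.book_id = book.id
        GROUP BY book.library_id
    """).fetchall()
    return dict(rows)


python_counts = page_count_in_python()
sql_counts = page_count_in_sql()
print(python_counts, sql_counts)
```

The answers match, but the second version transfers one row per library instead of one row per page, which is where the savings come from on large tables.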
Wrapping up
If you’re working on a Django-based project where there has been little or no thought regarding performance, there should be quite a bit of low-hanging fruit to get you started. In the majority of cases, the techniques described above will provide a substantial performance boost without much of a trade-off in terms of readability.