Python Memory

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 83

Memory Management in

Python
Nina Zakharenko - @nnja

slides: bit.ly/nbpy-memory
Livetweet!
use #nbpy
@nnja
Why should you care?
Knowing about memory management
helps you write more efficient code.

@nnja
What will you learn?
4 Vocabulary
4 Basic Concepts
4 Foundation

@nnja
What won't you learn?
You won’t be an expert at the end of
this talk.

@nnja
What's a variable?
@nnja
What's a c-style variable?
Change value of c-style variables
Python has
names
not
variables
@nnja
How are Python objects
stored in memory?
names ➡ references ➡ objects

@nnja
A name is just a label for an
object.
In Python, each object can have lots of
names.

Like 'x', 'y'


@nnja
Different Types of Objects

Simple Container

numbers dict

strings list

user defined classes

Container objects can contain simple objects, or other


container objects.

@nnja
What's a Reference?
A name or a container object that
points at another object.

@nnja
Reference Count
@nnja
Increasing the ref count

@nnja
➡ Decreasing the ref count

@nnja
Decrease Ref Count: Change the Reference
Decrease Ref Count: del keyword
What does del do?
The del statement doesn't delete objects.

It:

4 removes that name as a reference to that object


4 reduces the ref count by 1

@nnja
Decrease Ref Count: Go out of Scope
!

When there are no more references,


the object can be safely removed from
memory.

@nnja
local vs. global namespace
If refcounts decrease when an object goes out of scope,
what happens to objects in the global namespace?
4 Never goes out of scope!
4 Refcount never reaches 0.
4 Avoid putting large or complex objects in the global
namespace.

@nnja
Every Python object holds 3 things

4 Its type
4 A reference count
4 Its value

@nnja
>>> def mem_test():
... x = 300
... y = 300
... print( id(x) )
... print( id(y) )
... print( x is y )

>>> mem_test()
4504654160
4504654160
True

ℹ note: run this from a function in the repl, or from a file

@nnja
Garbage
Collection
@nnja
What is Garbage
Collection?
A way for a program to automatically
release memory when the object taking
up that space is no longer in use.
@nnja
Two Main Types of Garbage
Collection
1. Reference Counting
2. Tracing
@nnja
How does reference counting garbage
collection work?
1. Add and remove references
2. When the refcount reaches 0, remove the object
3. Cascading effect
4 decrease ref count of any object the deleted
object was pointing to

@nnja
Reference Counting Garbage Collection:
The Good
4 Easy to implement
4 When refcount is 0, objects are immediately
deleted.

@nnja
Reference Counting:
The Bad
4 space overhead
4 reference count is stored for every object
4 execution overhead
4 reference count changed on every assignment

@nnja
Reference Counting:
The Ugly
Not generally thread safe!

Reference counting doesn't detect


cyclical references
@nnja
Cyclical References By Example

@nnja
What's a cyclical reference?

@nnja
Cyclical Reference

@nnja
Reference counting alone
will not garbage collect
objects with cyclical
references.
@nnja
Two Main Types of Garbage
Collection
1. Reference Counting
2. Tracing
@nnja
Tracing Garbage Collection - Marking
Tracing Garbage Collection - Sweeping
What does Python use?
Reference Counting &
Generational
(A type of Tracing GC)

@nnja
Generational Garbage
Collection is based on the
theory that most objects
die young.
@nnja
Python maintains a list of every object
created as a program is run.
Actually, it makes 3:
- generation 0
- generation 1
- generation 2

Newly created objects are stored in generation 0.

@nnja
Only container objects
with a refcount greater
than 0 will be stored in a
generation list.
@nnja
When the number of objects in a
generation reaches a threshold, python
runs a garbage collection algorithm on
that generation, and any generations
younger than it.

@nnja
What happens during a generational garbage
collection cycle?
1. Python makes a list for objects to discard.
2. It runs an algorithm to detect reference cycles.
3. If an object has no outside references, add it to the discard
list.
4. When the cycle is done, free up the objects on the discard
list.

@nnja
After a garbage collection cycle, objects
that survived will be promoted to the
next generation.

Objects in the last generation (2) stay


there as the program executes.

@nnja
When the ref count reaches 0, you get
immediate clean up.

If you have a cycle, you need to wait


for garbage collection to run.

@nnja
Objects with cyclical references get
cleaned up by generational garbage
collection.

@nnja

Why doesn’t a Python
program shrink in memory
after garbage collection?

@nnja
After garbage collection, the size of
the python program likely won’t
shrink.
4 The freed memory is fragmented.
4 i.e. it's not freed in one continuous block.
4 When we say memory is freed during garbage collection, it’s released
back to Python to use for other objects, not necessarily to the system.

@nnja
Quick Optimizations

@nnja
__slots__
@nnja
Python instances have a dict of values
class Dog(object):
pass

buddy = Dog()
buddy.name = 'Buddy'

print(buddy.__dict__)
{'name': 'Buddy'}

@nnja
AttributeError
'Hello'.name = 'Fred'

AttributeError
Traceback (most recent call last)
----> 1 'Hello'.name = 'Fred'

AttributeError: 'str' object has no attribute 'name'

@nnja
__slots__
class Point(object):
__slots__ = ('x', 'y')

point = Point()
point.x = 5
point.y = 7

point.name = "Fred"

Traceback (most recent call last):


File "point.py", line 8, in <module>
point.name = "Fred"
AttributeError: 'Point' object has no attribute 'name'

@nnja
size of dict vs. size of tuple
import sys

sys.getsizeof(dict())
sys.getsizeof(tuple())

sizeof dict: 232 bytes


sizeof tuple: 40 bytes
@nnja
When to use slots?
4 Creating many instances of a class
4 Know in advance what properties the class should
have

Saving 9 GB of RAM with __slots__


weakref
4 A weakref to an object is not enough to keep it alive.
4 When the only remaining references are weak
references, the object can be garbage collected.
4 Useful for:
4 implementing caches or mappings holding large
objects

python3 weakref docs


What's
a
@nnja
GIL?
Global
Interpreter
@nnja
Lock
Only one thread can run in
the interpreter at a time.

@nnja
Advantages / Disadvantages of a GIL
Upside:
Reference counting is fast and easy to implement.

Downside:
In a Python program, no matter how many threads
exist, only one thread will be executed at a time.
Want to take advantage of multiple
cores?
4 Use multi-processing instead of multi-threading.
4 Each process will have it’s own GIL, it’s on the
developer to figure out a way to share information
between processes.

@nnja

If the GIL limits Python,
can’t we just remove it?

additional reading
For better or for worse, the GIL is here
to stay!

@nnja
What Did We Learn?

@nnja
Garbage collection is pretty
good.

@nnja
Now you know how
memory is managed.

@nnja
Python3!

@nnja
For scientific applications,
use numpy & pandas.

@nnja
Thank You!

Python @ Microsoft:
bit.ly/nbpy-microsoft

@nnja

*Bonus material on the next slide


Bonus Material
Section ➡
@nnja
Additional Reading
4 Great explanation of generational garbage collection
and python’s reference detection algorithm
4 Weak Reference Documentation
4 Python Module of the Week - gc
4 PyPy STM - GIL less Python Interpreter
4 Saving 9GB of RAM with python’s __slots__

@nnja
Getting in-depth with the GIL
4 Dave Beazley - Guide on how the GIL Operates
4 Dave Beazley - New GIL in Python 3.2
4 Dave Beazley - Inside Look at Infamous GIL Patch

@nnja
Why can’t we use the REPL to follow
along at home?
4 Because It doesn’t behave like a typical python
program that’s being executed.
4 Further reading

@nnja
Python pre-loads objects
4 Many objects are loaded by Python as the interpreter starts.
4 Called peephole optimization.
4 Numbers: -5 -> 256
4 Single Letter Strings
4 Common Exceptions
4 Further reading

@nnja
Attempting to remove the Gil - A
Gilectomy
4 Larry Hastings - Removing Python's GIL - The
Gilectomy
4 Larry Hastings - The Gilectomy, How it's going
4 Gilectomy on GitHub
4 A Gilectomy Update

@nnja
weakref
4 weakref Python Module of the week
4 weakref documentation

@nnja
@nnja

You might also like