Object Pooling in CPython: How It Works and Why It Matters

Understanding the NSMALLPOSINTS and NSMALLNEGINTS Constants

As programmers, we are always looking for ways to make our code more efficient. One technique that helps is object pooling: instead of creating a new object every time a value is needed, the runtime hands out a reference to a single shared object, which reduces both the number of objects created and the amount of memory used.

Object pooling is implemented in many programming languages, including the CPython implementation of Python, which is the most widely used Python interpreter.

This article explains the basics of object creation in CPython. It was inspired by one of the advanced tasks at ALX, and it aims to give my cohort and other interested readers an overview of object pooling in CPython. Let's get started ...


We will focus on the object pooling mechanism for integers in CPython, which is governed by the NSMALLPOSINTS and NSMALLNEGINTS constants. These constants define the range of integer values that are stored in the object pool, so they determine exactly when CPython reuses an existing int object instead of creating a new one.

Memory Optimization

"Optimization is about finding the best solution, not the easiest one." -Unknown

Object pooling in CPython is closely related to a mechanism called "interning," which reuses a single object to represent every occurrence of the same value. Interning only makes sense for immutable objects. It is best known for strings, but CPython applies the same kind of reuse to small integers and to a few other objects, such as the empty tuple.

With string interning, a single string object can represent multiple string literals that have the same value. When CPython compiles certain string literals (typically short, identifier-like ones), it checks whether an equal string already exists in the intern pool. If it does, the interpreter uses a reference to the existing object rather than creating a new one.
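To see string interning in action, here is a minimal sketch using sys.intern() from the standard library. The variable names are my own, and whether the two un-interned strings are separate objects is a CPython implementation detail, so treat that line as an observation rather than a guarantee.

```python
import sys

# Build two equal strings at run time so the compiler cannot
# share a single constant between them.
parts = ["object", "pooling"]
first = " ".join(parts)
second = " ".join(parts)

print(first == second)  # the values are equal
print(first is second)  # identity is implementation-dependent (usually False in CPython)

# sys.intern() returns the canonical copy from the intern table,
# so equal strings passed through it end up as one object.
first_interned = sys.intern(first)
second_interned = sys.intern(second)
print(first_interned is second_interned)
```

The last comparison is the one you can rely on: sys.intern() always returns the same object for equal strings.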

For integers, object pooling in CPython is controlled by the NSMALLPOSINTS and NSMALLNEGINTS constants, which define the range of integer values kept in the object pool (by default an array of 262 integers). They determine whether a new int object must be created or an existing object from the pool can be returned. Because the pool is an array, finding the object for a small integer is a fast index lookup.

NSMALLPOSINTS And NSMALLNEGINTS Constants

The NSMALLPOSINTS constant is the number of non-negative integers stored in the object pool (counting up from 0), while the NSMALLNEGINTS constant is the number of negative integers stored in the object pool (counting down from -1).

In a CPython implementation of Python 3 with default options/configuration, NSMALLPOSINTS is equal to 257 and NSMALLNEGINTS is equal to 5, which means that the object pool contains integer objects for the values from -5 to 256 inclusive (0 takes one of the 257 non-negative slots):

#define NSMALLPOSINTS           257
#define NSMALLNEGINTS           5

This means that when you create an int object with a value that is within this range, the Python interpreter will not create a new object, but will instead return a reference to the existing object that is stored in the object pool.
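The arithmetic behind the pool can be sketched in a few lines of Python. The two names below simply mirror the C macros for illustration; they are ordinary Python variables and have no effect on the interpreter.

```python
# Illustrative copies of the C-level defaults (not the real macros).
NSMALLPOSINTS = 257  # slots for 0 through 256
NSMALLNEGINTS = 5    # slots for -5 through -1

# Every value that has a pre-built object in the pool.
pool_values = range(-NSMALLNEGINTS, NSMALLPOSINTS)

print(pool_values[0])    # smallest pooled value: -5
print(pool_values[-1])   # largest pooled value: 256
print(len(pool_values))  # total pooled objects: 262
```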

Here are a few examples of how object pooling is used for integers in the CPython implementation of Python:

NOTE: The is operator compares the identities of two objects (in CPython, the memory addresses returned by id()) and returns True if they are the same object.
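As a quick refresher on identity versus equality, here is a small sketch using lists, which are never pooled. The variable names are only for illustration.

```python
first = [1, 2, 3]
second = [1, 2, 3]  # equal contents, but a separate object
alias = first       # a second name for the same object

print(first == second)         # True: same value
print(first is second)         # False: two distinct objects
print(alias is first)          # True: one object, two names
print(id(alias) == id(first))  # True: "is" agrees with comparing id()
```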

  • Object pooling for positive integers:
>>> a = 99
>>> b = 99
>>> a is b
True

NSMALLPOSINTS is a C-level constant compiled into the interpreter (257 by default), so it cannot be set from Python code; it means that all integers from 0 to 256 are stored in the object pool. When the variables a and b are both assigned the value 99, the Python interpreter returns a reference to the same object from the object pool for both variables, rather than creating a new object. This is why the is operator returns True when a and b are compared.

  • Object pooling for negative integers:
>>> c = -2
>>> d = -2
>>> c is d
True

NSMALLNEGINTS is likewise a C-level constant (5 by default), which means that all negative integers from -5 to -1 are stored in the object pool. When the variables c and d are both assigned the value -2, the Python interpreter returns a reference to the same object from the object pool for both variables, rather than creating a new object. This is why the is operator returns True when c and d are compared.

  • Object pooling for integers outside the range defined by NSMALLPOSINTS and NSMALLNEGINTS:
>>> a = 257
>>> b = 257
>>> a is b
False

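In this example, the variables a and b are both assigned the value 257, which is outside the range defined by the NSMALLPOSINTS constant. As a result, when each line is entered separately at the interactive prompt, the Python interpreter creates a new int object for each assignment, rather than returning a reference to an existing object from the object pool. This is why the is operator returns False when a and b are compared. Note that this result depends on how the code is compiled: in a script, or when both assignments sit on the same line, CPython may share one constant object and is can return True. For that reason, never rely on is to compare integer values; use == instead.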
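If you would like to find the pool boundaries yourself, the following is a rough sketch. The helper is_pooled is a name I made up for this article; it builds each integer at run time with int(str(...)) so that the compiler cannot merge the two values into one constant. It relies on CPython implementation details, so other interpreters may report something different.

```python
def is_pooled(value: int) -> bool:
    """Report whether two ints built at run time share one object."""
    text = str(value)
    # Both results are alive at the same time during the comparison,
    # so they only share an identity if the interpreter reuses an object.
    return int(text) is int(text)


# Probe a window of values around the expected boundaries.
pooled = [value for value in range(-10, 300) if is_pooled(value)]
print(pooled[0], pooled[-1], len(pooled))
```

On a default CPython build I would expect this to print -5 256 262, matching the range described above.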

And lastly, a tricky one:

>>> print("I")
>>> print("Love")
>>> print("Python")

Given these lines of code, and assuming we are using a CPython implementation of Python 3 with default options/configuration: before the execution of line 2 (print("Love")), how many int objects have been created and are still in memory?

Hope you guessed it right :) The correct answer is 262. Wondering why?

The question asks how many int objects have been created and are still in memory. The string literal "I" is not an int object, and for the purposes of this quiz the print() function does not create any int objects, so executing the first line of the script, print("I"), adds no new int objects. However, the small-integer pool already exists in memory: it is an array of 262 integers (257 + 5) that is available from the moment the interpreter starts. Hence the answer.

Why It Matters?

Object pooling reduces the number of objects that are created and the amount of memory that is used by letting many references share a single object. This can improve the performance of a Python program by avoiding the overhead of repeatedly creating and destroying objects for commonly used values.
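To make the saving concrete, here is a small sketch that counts how many distinct objects sit behind a thousand equal integers. The values 7 and 100000 are arbitrary picks from inside and outside the pool, and the counts are specific to CPython.

```python
# Build the integers at run time so each one is requested separately.
small = [int(text) for text in ["7"] * 1000]
large = [int(text) for text in ["100000"] * 1000]

# Count the distinct objects behind each list of equal values.
distinct_small = len({id(number) for number in small})
distinct_large = len({id(number) for number in large})

print(distinct_small)  # pooled value: all references share one object
print(distinct_large)  # unpooled value: one object per element
```

On CPython the first list should hold a thousand references to one pooled object, while the second holds a thousand separate int objects.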


Useful as it is, object pooling is just one of many optimization techniques, and there are many other resources available for those who want to learn more about optimization in Python. Possible further reading includes the official Python documentation, as well as articles and tutorials on topics such as object interning and garbage collection. If you want more information, I'll link to several resources below.

I hope that this seed of knowledge grows deep roots and yields much fruit for you as you continue to learn and grow in your Python journey. Thanks for reading.

Happy coding!

Resources 🔗


To read more such interesting articles on Python and Data Science, subscribe to my blog https://bolexzy.hashnode.dev/. You can also reach me on LinkedIn.