9 Tips to Make Your Python Code Run Faster!


In discussions about programming languages, you often hear the complaint that "Python is too slow," which often overshadows many of Python's strengths. In reality, if you write Pythonic code, Python can actually be very fast.

Details make the difference. Experienced Python developers use a variety of subtle but powerful techniques that can significantly improve performance. While these tips may seem minor at first glance, they can lead to huge gains in efficiency. Let’s dive into 9 methods that will change how you write and optimize Python code.

1. Faster String Concatenation: Choose "join()" Over "+"

When working with a lot of strings, concatenation can become a bottleneck in your Python programs. In Python, there are basically two ways to concatenate strings:

  1. Using join() to merge a list of strings

  2. Using + or += to add each string to another string

Which method is faster? Let's define three different functions to concatenate the same strings:

mylist = ["Yang", "Zhou", "is", "writing"]

# Using '+' (note: this version leaves a trailing space)
def concat_plus():
    result = ""
    for word in mylist:
        result += word + " "
    return result

# Using 'join()'
def concat_join():
    return " ".join(mylist)

# Directly concatenating string literals, without a list (no separators)
def concat_directly():
    return "Yang" + "Zhou" + "is" + "writing"

What’s your first guess on which one is the fastest and which is the slowest?

The actual results might surprise you:

import timeit

print(timeit.timeit(concat_plus, number=10000))
# 0.002738415962085128
print(timeit.timeit(concat_join, number=10000))
# 0.0008482920238748193
print(timeit.timeit(concat_directly, number=10000))
# 0.00021425005979835987

As shown, for concatenating a list of strings, the join() method is faster than using a for loop with +=. This is because in Python, strings are immutable, and every += operation creates a new string, which is computationally expensive.

The .join() method, on the other hand, is optimized for concatenation. It calculates the total size of the resulting string beforehand and constructs it in one go, avoiding the overhead of +=.

Interestingly, the fastest method in our test is directly concatenating string literals. This is because Python folds adjacent string literals into a single constant at compile time, so no concatenation work happens at runtime at all.

In short, if you need to concatenate a list of strings, use join() instead of +=. If you’re directly concatenating string literals, simply use +.

2. Faster List Creation: Use "[]" Instead of "list()"

Creating a list may seem trivial, but there are two common ways to do it:

  1. Using the list() function

  2. Using []

Let’s compare their performance using a simple code snippet:

import timeit

print(timeit.timeit('[]', number=10 ** 7))
# 0.1368238340364769
print(timeit.timeit(list, number=10 ** 7))
# 0.2958830420393497

As the results show, using [] is faster than using the list() function. This is because [] is literal syntax that compiles to a single bytecode instruction, while list() is a function call that requires a global name lookup plus call overhead.

The same logic applies when creating dictionaries: use {} instead of dict().
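The dictionary case can be checked the same way; here is a quick sketch (absolute timings will vary by machine):

```python
import timeit

# '{}' is literal syntax; 'dict()' is a global name lookup plus a function call.
literal_time = timeit.timeit('{}', number=10 ** 6)
call_time = timeit.timeit(dict, number=10 ** 6)

print(f"{{}} literal: {literal_time:.4f}s")
print(f"dict() call: {call_time:.4f}s")
```

Both produce the same empty dictionary; only the construction cost differs.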

3. Faster Membership Testing: Use Sets Instead of Lists

The performance of membership testing depends heavily on the underlying data structure:

import timeit

large_dataset = range(100000)
search_element = 2077

large_list = list(large_dataset)
large_set = set(large_dataset)

def list_membership_test():
    return search_element in large_list

def set_membership_test():
    return search_element in large_set

print(timeit.timeit(list_membership_test, number=1000))
# 0.01112208398990333
print(timeit.timeit(set_membership_test, number=1000))
# 3.27499583363533e-05

As shown, membership testing with a set is much faster than with a list. This is because:

  • In Python lists, membership testing (element in list) requires scanning each element until the desired element is found or the list ends. This operation has a time complexity of O(n).

  • In Python sets, which are implemented as hash tables, membership testing (element in set) uses a hashing mechanism with an average time complexity of O(1).

Key takeaway: Carefully choose your data structure based on your needs. Using the right one can significantly speed up your code.
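Dictionaries use the same hash-table machinery for their keys, so testing key in some_dict is also O(1) on average. A small sketch for comparison:

```python
import timeit

large_dict = {i: None for i in range(100000)}  # keys live in a hash table
large_list = list(range(100000))               # elements require a linear scan

def dict_membership_test():
    return 99999 in large_dict  # average O(1) hash lookup

def list_last_element_test():
    return 99999 in large_list  # O(n): scans all the way to the last element

print(timeit.timeit(dict_membership_test, number=1000))
print(timeit.timeit(list_last_element_test, number=1000))
```

Searching for the last element is the worst case for the list, which makes the gap especially visible.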

4. Faster Data Generation: Use Comprehensions Instead of for Loops

Python has four types of comprehensions: list, dictionary, set, and generator comprehensions. These offer not only more concise syntax but also better performance compared to traditional for loops, as they are optimized in Python’s C implementation.

import timeit

def generate_squares_for_loop():
    squares = []
    for i in range(1000):
        squares.append(i * i)
    return squares

def generate_squares_comprehension():
    return [i * i for i in range(1000)]

print(timeit.timeit(generate_squares_for_loop, number=10000))
# 0.2797503340989351
print(timeit.timeit(generate_squares_comprehension, number=10000))
# 0.2364629579242319

As the results show, list comprehensions are faster than for loops.
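For reference, all four comprehension types mentioned above share the same syntax pattern; a quick sketch:

```python
# The four comprehension types, each producing squares of 0..9
squares_list = [i * i for i in range(10)]     # list comprehension
squares_dict = {i: i * i for i in range(10)}  # dict comprehension
squares_set = {i * i for i in range(10)}      # set comprehension
squares_gen = (i * i for i in range(10))      # generator expression (lazy)

print(squares_list[:4])   # [0, 1, 4, 9]
print(squares_dict[5])    # 25
print(49 in squares_set)  # True
print(next(squares_gen))  # 0
```

A generator expression builds nothing up front, which also saves memory when you only iterate over the results once.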

5. Faster Loops: Prefer Local Variables

Accessing local variables is faster than accessing global variables or object attributes in Python.

Here’s an example:

import timeit

class Example:
    def __init__(self):
        self.value = 0

obj = Example()

def test_dot_notation():
    for _ in range(1000):
        obj.value += 1  # attribute lookup on every iteration

def test_local_variable():
    value = obj.value  # hoist the attribute into a local variable
    for _ in range(1000):
        value += 1
    obj.value = value  # write the result back once

print(timeit.timeit(test_dot_notation, number=1000))
# 0.036605041939765215
print(timeit.timeit(test_local_variable, number=1000))
# 0.024470250005833805

The reason is straightforward: when a function is compiled, local variables are known and accessed faster than external variables.

These small optimizations can make a big difference when handling large datasets.
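The same principle extends to method lookups inside hot loops: binding a frequently called bound method to a local name avoids repeating the attribute lookup on every iteration. A minimal sketch:

```python
import timeit

def build_with_attribute_lookup():
    squares = []
    for i in range(1000):
        squares.append(i * i)  # 'squares.append' is resolved on each iteration
    return squares

def build_with_local_binding():
    squares = []
    append = squares.append  # resolve the bound method once, up front
    for i in range(1000):
        append(i * i)
    return squares

print(timeit.timeit(build_with_attribute_lookup, number=10000))
print(timeit.timeit(build_with_local_binding, number=10000))
```

Both functions return identical lists; only the lookup cost inside the loop differs.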


6. Faster Execution: Prefer Built-in Modules and Libraries

When engineers refer to Python, they often mean CPython, the default and most widely used implementation of the Python language.

Since most of CPython’s built-in modules and libraries are written in C, a much faster lower-level language, we should take advantage of this built-in arsenal and avoid reinventing the wheel.

import timeit
import random
from collections import Counter

def count_frequency_custom(lst):
    frequency = {}
    for item in lst:
        if item in frequency:
            frequency[item] += 1
        else:
            frequency[item] = 1
    return frequency

def count_frequency_builtin(lst):
    return Counter(lst)

large_list = [random.randint(0, 100) for _ in range(1000)]

print(timeit.timeit(lambda: count_frequency_custom(large_list), number=100))
# 0.005160166998393834
print(timeit.timeit(lambda: count_frequency_builtin(large_list), number=100))
# 0.002444291952997446

This program compares two methods of counting element frequencies in a list. As shown, using the built-in Counter from the collections module is faster, cleaner, and more efficient.
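The pattern is general. For example, the built-in sum() also runs in C and usually beats an equivalent hand-written loop; a quick sketch:

```python
import timeit

numbers = list(range(10000))

def sum_custom(lst):
    total = 0
    for x in lst:  # pure-Python loop: one interpreter round-trip per element
        total += x
    return total

def sum_builtin(lst):
    return sum(lst)  # the loop runs inside CPython's C implementation

print(timeit.timeit(lambda: sum_custom(numbers), number=1000))
print(timeit.timeit(lambda: sum_builtin(numbers), number=1000))
```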

7. Faster Function Calls: Use the Cache Decorator for Simple Memoization

Caching is a common technique to avoid repetitive calculations and speed up programs.

Fortunately, in most cases we don’t need to write our own caching logic, since Python provides a ready-made decorator for this purpose: @functools.cache (available since Python 3.9).

For example, the following code compares two Fibonacci functions: one with a cache decorator and one without:

import timeit
import functools

def fibonacci(n):
    if n in (0, 1):
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

@functools.cache
def fibonacci_cached(n):
    if n in (0, 1):
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)

# Testing execution time for each function
print(timeit.timeit(lambda: fibonacci(30), number=1))
# 0.09499712497927248
print(timeit.timeit(lambda: fibonacci_cached(30), number=1))
# 6.458023563027382e-06

The results show that the functools.cache decorator makes the code significantly faster.

The basic Fibonacci function is inefficient because it recalculates the same Fibonacci numbers multiple times. The cached version is much faster because it stores previously computed results. Subsequent calls with the same arguments retrieve results from the cache instead of recalculating.

Adding a single built-in decorator can bring significant improvements—that’s what being “Pythonic” is all about! 😎
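One caveat: @functools.cache keeps every result forever. If memory is a concern, or you are on a Python version older than 3.9, @functools.lru_cache(maxsize=...) provides the same memoization with a bounded cache. A sketch:

```python
import functools

@functools.lru_cache(maxsize=128)  # keep at most 128 results; least recently used entries are evicted
def fibonacci_bounded(n):
    if n in (0, 1):
        return n
    return fibonacci_bounded(n - 1) + fibonacci_bounded(n - 2)

print(fibonacci_bounded(30))           # 832040
print(fibonacci_bounded.cache_info())  # hit/miss statistics for the cache
```

The cache_info() method is handy for checking whether the cache is actually paying off.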

8. Faster Infinite Loops: Prefer while 1 Over while True

To create an infinite while loop, we can use either while True or while 1.

The performance difference is usually negligible. Interestingly, while 1 can come out slightly ahead in benchmarks.

The historical reason: in Python 2, True was an ordinary global name that could even be reassigned, so it had to be looked up at runtime, while 1 is a literal. In Python 3, True is a keyword and both forms compile to essentially the same bytecode, so any remaining gap is mostly measurement noise.

Let's compare these two methods in the code snippet:

import timeit

def loop_with_true():
    i = 0
    while True:
        if i >= 1000:
            break
        i += 1

def loop_with_one():
    i = 0
    while 1:
        if i >= 1000:
            break
        i += 1

print(timeit.timeit(loop_with_true, number=10000))
# 0.1733035419601947
print(timeit.timeit(loop_with_one, number=10000))
# 0.16412191605195403

As shown, while 1 came out marginally ahead in this run, though the gap is tiny.

However, modern Python interpreters (like CPython) are highly optimized, and this difference is typically negligible. Moreover, while True is more readable than while 1.

9. Faster Startup: Import Python Modules Wisely

It seems natural to import all modules at the top of a Python script.

However, this is not always necessary.

Additionally, if a module is large, it’s better to import it on demand.

def my_function():
    import heavy_module
    # Rest of the function

In the example above, heavy_module is imported inside the function. This follows a “lazy loading” concept, where the import is delayed until my_function is called.

The benefit of this approach is that if my_function is never invoked during script execution, heavy_module will never be loaded, saving resources and reducing startup time.
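As a concrete sketch of the pattern (json stands in here for a genuinely heavy dependency):

```python
def parse_config(text):
    # Imported only when the function is first called; afterwards,
    # Python finds the module already cached in sys.modules.
    import json
    return json.loads(text)

print(parse_config('{"debug": true}'))  # {'debug': True}
```

Repeated calls pay almost nothing for the import statement, since Python caches loaded modules in sys.modules.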