In discussions about programming languages, you often hear the complaint that "Python is too slow," a criticism that overshadows many of Python's strengths. In reality, if you write Pythonic code, Python can be surprisingly fast.
Details make the difference. Experienced Python developers use a variety of subtle but powerful techniques that can significantly improve performance. While these tips may seem minor at first glance, they can lead to huge gains in efficiency. Let’s dive into 9 methods that will change how you write and optimize Python code.
1. Faster String Concatenation: Choose "join()" Over "+"
When working with a lot of strings, concatenation can become a bottleneck in your Python programs. In Python, there are basically two ways to concatenate strings:
- Using join() to merge a list of strings
- Using + or += to add each string to another string
Which method is faster? Let's define three different functions to concatenate the same strings:
mylist = ["Yang", "Zhou", "is", "writing"] # Using '+' def concat_plus(): result = "" for word in mylist: result += word + " " return result # Using 'join()' def concat_join(): return " ".join(mylist) # Directly concatenating strings, without a list def concat_directly(): return "Yang" + "Zhou" + "is" + "writing"
What’s your first guess on which one is the fastest and which is the slowest?
The actual results might surprise you:
import timeit

print(timeit.timeit(concat_plus, number=10000))
# 0.002738415962085128
print(timeit.timeit(concat_join, number=10000))
# 0.0008482920238748193
print(timeit.timeit(concat_directly, number=10000))
# 0.00021425005979835987
As shown, for concatenating a list of strings, the join() method is faster than using a for loop with +=. This is because strings in Python are immutable, so every += operation creates a brand-new string, which is computationally expensive.

The join() method, on the other hand, is optimized for concatenation: it calculates the total size of the resulting string beforehand and constructs it in one go, avoiding the overhead of repeated +=.
Interestingly, the fastest method in our test is directly concatenating string literals. This is because Python folds adjacent string literals into a single constant at compile time, so the concatenation costs nothing at runtime.
In short, if you need to concatenate a list of strings, use join() instead of +=. If you’re directly concatenating string literals, simply use +.
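You can see this compile-time folding for yourself with the built-in dis module; here is a minimal sketch (the exact bytecode output varies by Python version):

import dis

def concat_directly():
    return "Yang" + "Zhou" + "is" + "writing"

# The disassembly shows the already-merged string as a single constant,
# confirming the concatenation happened at compile time, not at runtime.
dis.dis(concat_directly)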
2. Faster List Creation: Use "[]" Instead of "list()"
Creating a list may seem trivial, but there are two common ways to do it:
- Using the list() function
- Using []
Let’s compare their performance using a simple code snippet:
import timeit

print(timeit.timeit('[]', number=10 ** 7))
# 0.1368238340364769
print(timeit.timeit(list, number=10 ** 7))
# 0.2958830420393497
As the results show, using [] is faster than using the list() function. This is because [] is literal syntax, while list() is a function call: the name list must be looked up and then called, which inherently takes more time.

The same logic applies when creating dictionaries: use {} instead of dict().
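The same micro-benchmark works for dictionaries; a minimal sketch (absolute timings will vary by machine):

import timeit

# Literal syntax avoids the global name lookup and the function call
print(timeit.timeit('{}', number=10 ** 7))
print(timeit.timeit(dict, number=10 ** 7))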
3. Faster Membership Testing: Use Sets Instead of Lists
The performance of membership testing depends heavily on the underlying data structure:
import timeit

large_dataset = range(100000)
search_element = 2077

large_list = list(large_dataset)
large_set = set(large_dataset)

def list_membership_test():
    return search_element in large_list

def set_membership_test():
    return search_element in large_set

print(timeit.timeit(list_membership_test, number=1000))
# 0.01112208398990333
print(timeit.timeit(set_membership_test, number=1000))
# 3.27499583363533e-05
As shown, membership testing with a set is much faster than with a list. This is because:
- In Python lists, membership testing (element in list) requires scanning each element until the desired element is found or the list ends. This operation has a time complexity of O(n).
- In Python sets, which are implemented as hash tables, membership testing (element in set) uses a hashing mechanism with an average time complexity of O(1).
Key takeaway: Carefully choose your data structure based on your needs. Using the right one can significantly speed up your code.
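One caveat: building a set from a list is itself an O(n) operation, so the conversion only pays off when you test membership many times. A minimal sketch of the trade-off (numbers are illustrative):

import timeit

large_list = list(range(100000))

def single_lookup_with_conversion():
    # The O(n) set() construction dominates a single lookup
    return 2077 in set(large_list)

def single_lookup_on_list():
    return 2077 in large_list

print(timeit.timeit(single_lookup_with_conversion, number=100))
print(timeit.timeit(single_lookup_on_list, number=100))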
4. Faster Data Generation: Use Comprehensions Instead of for Loops
Python has four types of comprehensions: list, dictionary, set, and generator comprehensions. These offer not only more concise syntax but also better performance compared to traditional for loops, as they are optimized in Python’s C implementation.
import timeit

def generate_squares_for_loop():
    squares = []
    for i in range(1000):
        squares.append(i * i)
    return squares

def generate_squares_comprehension():
    return [i * i for i in range(1000)]

print(timeit.timeit(generate_squares_for_loop, number=10000))
# 0.2797503340989351
print(timeit.timeit(generate_squares_comprehension, number=10000))
# 0.2364629579242319
As the results show, list comprehensions are faster than for loops.
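For reference, the four comprehension forms look like this (a quick illustrative sketch):

squares_list = [i * i for i in range(10)]     # list comprehension
squares_dict = {i: i * i for i in range(10)}  # dictionary comprehension
squares_set = {i * i for i in range(10)}      # set comprehension
squares_gen = (i * i for i in range(10))      # generator expression (lazy)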
5. Faster Loops: Prefer Local Variables
Accessing local variables is faster than accessing global variables or object attributes in Python.
Here’s an example:
import timeit

class Example:
    def __init__(self):
        self.value = 0

obj = Example()

def test_dot_notation():
    for _ in range(1000):
        obj.value += 1

def test_local_variable():
    value = obj.value
    for _ in range(1000):
        value += 1
    obj.value = value

print(timeit.timeit(test_dot_notation, number=1000))
# 0.036605041939765215
print(timeit.timeit(test_local_variable, number=1000))
# 0.024470250005833805
The reason is straightforward: when a function is compiled, its local variables are stored in fixed slots and accessed by index, whereas global names and object attributes require dictionary lookups at runtime.
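A related trick, sketched below (not part of the original benchmark), is binding a frequently used method to a local name before a hot loop:

import timeit

def append_with_attribute_lookup():
    squares = []
    for i in range(1000):
        squares.append(i * i)  # attribute lookup on every iteration

def append_with_local_binding():
    squares = []
    append = squares.append  # resolve the bound method once
    for i in range(1000):
        append(i * i)

print(timeit.timeit(append_with_attribute_lookup, number=10000))
print(timeit.timeit(append_with_local_binding, number=10000))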
These small optimizations can make a big difference when handling large datasets.
6. Faster Execution: Prefer Built-in Modules and Libraries
When engineers refer to Python, they often mean CPython, the default and most widely used implementation of the Python language.
Since most of CPython’s built-in modules and libraries are written in C, a much faster lower-level language, we should take advantage of this built-in arsenal and avoid reinventing the wheel.
import timeit
import random
from collections import Counter

def count_frequency_custom(lst):
    frequency = {}
    for item in lst:
        if item in frequency:
            frequency[item] += 1
        else:
            frequency[item] = 1
    return frequency

def count_frequency_builtin(lst):
    return Counter(lst)

large_list = [random.randint(0, 100) for _ in range(1000)]

print(timeit.timeit(lambda: count_frequency_custom(large_list), number=100))
# 0.005160166998393834
print(timeit.timeit(lambda: count_frequency_builtin(large_list), number=100))
# 0.002444291952997446
This program compares two methods of counting element frequencies in a list. As shown, using the built-in Counter from the collections module is both faster and cleaner than the hand-written loop.
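Counter also comes with handy methods of its own; a quick usage sketch:

from collections import Counter

counts = Counter(["a", "b", "a", "c", "a", "b"])
print(counts)                 # Counter({'a': 3, 'b': 2, 'c': 1})
print(counts.most_common(2))  # [('a', 3), ('b', 2)]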
7. Faster Function Calls: Use the Cache Decorator for Simple Memoization
Caching is a common technique to avoid repetitive calculations and speed up programs.
Fortunately, in most cases we don’t need to write our own caching logic, since Python provides a ready-made decorator for this purpose: @functools.cache (available since Python 3.9; on older versions, functools.lru_cache(maxsize=None) is equivalent).
For example, the following code compares two Fibonacci functions: one with a cache decorator and one without:
import timeit
import functools

def fibonacci(n):
    if n in (0, 1):
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

@functools.cache
def fibonacci_cached(n):
    if n in (0, 1):
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)

# Testing execution time for each function
print(timeit.timeit(lambda: fibonacci(30), number=1))
# 0.09499712497927248
print(timeit.timeit(lambda: fibonacci_cached(30), number=1))
# 6.458023563027382e-06
The results show that the functools.cache decorator makes the code significantly faster.
The basic Fibonacci function is inefficient because it recalculates the same Fibonacci numbers multiple times. The cached version is much faster because it stores previously computed results. Subsequent calls with the same arguments retrieve results from the cache instead of recalculating.
Adding a single built-in decorator can bring significant improvements—that’s what being “Pythonic” is all about! 😎
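If memory is a concern, a bounded cache may be preferable; here is a sketch using functools.lru_cache, which @functools.cache simply wraps with maxsize=None:

import functools

@functools.lru_cache(maxsize=128)  # keep only the 128 most recently used results
def fibonacci_bounded(n):
    if n in (0, 1):
        return n
    return fibonacci_bounded(n - 1) + fibonacci_bounded(n - 2)

print(fibonacci_bounded(30))           # 832040
print(fibonacci_bounded.cache_info())  # hit/miss statistics of the cache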
8. Faster Infinite Loops: Prefer while 1 Over while True
To create an infinite while loop, we can use either while True or while 1.
The performance difference is usually negligible, though while 1 can appear slightly faster in micro-benchmarks. Historically (in Python 2), this was because 1 is a literal, whereas True was a global name that had to be looked up at runtime, adding a tiny overhead. In Python 3, True is a keyword constant, and both forms compile to the same bytecode.
Let's compare these two methods in the code snippet:
import timeit

def loop_with_true():
    i = 0
    while True:
        if i >= 1000:
            break
        i += 1

def loop_with_one():
    i = 0
    while 1:
        if i >= 1000:
            break
        i += 1

print(timeit.timeit(loop_with_true, number=10000))
# 0.1733035419601947
print(timeit.timeit(loop_with_one, number=10000))
# 0.16412191605195403
As shown, while 1 came out marginally ahead in this run. However, modern Python interpreters (like CPython) optimize both forms identically, so the difference is within measurement noise. Moreover, while True is more readable than while 1.
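You can confirm this with the dis module; a minimal sketch (exact output differs across Python versions):

import dis

def loop_true():
    while True:
        break

def loop_one():
    while 1:
        break

# On Python 3, both disassemblies are identical: the constant
# condition is optimized away and no runtime test remains.
dis.dis(loop_true)
dis.dis(loop_one)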
9. Faster Startup: Import Python Modules Wisely
It seems natural to import all modules at the top of a Python script. However, this is not always necessary, and if a module is large and only needed occasionally, it’s better to import it on demand:
def my_function():
    import heavy_module
    # Rest of the function
In the example above, heavy_module is imported inside the function. This follows a “lazy loading” approach, where the import is delayed until my_function is called.
The benefit of this approach is that if my_function is never invoked during script execution, heavy_module is never loaded, saving resources and reducing startup time.
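Note that a module is only actually loaded once per process; subsequent import statements are cheap lookups in the sys.modules cache. A minimal sketch (statistics stands in for any heavier dependency):

def summarize(data):
    # Loaded on the first call only; later calls hit the sys.modules cache
    import statistics
    return statistics.mean(data)

print(summarize([1, 2, 3]))  # 2: pays the one-time import cost
print(summarize([4, 5, 6]))  # 5: the import is now a cheap cache lookup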