Since much of the dynamic information cannot be statically stored in the PyCodeObject, once a PyCodeObject is handed over to the virtual machine, it dynamically constructs a PyFrameObject, which is the stack frame.
Prelude
From now on, we will delve into the principles behind how the virtual machine executes bytecode. As mentioned before, the Python interpreter can be divided into two parts: the Python compiler and the Python virtual machine.
Once the compiler compiles the source code into a PyCodeObject, the virtual machine takes over. The virtual machine reads the bytecode from the PyCodeObject and executes it in the current context until all the bytecode instructions are completed.
Now, here's a question: since the bytecode instructions and static information are stored in the PyCodeObject after compilation, does this mean that the virtual machine performs all operations directly on the PyCodeObject?
Clearly not, because although the PyCodeObject contains key bytecode instructions and static information, it lacks one critical aspect: the runtime execution environment. In Python, this execution environment is the stack frame.
Stack Frames: The Virtual Machine’s Execution Environment
What is a stack frame? Let’s consider an example:
    name = "Koishi Komeiji"

    def some_func():
        name = "Eirin Yagokoro"
        print(name)

    some_func()
    print(name)
In this code, there are two print(name) statements. Although the two statements are textually identical, their outcomes are clearly different. The difference stems from the different execution environments: since the environments differ, so does the value of name.
Thus, the same symbol can point to different values or types in different environments, and this dynamic information must be captured and managed at runtime. Such information cannot be statically stored in the PyCodeObject.
Therefore, the virtual machine doesn't perform operations directly on the PyCodeObject but rather on the stack frame object. When the virtual machine executes code, it dynamically creates a stack frame object based on the PyCodeObject and then executes the bytecode within the stack frame. The stack frame thus acts as the execution context for the virtual machine, storing all the information needed during execution.
Here’s a rough flow of the above code:
A stack frame (let's call it A) is created based on the module's PyCodeObject. All the bytecode executes within this stack frame. The virtual machine can retrieve or modify variables from within this stack frame.
When a function is called (here, some_func), the virtual machine creates a new stack frame (call it B) on top of frame A for the function, and it executes the bytecode for some_func within frame B.
There is also a variable named name in frame B, but because the execution environment (i.e., the stack frame) is different, name refers to a different object.
Once the bytecode for some_func is completely executed, frame B is destroyed (or can be preserved), and control returns to the caller's stack frame, A. Similar to recursion, every time a function is called, a new stack frame is created on top of the current one, and these frames are returned to one by one after execution.
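The relationship between one PyCodeObject and the many frames built from it can be observed from Python. The following is a minimal sketch, assuming a hypothetical recursive function countdown that records the frame of every call; it is an illustration, not part of the interpreter.

```python
import inspect

def countdown(n, frames):
    # Record the frame that is executing this particular call.
    frames.append(inspect.currentframe())
    if n > 0:
        countdown(n - 1, frames)

frames = []
countdown(2, frames)

# Three calls produced three distinct frame objects ...
print(len({id(f) for f in frames}))                          # 3
# ... all built from the same code object ...
print(all(f.f_code is countdown.__code__ for f in frames))   # True
# ... and each inner frame links back to the frame of its caller.
print(frames[2].f_back is frames[1])                         # True
print(frames[1].f_back is frames[0])                         # True
# Every frame carries its own value of n.
print([f.f_locals["n"] for f in frames])                     # [2, 1, 0]
```

Keeping the frames in a list is what preserves them after the calls return; without that reference each frame would be released as soon as its call finished.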
Python Virtual Machine and the Operating System
It's easy to see that the process by which the Python virtual machine executes bytecode mirrors how an operating system runs an executable file. Consider the following analogies:
Program Loading
Operating System: Loads the executable into memory and sets the program counter.
Python Virtual Machine: Loads the PyCodeObject from the .pyc file and initializes the bytecode instruction pointer.
Memory Management
Operating System: Allocates memory for the process, manages heap and stack.
Python Virtual Machine: Creates and manages Python objects, handles memory allocation, and garbage collection.
Instruction Execution
Operating System: The CPU executes machine instructions one by one.
Python Virtual Machine: The virtual machine executes bytecode instructions one by one.
Resource Management
Operating System: Manages file handles, network connections, and other system resources.
Python Virtual Machine: Manages file objects, sockets, and other Python-level resources.
Exception Handling
Operating System: Handles hardware interrupts and software exceptions.
Python Virtual Machine: Captures and handles Python exceptions.
Let's use a diagram to illustrate how an executable runs on a typical x64 machine, focusing primarily on the changes in stack frames. Assume there are three functions, where function f calls g, and g calls h.
Key CPU Registers
Two critical CPU registers play a key role in function calls and stack frame management:
RSP (Stack Pointer): Points to the top of the current stack frame or the last element pushed onto the stack. As elements are pushed onto or popped off the stack, the RSP adjusts accordingly. Since memory addresses decrease from the stack base to the stack top, the RSP decreases as data is pushed onto the stack and increases as data is popped off. Regardless, it always points to the top of the stack.
RBP (Base Pointer): Points to the base of the current stack frame. Its purpose is to provide a fixed reference point to access local variables and parameters of the current function. When a new frame is created, the base address of the previous frame is saved at the base of the new frame, and the RBP is updated to point to this new base.
C Code Example
    #include <stdio.h>

    int add(int a, int b) {
        int c = a + b;
        return c;
    }

    int main() {
        int a = 11;
        int b = 22;
        int result = add(a, b);
        printf("a + b = %d\n", result);
    }
When executing the function add, the current frame is clearly the stack frame for add, and the caller's frame (the previous stack frame) is the frame for main.
A stack is a last-in-first-out data structure, with memory addresses decreasing as you move from the base to the top. For each function, all local variable operations occur within its stack frame, and a new stack frame is created when a function is called.
When the main function is running, the RSP points to the top of the main stack frame, and the RBP points to the base of the main stack frame. When add is called, the system creates a stack frame for add on top of the main stack frame (at lower addresses, since the stack grows downward), moving the RSP to the top of the add stack frame and the RBP to its base. The base of the add stack frame stores the base of the previous stack frame (main).
When the add function finishes executing, the corresponding stack frame is destroyed, and the RSP and RBP are restored to their values before the add stack frame was created. This returns the execution flow and the runtime space to the main function's stack frame.
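The save-and-restore behaviour of the two registers can be mimicked with a small simulation. The sketch below is a toy model only: the class ToyStack, its slot-based memory, and the enter/leave methods are hypothetical names, and it ignores return addresses, argument passing, and real x64 word sizes.

```python
class ToyStack:
    """Toy model of a downward-growing call stack (not real x64 semantics)."""

    def __init__(self, size=64):
        self.mem = [0] * size
        self.rsp = size  # empty stack: one past the highest slot
        self.rbp = size

    def push(self, value):
        self.rsp -= 1               # pushing moves the top toward lower addresses
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 1               # popping moves the top back up
        return value

    def enter(self, n_locals):
        self.push(self.rbp)         # save the caller's base in the new frame
        self.rbp = self.rsp         # the new frame's base
        self.rsp -= n_locals        # reserve slots for local variables

    def leave(self):
        self.rsp = self.rbp         # discard the locals
        self.rbp = self.pop()       # restore the caller's base

stack = ToyStack()
stack.enter(3)                      # frame for main: a, b, result
main_rsp, main_rbp = stack.rsp, stack.rbp
stack.enter(1)                      # frame for add: c
add_rbp = stack.rbp

print(stack.mem[add_rbp] == main_rbp)             # True: add's base slot holds main's base
print(add_rbp < main_rbp)                         # True: the new frame sits at lower addresses
stack.leave()
print((stack.rsp, stack.rbp) == (main_rsp, main_rbp))  # True: both registers restored
```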
How Python's Stack Frames Work
This is similar to how executable files run on x64 machines. But how do stack frames work in Python?
The Underlying Structure of Stack Frames
Compared to the simple stack frames seen in x64 machines, Python’s stack frames contain much more information. Notably, a stack frame is also an object.
    // Include/pytypedefs.h
    typedef struct _frame PyFrameObject;

    // Include/internal/pycore_frame.h
    struct _frame {
        PyObject_HEAD
        PyFrameObject *f_back;
        struct _PyInterpreterFrame *f_frame;
        PyObject *f_trace;
        int f_lineno;
        char f_trace_lines;
        char f_trace_opcodes;
        char f_fast_as_locals;
    };

    typedef struct _PyInterpreterFrame {
        PyCodeObject *f_code;
        struct _PyInterpreterFrame *previous;
        PyObject *f_funcobj;
        PyObject *f_globals;
        PyObject *f_builtins;
        PyObject *f_locals;
        PyFrameObject *frame_obj;
        _Py_CODEUNIT *prev_instr;
        int stacktop;
        uint16_t return_offset;
        char owner;
        PyObject *localsplus[1];
    } _PyInterpreterFrame;
Before Python 3.11, stack frames were represented by the PyFrameObject structure, which contained all the fields for managing execution. However, many fields were often unused, such as those for debugging. These unused fields led to memory waste because memory was allocated for every field whenever a stack frame was created.
From Python 3.11 onwards, the core fields of PyFrameObject were extracted to form a lighter structure, _PyInterpreterFrame, which reduces memory usage and improves performance.
_PyInterpreterFrame: This is the core structure of the stack frame, a lightweight C structure containing only the essential information needed for execution. The virtual machine uses it internally.
PyFrameObject: This is the complete stack frame object, used when more comprehensive frame information is needed. For example, when retrieving stack frames from Python code, the object corresponds to the PyFrameObject structure underneath.
This separation allows the virtual machine to use the lightweight _PyInterpreterFrame in most cases, only creating the full PyFrameObject when complete frame information is required.
It's important to note that _PyInterpreterFrame does not begin with PyObject_HEAD, so it is not a Python object itself; it only contains the core structure of the stack frame. The actual stack frame object is still PyFrameObject. However, from the virtual machine's perspective, many tasks can be accomplished by just using the _PyInterpreterFrame structure.
Additionally, aside from being lightweight and compact, _PyInterpreterFrame is also highly CPU cache-friendly.
We know that Python objects are allocated on the heap, and stack frames are no exception. When calling nested functions, these stack frame objects are scattered in different locations in the heap, which is not ideal for cache efficiency. However, _PyInterpreterFrame is different: the virtual machine introduces a special stack, a pre-allocated memory area specifically for storing _PyInterpreterFrame instances.
When a _PyInterpreterFrame instance needs to be created, the virtual machine simply adjusts the stack pointer, and the memory is ready. When the instance needs to be destroyed, it is popped off the top of the stack without explicit memory release. Since these frames are tightly packed together, they are much more cache-friendly.
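The "adjust the stack pointer" idea can be pictured as a bump allocator over one pre-allocated buffer. The sketch below is a toy model under that assumption: FrameArena, push, and pop are hypothetical names, the frame sizes are made up, and it does not reproduce how CPython actually lays out or grows its frame stack.

```python
class FrameArena:
    """Toy model of a pre-allocated region used as a stack of frames."""

    def __init__(self, capacity):
        self.memory = bytearray(capacity)  # allocated once, up front
        self.top = 0                       # offset of the first free byte

    def push(self, size):
        """Reserve `size` bytes for a new frame by bumping the top offset."""
        if self.top + size > len(self.memory):
            raise MemoryError("arena exhausted")
        start = self.top
        self.top += size
        return start

    def pop(self, start):
        """Release the most recent frame by moving the top offset back."""
        self.top = start

arena = FrameArena(1024)
f_frame = arena.push(96)   # frame for f
g_frame = arena.push(64)   # frame for g, directly after f
h_frame = arena.push(80)   # frame for h, directly after g
print(f_frame, g_frame, h_frame)  # 0 96 160

arena.pop(h_frame)         # h returns
arena.pop(g_frame)         # g returns
print(arena.top)           # 96
```

Neither push nor pop calls a general-purpose allocator, and consecutive frames occupy adjacent bytes, which is the property the text credits for the cache friendliness.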
Field Definitions and Code Demonstration
Let’s dive into the meanings of the fields in these two structures. But before explaining the fields, we first need to understand how to retrieve a stack frame object in Python.
    import inspect

    def foo():
        # Returns the current stack frame
        # This function actually calls sys._getframe(1)
        return inspect.currentframe()

    frame = foo()
    print(frame)
    """
    <frame at 0x100de0fc0, file '.../main.py', line 6, code foo>
    """
    print(type(frame))
    """
    <class 'frame'>
    """
We can see that the type of the stack frame is <class 'frame'>, just like how PyCodeObject is of type <class 'code'>. These classes are not bound to built-in names, so we cannot refer to them directly.
Similarly, Python functions are of type <class 'function'>, and modules are of type <class 'module'>. The interpreter does not expose these classes as built-in names either, so if we tried to use them directly, frame, code, function, and module would just be undefined variables. We can only access them indirectly, for example through type() or the standard types module.
Now, let's look at what each field in PyFrameObject represents.
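One indirect route is the standard library's types module, which binds these classes to importable names. A short sketch, reusing the foo helper from the example above:

```python
import inspect
import types

def foo():
    return inspect.currentframe()

frame = foo()
print(type(frame) is types.FrameType)         # True
print(type(foo.__code__) is types.CodeType)   # True
print(type(foo) is types.FunctionType)        # True
print(type(inspect) is types.ModuleType)      # True
```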
PyObject_HEAD
This is the header information for the object, so the stack frame is also an object.
PyFrameObject *f_back
This field points to the previous stack frame, i.e., the caller's stack frame. On x64 machines, the relationship between function calls is maintained through the RSP and RBP pointers. In the Python virtual machine, this relationship is maintained through the f_back field in stack frames.
    import inspect

    def foo():
        return inspect.currentframe()

    frame = foo()
    print(frame)
    """
    <frame at 0x100de0fc0, file '.../main.py', line 6, code foo>
    """
    # The previous stack frame of foo, which corresponds to the module's stack frame
    print(frame.f_back)
    """
    <frame at 0x100adde40, file '.../main.py', line 12, code <module>>
    """
    # The previous stack frame of the module, which is None
    print(frame.f_back.f_back)
    """
    None
    """
By traversing through stack frames, you can easily get the complete call chain of a function. We will demonstrate this shortly.
struct _PyInterpreterFrame *f_frame
This points to an instance of struct _PyInterpreterFrame, which contains the core structure of the stack frame.
PyObject *f_trace
This is the trace function, used for debugging.
int f_lineno
This returns the line number in the source code at the time the stack frame was captured.
    import inspect

    def foo():
        return inspect.currentframe()

    frame = foo()
    print(frame.f_lineno)  # 4
In the above code, the stack frame is captured at line 4, so the output is 4.
char f_trace_lines
This field indicates whether the trace function should be called for every line of code. When set to true (non-zero), the virtual machine will call the trace function each time a new line of code is executed. This allows debuggers to intervene when each line of code is executed, such as by setting breakpoints or inspecting variables.
char f_trace_opcodes
This field indicates whether the trace function should be called for each bytecode instruction. When set to true, the virtual machine will call the trace function before executing each bytecode instruction. This provides finer control, allowing for instruction-level debugging but with higher overhead.
So, we can see that f_trace_lines is line-level tracing, corresponding to each line of source code. It is typically used for regular debugging, such as setting breakpoints or stepping through code, and has relatively low overhead. f_trace_opcodes, on the other hand, is instruction-level tracing, corresponding to each bytecode instruction, and is used for deeper debugging, such as analyzing the execution of specific bytecode instructions, but it incurs a higher overhead.
    import sys

    def trace_lines(frame, event, arg):
        print(f"Line number: {frame.f_lineno}, File name: {frame.f_code.co_filename}")
        return trace_lines

    sys.settrace(trace_lines)
Setting a trace function is typically done using sys.settrace. Although it is not commonly used, it is good to be familiar with it.
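To see line-level tracing in action, the following sketch records the events delivered while one function runs. The names tracer, work, and events are hypothetical; the trace function ignores every frame except the one for work, and the previously installed trace function (if any) is restored afterwards.

```python
import sys

events = []

def tracer(frame, event, arg):
    # Only follow the hypothetical function `work`; returning None for
    # any other frame tells the interpreter not to trace it line by line.
    if frame.f_code.co_name != "work":
        return None
    events.append(event)
    return tracer

def work():
    a = 1
    b = 2
    return a + b

previous = sys.gettrace()
sys.settrace(tracer)
try:
    work()
finally:
    sys.settrace(previous)  # always put the old trace function back

print(events)  # ['call', 'line', 'line', 'line', 'return']
```

One "call" event arrives when the frame for work is entered, one "line" event for each of its three lines, and one "return" event when it finishes.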
char f_fast_as_locals
To understand this field, we’ll need to delve into further knowledge later. For now, it’s enough to know that Python functions store their local variables in an array for quick access, referred to as "fast locals."
However, sometimes we need a dictionary that contains all the local variables. In such cases, the locals function can be called, which copies the names and values of local variables into a dictionary as key-value pairs. The f_fast_as_locals field marks whether this copying process has occurred.
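Because the dictionary is a copy, writing to it does not change the fast local itself. A minimal sketch, assuming a hypothetical function demo:

```python
def demo():
    x = 1
    y = 2
    snapshot = locals()   # names and values copied into a dictionary
    snapshot["x"] = 99    # modifies the dictionary only
    return x, snapshot

value, snapshot = demo()
print(value)     # 1
print(snapshot)  # {'x': 99, 'y': 2}
```

The local variable x is still 1 after the dictionary entry was overwritten, which shows that the function reads x from its own storage rather than from the dictionary.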
Now, let's take a look at the fields inside the _PyInterpreterFrame structure. The core fields of the stack frame are housed in this structure.
PyCodeObject *f_code
The stack frame is built upon the PyCodeObject, so there is a field inside that points to the PyCodeObject.
    import inspect

    def e():
        f()

    def f():
        g()

    def g():
        h()

    def h():
        frame = inspect.currentframe()  # Get the current stack frame
        func_names = []
        # Loop until the frame is None, and add function names to the list
        while frame is not None:
            func_names.append(frame.f_code.co_name)
            frame = frame.f_back
        print(f"Function call chain: {' -> '.join(func_names[::-1])}")

    f()
The output will be:
Function call chain: <module> -> f -> g -> h
In the output, we see the entire function call chain, which is pretty interesting, right?
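The link between a frame and its code object can also be checked directly. A brief sketch, assuming the same kind of foo helper used earlier:

```python
import inspect

def foo():
    return inspect.currentframe()

frame = foo()
# The frame's f_code is the very code object attached to the function.
print(frame.f_code is foo.__code__)                    # True
print(frame.f_code.co_name)                            # foo
# The line recorded by the frame lies at or after the function's first line.
print(frame.f_code.co_firstlineno <= frame.f_lineno)   # True
```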
struct _PyInterpreterFrame *previous
This points to the previous struct _PyInterpreterFrame. This field is not exposed at the Python level.
PyObject *f_funcobj
This points to the corresponding function object. This field is also not exposed by the interpreter.
PyObject *f_globals
This points to the global namespace (a dictionary), which houses global variables. Yes, Python's global variables are stored in a dictionary, which you can retrieve by calling the globals function.
    # Equivalent to name = "Koishi Komeiji"
    globals()["name"] = "Koishi Komeiji"
    # Equivalent to print(name)
    print(globals()["name"])  # Koishi Komeiji

    def foo():
        import inspect
        return inspect.currentframe()

    frame = foo()
    # frame.f_globals also returns the global namespace
    print(frame.f_globals is globals())  # True
    # This is equivalent to creating a global variable age
    frame.f_globals["age"] = 18
    print(age)  # 18
We will later dedicate a separate section to explain namespaces in detail.
PyObject *f_locals
This points to the local namespace (a dictionary). However, unlike global variables, local variables do not exist in the local namespace but are statically stored in an array. Keep this in mind for now, and we’ll explain it in more detail later.
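Both views of a function's locals are reachable from Python: the ordered slot names live on the code object, while the frame's f_locals presents the same variables as a mapping. A minimal sketch, assuming a hypothetical function foo with three locals:

```python
import inspect

def foo():
    a = 1
    b = 2
    frame = inspect.currentframe()
    # dict(...) takes a plain-dictionary copy of the local namespace view.
    return dict(frame.f_locals), frame.f_code.co_varnames

namespace, slots = foo()
print(slots)                            # ('a', 'b', 'frame')
print(sorted(namespace))                # ['a', 'b', 'frame']
print(namespace["a"], namespace["b"])   # 1 2
```

The tuple co_varnames fixes the position of each local at compile time, which is what allows the values to be kept in an array instead of being looked up by name.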
PyObject *f_builtins
This points to the built-in namespace (a dictionary) where all the built-in variables are stored.
    def foo():
        import inspect
        return inspect.currentframe()

    frame = foo()
    print(frame.f_builtins["list"]("abcd"))  # Output: ['a', 'b', 'c', 'd']
This is equivalent to using list("abcd") directly.
PyFrameObject *frame_obj
This points to the PyFrameObject. It doesn't need much explanation as it directly references the frame object.
_Py_CODEUNIT *prev_instr
This points to the last bytecode instruction that was executed. For instance, if the virtual machine is about to execute the nth instruction, prev_instr points to the (n-1)th instruction. Each instruction pairs a one-byte opcode with a one-byte argument, so the size of _Py_CODEUNIT is 2 bytes.
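The two-byte unit is visible from Python through the raw bytecode and the dis module. A short sketch, assuming a hypothetical function add and a Python version that uses this two-byte format:

```python
import dis

def add(a, b):
    return a + b

code = add.__code__
# The raw bytecode is a whole number of two-byte units.
print(len(code.co_code) % 2)  # 0

# Every instruction therefore starts at an even byte offset.
offsets = [ins.offset for ins in dis.get_instructions(add)]
print(all(offset % 2 == 0 for offset in offsets))  # True
```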
int stacktop
This indicates the offset of the stack top relative to the localsplus array.
uint16_t return_offset
This determines where execution should resume in the caller when the called function reaches a RETURN instruction, expressed as an offset relative to prev_instr. This value is only meaningful for called functions. It is set during CALL instructions (when a function is called) and SEND instructions (when data is sent to coroutines or generators).
This design allows for more efficient function return handling, as the virtual machine can directly jump to the correct position without additional lookups or calculations.
    def main():
        x = some_func()  # CALL instruction happens here
        y = x + 1  # The next instruction after the function returns

    def some_func():
        return 42
When calling some_func, the virtual machine executes the CALL instruction, and the return_offset is set. After executing the RETURN instruction in some_func, it uses return_offset to determine where to jump back to in the caller (the main function).
The advantage of this mechanism is that it eliminates the need to calculate the return position at runtime since it’s pre-calculated during the call, which is especially useful for handling complex control flows like generators and coroutines.
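The return_offset field itself is not visible from Python, but the call instruction and the instruction that runs once the callee returns can be located with the dis module. The sketch below is only an illustration of that resume point; exact opcode names vary between Python versions, so it matches on name prefixes rather than exact names.

```python
import dis

def some_func():
    return 42

def main():
    x = some_func()
    y = x + 1

names = [ins.opname for ins in dis.get_instructions(main)]
# Find the first instruction whose name starts with CALL.
call_index = next(i for i, name in enumerate(names) if name.startswith("CALL"))
# The instruction after it is where main continues once some_func returns:
# it stores the returned value into x.
print(names[call_index], "->", names[call_index + 1])
```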
char owner
This indicates the ownership of the frame, used to distinguish whether the frame is on the virtual machine stack or separately allocated.
PyObject *localsplus[1]
This is a flexible array that maintains "local variables + cell variables + free variables + runtime stack." Its size is determined at runtime.
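The ingredients that determine this layout are recorded on the code object at compile time. The sketch below uses a hypothetical closure pair, outer and inner, to show the names that fall into each category; how the interpreter arranges them inside localsplus is not shown.

```python
def outer():
    captured = 10                  # referenced by inner, so it is a cell variable of outer
    def inner(step):
        total = captured + step    # captured is a free variable of inner
        return total
    return inner

inner = outer()
outer_code, inner_code = outer.__code__, inner.__code__

print(outer_code.co_varnames, outer_code.co_cellvars)  # ('inner',) ('captured',)
print(inner_code.co_varnames, inner_code.co_freevars)  # ('step', 'total') ('captured',)
# co_stacksize is the maximum depth the runtime stack needs for this code.
print(inner_code.co_stacksize >= 1)                    # True
```

Since all of these counts are known once the code is compiled, the space a frame needs can be computed before the function starts running.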
The above fields outline the internals of a stack frame. You can take note of them for now, as we’ll dive deeper into their workings when we explore the virtual machine in more detail.
In summary, we can see that the PyCodeObject is not the ultimate target of the virtual machine. The virtual machine executes in the context of stack frames. Each stack frame holds a reference to a PyCodeObject, and the same PyCodeObject can back many stack frames (for example, one per recursive call). Additionally, from the f_back field, we can see that during execution, the virtual machine creates multiple stack frame objects that are linked together, forming a chain of execution contexts, or a stack frame chain.
This simulates the relationship between stack frames on x64 machines, where stack frames are connected via the RSP and RBP pointers, allowing the new stack frame to return to the old stack frame after completion. In Python's virtual machine, the f_back field is used to accomplish this.
Moreover, in addition to retrieving the stack frame via the inspect module, it's also possible to get the stack frame when catching exceptions.
    def foo():
        try:
            1 / 0
        except ZeroDivisionError:
            import sys
            # exc_info returns a tuple: the exception type, value, and traceback
            exc_type, exc_value, exc_tb = sys.exc_info()
            print(exc_type)  # <class 'ZeroDivisionError'>
            print(exc_value)  # division by zero
            print(exc_tb)  # <traceback object>
            # Use exc_tb.tb_frame to get the stack frame associated with the exception
            # Alternatively, you can use this method: except ZeroDivisionError as e; e.__traceback__
            print(exc_tb.tb_frame.f_code.co_name)  # foo
            print(exc_tb.tb_frame.f_back.f_code.co_name)  # <module>
            # tb_frame is the stack frame for the current function foo
            # tb_frame.f_back is the stack frame for the entire module
            # And tb_frame.f_back.f_back is obviously None
            print(exc_tb.tb_frame.f_back.f_back)  # None

    foo()
This completes our explanation of the meanings of the stack frame's internal fields. If you don’t fully understand some of these fields now, that’s okay — as you continue learning, it will all become clear.
Summary
Since much of the dynamic information cannot be statically stored in the PyCodeObject, after it's handed over to the virtual machine, the virtual machine dynamically constructs a PyFrameObject (stack frame) on top of it.
Thus, the virtual machine executes bytecode within the stack frame, which contains all the information the virtual machine needs to execute the bytecode.