Deep Dive into the Execution Environment of Python's Virtual Machine: Stack Frame Objects


Since much of the dynamic information cannot be statically stored in the PyCodeObject, once a PyCodeObject is handed over to the virtual machine, it dynamically constructs a PyFrameObject, which is the stack frame.

Prelude

From now on, we will delve into the principles behind how the virtual machine executes bytecode. As mentioned before, the Python interpreter can be divided into two parts: the Python compiler and the Python virtual machine.

Once the compiler compiles the source code into a PyCodeObject, the virtual machine takes over. The virtual machine reads the bytecode from the PyCodeObject and executes it in the current context until all the bytecode instructions are completed.

Now, here's a question: since the bytecode instructions and static information are stored in the PyCodeObject after compilation, does this mean that the virtual machine performs all operations directly on the PyCodeObject?

Clearly not, because although the PyCodeObject contains key bytecode instructions and static information, it lacks one critical aspect: the runtime execution environment. In Python, this execution environment is the stack frame.

Stack Frames: The Virtual Machine’s Execution Environment

What is a stack frame? Let’s consider an example:

name = "Koishi Komeiji"

def some_func():
    name = "Eirin Yagokoro"
    print(name)

some_func()
print(name)

In this code, there are two print(name) statements. Although they are identical in the source, their outcomes are clearly different. The difference stems from the execution environments: since the environments differ, so does the value of name.

Thus, the same symbol can point to different values or types in different environments, and this dynamic information must be captured and managed at runtime. Such information cannot be statically stored in the PyCodeObject.

Therefore, the virtual machine doesn’t perform operations directly on the PyCodeObject but rather on the stack frame object. When the virtual machine executes code, it dynamically creates a stack frame object based on the PyCodeObject and then executes the bytecode within the stack frame. The stack frame thus acts as the execution context for the virtual machine, storing all the information needed during execution.
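To make "one PyCodeObject, many stack frames" concrete, here is a minimal sketch. The recursive helper countdown is a hypothetical name introduced only for this illustration; it keeps every frame it runs in, so we can check afterwards that the frames are distinct objects that all point at the same code object.

```python
import sys

def countdown(n):
    # Each call runs in its own frame, even though the code object is shared.
    frame = sys._getframe()
    if n == 0:
        return [frame]
    return [frame] + countdown(n - 1)

frames = countdown(2)

# All three frames were built from the same PyCodeObject ...
same_code = all(f.f_code is countdown.__code__ for f in frames)
# ... but they are three separate frame objects,
distinct_frames = len({id(f) for f in frames})
# each holding its own value for the local variable n.
values_of_n = [f.f_locals["n"] for f in frames]

print(same_code, distinct_frames, values_of_n)
```

The code object supplies the static part (the instructions), while each frame supplies the dynamic part (the current value of n).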

Here’s a rough flow of the above code:

  1. A stack frame (let’s call it A) is created based on the module’s PyCodeObject. All the bytecode executes within this stack frame. The virtual machine can retrieve or modify variables from within this stack frame.

  2. When a function is called (here, some_func), the virtual machine creates a new stack frame (call it B) on top of frame A for the function, and it executes the bytecode for some_func within frame B.

  3. There is also a variable named name in frame B, but because the execution environment (i.e., the stack frame) is different, name refers to a different object.

  4. Once the bytecode for some_func is completely executed, frame B is destroyed (or can be preserved), and control returns to the caller’s stack frame, A. Similar to recursion, every time a function is called, a new stack frame is created on top of the current one, and these frames are returned to one by one after execution.
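The flow above can be observed from Python itself. In this sketch, frame A is played by a hypothetical wrapper function called caller rather than by the module, purely so that the snippet behaves the same wherever it is run; the relationship between the two frames is the same.

```python
import sys

def some_func():
    name = "Eirin Yagokoro"
    frame_b = sys._getframe()   # frame B: the frame running some_func
    frame_a = frame_b.f_back    # frame A: the caller's frame
    return (
        frame_b.f_locals["name"],   # name as seen in frame B
        frame_a.f_locals["name"],   # name as seen in frame A
        frame_a.f_code.co_name,     # which code object frame A is running
    )

def caller():
    name = "Koishi Komeiji"
    return some_func()

inner_value, outer_value, caller_name = caller()
print(inner_value, outer_value, caller_name)
```

The same symbol, name, resolves to a different object in each frame, and frame B reaches frame A through f_back.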

Python Virtual Machine and the Operating System

It's easy to see that the process by which the Python virtual machine executes bytecode mirrors how an operating system runs an executable file. Consider the following analogies:

  • Program Loading

    • Operating System: Loads the executable into memory and sets the program counter.

    • Python Virtual Machine: Loads the PyCodeObject from the .pyc file and initializes the bytecode instruction pointer.

  • Memory Management

    • Operating System: Allocates memory for the process, manages heap and stack.

    • Python Virtual Machine: Creates and manages Python objects, handles memory allocation, and garbage collection.

  • Instruction Execution

    • Operating System: The CPU executes machine instructions one by one.

    • Python Virtual Machine: The virtual machine executes bytecode instructions one by one.

  • Resource Management

    • Operating System: Manages file handles, network connections, and other system resources.

    • Python Virtual Machine: Manages file objects, sockets, and other Python-level resources.

  • Exception Handling

    • Operating System: Handles hardware interrupts and software exceptions.

    • Python Virtual Machine: Captures and handles Python exceptions.

Let’s look at how an executable runs on a typical x64 machine, focusing primarily on the changes in stack frames. Picture three functions, where function f calls g, and g calls h: each call pushes a new frame on top of the caller's frame.


Key CPU Registers

Two critical CPU registers play a key role in function calls and stack frame management:

  • RSP (Stack Pointer): Points to the top of the current stack frame or the last element pushed onto the stack. As elements are pushed onto or popped off the stack, the RSP adjusts accordingly. Since memory addresses decrease from the stack base to the stack top, the RSP decreases as data is pushed onto the stack and increases as data is popped off. Regardless, it always points to the top of the stack.

  • RBP (Base Pointer): Points to the base of the current stack frame. Its purpose is to provide a fixed reference point for accessing the local variables and parameters of the current function. When a new frame is created, the old RBP value (the base of the caller's frame) is saved at the base of the new frame, and the RBP is updated to point to this new base.

C Code Example

#include <stdio.h>

int add(int a, int b) {
    int c = a + b;
    return c;
}

int main() {
    int a = 11;
    int b = 22;
    int result = add(a, b);
    printf("a + b = %d\n", result);
    return 0;
}

When executing the function add, the current frame is clearly the stack frame for add, and the caller’s frame (the previous stack frame) is the frame for main.

A stack is a last-in-first-out data structure, with memory addresses decreasing as you move from the base to the top. For each function, all local variable operations occur within its stack frame, and a new stack frame is created when a function is called.

When the main function is running, the RSP points to the top of the main stack frame, and the RBP points to the base of the main stack frame. When add is called, the system creates a stack frame for add on top of the main stack frame (that is, at lower addresses, since the stack grows downward), moving the RSP to the top of the add stack frame and the RBP to its base. The base of the add stack frame stores the base of the previous stack frame (main).

When the add function finishes executing, the corresponding stack frame is destroyed, and the RSP and RBP are restored to their values before the add stack frame was created. This returns the execution flow and the runtime space to the main function’s stack frame.

How Python's Stack Frames Work

This is similar to how executable files run on x64 machines. But how do stack frames work in Python?

The Underlying Structure of Stack Frames

Compared to the simple stack frames seen in x64 machines, Python’s stack frames contain much more information. Notably, a stack frame is also an object.

// Include/pytypedefs.h
typedef struct _frame PyFrameObject;

// Include/internal/pycore_frame.h
struct _frame {
    PyObject_HEAD
    PyFrameObject *f_back;     
    struct _PyInterpreterFrame *f_frame; 
    PyObject *f_trace;          
    int f_lineno;               
    char f_trace_lines;         
    char f_trace_opcodes;       
    char f_fast_as_locals;      
};

typedef struct _PyInterpreterFrame {
    PyCodeObject *f_code; 
    struct _PyInterpreterFrame *previous;
    PyObject *f_funcobj; 
    PyObject *f_globals; 
    PyObject *f_builtins; 
    PyObject *f_locals;
    PyFrameObject *frame_obj;
    _Py_CODEUNIT *prev_instr;
    int stacktop;
    uint16_t return_offset;
    char owner;
    PyObject *localsplus[1];
} _PyInterpreterFrame;

Before Python 3.11, stack frames were represented by the PyFrameObject structure, which contained all the fields for managing execution. However, many fields were often unused, such as those for debugging. These unused fields led to memory waste because memory was allocated for every field whenever a stack frame was created.

From Python 3.11 onwards, the core fields of PyFrameObject were extracted to form a lighter structure, _PyInterpreterFrame, which reduces memory usage and improves performance.

  • _PyInterpreterFrame: This is the core structure of the stack frame, a lightweight C structure containing only the essential information needed for execution. The virtual machine uses it internally.

  • PyFrameObject: This is the complete stack frame object, used when more comprehensive frame information is needed. For example, when retrieving stack frames from Python code, the object corresponds to the PyFrameObject structure underneath.

This separation allows the virtual machine to use the lightweight _PyInterpreterFrame in most cases, only creating the full PyFrameObject when complete frame information is required.

It’s important to note that _PyInterpreterFrame does not contain PyObject_HEAD, so it is not a Python object itself. It only holds the core structure of the stack frame. The actual stack frame object is still PyFrameObject. However, from the virtual machine's perspective, many tasks can be accomplished by just using the _PyInterpreterFrame structure.

Additionally, aside from being lightweight and compact, _PyInterpreterFrame is also highly CPU cache-friendly.

We know that Python objects are allocated on the heap, and stack frames are no exception. When calling nested functions, these stack frame objects are scattered in different locations in the heap, which is not ideal for cache efficiency. However, _PyInterpreterFrame is different—the virtual machine introduces a special stack, a pre-allocated memory area specifically for storing _PyInterpreterFrame instances.

When a _PyInterpreterFrame instance needs to be created, it simply adjusts the stack pointer, and the memory is ready. When it needs to be destroyed, it is popped off the top of the stack without explicit memory release. Since these frames are tightly packed together, they are much more cache-friendly.

Field Definitions and Code Demonstration

Let’s dive into the meanings of the fields in these two structures. But before explaining the fields, we first need to understand how to retrieve a stack frame object in Python.

import inspect

def foo():
    # Returns the current stack frame
    # This function actually calls sys._getframe(1)
    return inspect.currentframe()

frame = foo()
print(frame) 
"""
<frame at 0x100de0fc0, file '.../main.py', line 6, code foo>
"""
print(type(frame)) 
"""
<class 'frame'>
"""

We can see that the type of the stack frame is <class 'frame'>, just like how PyCodeObject is of type <class 'code'>. These classes are not directly exposed to us, so we cannot use them directly.

Similarly, Python functions are of type <class 'function'>, and modules are of type <class 'module'>. The interpreter does not expose these classes as built-in names either, so if we tried to use them directly, frame, code, function, and module would just be undefined variables. We can only access them indirectly, for example through type() or the standard types module.
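One indirect route is the standard library's types module, which publishes these classes under names such as FrameType and CodeType. The sketch below only checks identities; the helper foo mirrors the earlier example, and the variable creation_error is introduced here just to record what happens when we try to build a frame by hand.

```python
import sys
import types

def foo():
    return sys._getframe()

frame = foo()

# The hidden classes are reachable through the types module.
print(type(frame) is types.FrameType)        # True
print(type(frame.f_code) is types.CodeType)  # True
print(type(foo) is types.FunctionType)       # True
print(type(sys) is types.ModuleType)         # True

# Having the class does not mean we can instantiate it:
# frames are created only by the virtual machine.
creation_error = None
try:
    types.FrameType()
except TypeError as exc:
    creation_error = exc
print(creation_error)
```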

Now, let’s look at what each field in PyFrameObject represents.

PyObject_HEAD

This is the header information for the object, so the stack frame is also an object.

PyFrameObject *f_back

This field points to the previous stack frame, i.e., the caller’s stack frame. On x64 machines, the relationship between function calls is maintained through the RSP and RBP pointers. In the Python virtual machine, this relationship is maintained through the f_back field in stack frames.

import inspect

def foo():
    return inspect.currentframe()

frame = foo()
print(frame)
"""
<frame at 0x100de0fc0, file '.../main.py', line 4, code foo>
"""
# The previous stack frame of foo, which corresponds to the module’s stack frame
print(frame.f_back)
"""
<frame at 0x100adde40, file '.../main.py', line 12, code <module>>
"""
# The previous stack frame of the module, which is None
print(frame.f_back.f_back)
"""
None
"""

By traversing through stack frames, you can easily get the complete call chain of a function. We will demonstrate this shortly.


struct _PyInterpreterFrame *f_frame

This points to an instance of struct _PyInterpreterFrame, which contains the core structure of the stack frame.

PyObject *f_trace

This is the trace function, used for debugging.

int f_lineno

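This is the source line number associated with the frame. From Python, frame.f_lineno gives the line the frame is executing at the moment the attribute is read (or the last line it executed, if the frame has already finished).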

import inspect

def foo():
    return inspect.currentframe()

frame = foo()
print(frame.f_lineno)  # 4

In the above code, the stack frame is captured at line 4, so the output is 4.
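Note that f_lineno is not frozen when the frame is obtained: it follows execution. A small sketch (the function name two_readings is made up for this illustration) reads the attribute twice from inside the running frame, on consecutive lines.

```python
import sys

def two_readings():
    frame = sys._getframe()
    first = frame.f_lineno    # line number of this statement
    second = frame.f_lineno   # line number of this one, a line further down
    return first, second

first, second = two_readings()
print(second - first)  # 1
```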

char f_trace_lines

This field indicates whether the trace function should be called for every line of code. When set to true (non-zero), the virtual machine will call the trace function each time a new line of code is executed. This allows debuggers to intervene when each line of code is executed, such as by setting breakpoints or inspecting variables.

char f_trace_opcodes

This field indicates whether the trace function should be called for each bytecode instruction. When set to true, the virtual machine will call the trace function before executing each bytecode instruction. This provides finer control, allowing for instruction-level debugging but with higher overhead.

So, we can see that f_trace_lines is line-level tracing, corresponding to each line of source code. It is typically used for regular debugging, such as setting breakpoints or stepping through code, and has relatively low overhead. f_trace_opcodes, on the other hand, is instruction-level tracing, corresponding to each bytecode instruction, and is used for deeper debugging, such as analyzing the execution of specific bytecode instructions, but it incurs a higher overhead.

import sys

def trace_lines(frame, event, arg):
    print(f"Line number: {frame.f_lineno}, File name: {frame.f_code.co_filename}")
    return trace_lines

sys.settrace(trace_lines)

A trace function is installed with sys.settrace. It is rarely needed in everyday code, but debuggers and coverage tools are built on it, so it is worth being familiar with.
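The snippet above installs a hook but does not run anything under it. Here is a minimal, self-contained sketch that traces one call and records the events it receives. The names run_traced and tracer are illustrative, and line numbers are reported relative to the def line of add so the result does not depend on where the code sits in a file.

```python
import sys

def add(a, b):
    c = a + b
    return c

def run_traced():
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is not add.__code__:
            return None   # do not trace any frame other than add's
        relative_line = frame.f_lineno - add.__code__.co_firstlineno
        events.append((event, relative_line))
        return tracer     # keep receiving events for this frame

    sys.settrace(tracer)
    try:
        add(11, 22)
    finally:
        sys.settrace(None)  # always uninstall the hook
    return events

events = run_traced()
print(events)
# [('call', 0), ('line', 1), ('line', 2), ('return', 2)]
```

The two 'line' events are the line-level tracing that f_trace_lines controls. Setting frame.f_trace_opcodes to a true value inside the trace function requests additional 'opcode' events on top of these.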

char f_fast_as_locals

To understand this field, we’ll need to delve into further knowledge later. For now, it’s enough to know that Python functions store their local variables in an array for quick access, referred to as "fast locals."

However, sometimes we need a dictionary that contains all the local variables. In such cases, the locals function can be called, which copies the names and values of local variables into a dictionary as key-value pairs. The f_fast_as_locals field marks whether this copying process has occurred.
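A short sketch of this copying behaviour, using a hypothetical function snapshot_demo: the dictionary returned by locals() reflects the fast locals that exist at the time of the call, and writing to that dictionary does not change the real local variable.

```python
def snapshot_demo():
    x = 1
    snap = locals()     # copies the fast locals that exist so far into a dict
    seen = dict(snap)   # remember what the dict looked like at this point
    snap["x"] = 99      # modify the dict, not the fast local itself
    return seen, x

seen, x_after = snapshot_demo()
print(seen)     # {'x': 1}
print(x_after)  # 1
```

The variable x lives in the frame's array of fast locals; the dictionary is only a view produced on request.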

Now, let’s take a look at the fields inside the _PyInterpreterFrame structure. The core fields of the stack frame are housed in this structure.

PyCodeObject *f_code

The stack frame is built upon the PyCodeObject, so there is a field inside that points to the PyCodeObject.

import inspect

def e():
    f()

def f():
    g()

def g():
    h()

def h():
    frame = inspect.currentframe()  # Get the current stack frame
    func_names = []
    # Loop until the frame is None, and add function names to the list
    while frame is not None:
        func_names.append(frame.f_code.co_name)
        frame = frame.f_back
    print(f"Function call chain: {' -> '.join(func_names[::-1])}")

f()

The output will be:

Function call chain: <module> -> f -> g -> h

In the output, we see the entire function call chain, which is pretty interesting, right?

struct _PyInterpreterFrame *previous

This points to the previous struct _PyInterpreterFrame. This field is not exposed at the Python level.

PyObject *f_funcobj

This points to the corresponding function object. This field is also not exposed by the interpreter.

PyObject *f_globals

This points to the global namespace (a dictionary), which houses global variables. Yes, Python’s global variables are stored in a dictionary, and you can retrieve this dictionary with the globals function.

# Equivalent to name = "Koishi Komeiji"
globals()["name"] = "Koishi Komeiji"

# Equivalent to print(name)
print(globals()["name"])  # Koishi Komeiji

def foo():
    import inspect
    return inspect.currentframe()

frame = foo()
# frame.f_globals also returns the global namespace
print(frame.f_globals is globals())  # True
# This is equivalent to creating a global variable age
frame.f_globals["age"] = 18
print(age)  # 18

We will later dedicate a separate section to explain namespaces in detail.


PyObject *f_locals

This points to the local namespace (a dictionary). However, unlike global variables, local variables do not exist in the local namespace but are statically stored in an array. Keep this in mind for now, and we’ll explain it in more detail later.
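The "array" part can be glimpsed through the code object: the compiler fixes the set of local variable names, and therefore their slot positions, ahead of time. The sketch below only inspects co_varnames and co_nlocals; the slots dictionary is built here for illustration and is not something the interpreter itself stores.

```python
def add(a, b):
    c = a + b
    return c

code = add.__code__
print(code.co_varnames)  # ('a', 'b', 'c')
print(code.co_nlocals)   # 3

# Illustration only: the position of each name is its slot index.
slots = {name: index for index, name in enumerate(code.co_varnames)}
print(slots)  # {'a': 0, 'b': 1, 'c': 2}
```

Because the positions are known at compile time, the virtual machine can reach a local variable by index instead of by a dictionary lookup.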

PyObject *f_builtins

This points to the built-in namespace (a dictionary) where all the built-in variables are stored.

def foo():
    import inspect
    return inspect.currentframe()

frame = foo()
print(frame.f_builtins["list"]("abcd"))
# Output: ['a', 'b', 'c', 'd']

This is equivalent to using list("abcd") directly.

PyFrameObject *frame_obj

This points to the PyFrameObject. It doesn’t need much explanation as it directly references the frame object.

_Py_CODEUNIT *prev_instr

This points to the last bytecode instruction that was executed. For instance, if the virtual machine is about to execute the nth instruction, prev_instr points to the (n-1)th instruction. Each instruction consists of a one-byte opcode paired with a one-byte argument, so the size of _Py_CODEUNIT is 2 bytes.
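prev_instr itself is a C pointer and cannot be read from Python, but the frame attribute f_lasti reports the byte offset of the last instruction the frame started executing, which gives a rough view of the same information. The sketch below (probe is a made-up helper) checks that both this offset and the total bytecode length are multiples of 2, in line with the 2-byte code unit.

```python
import sys

def probe():
    frame = sys._getframe()
    # Byte offset of the last instruction started, and total bytecode length.
    return frame.f_lasti, len(frame.f_code.co_code)

lasti, code_size = probe()
print(lasti % 2, code_size % 2)  # 0 0
print(0 <= lasti < code_size)    # True
```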

int stacktop

This indicates the offset of the stack top relative to the localsplus array.

uint16_t return_offset

This determines where execution should resume in the caller after a RETURN instruction, expressed as an offset relative to prev_instr. The value is only meaningful for called functions. It is set by CALL instructions (when a Python function is called) and SEND instructions (when a value is sent to a coroutine or generator).

This design allows for more efficient function return handling, as the virtual machine can directly jump to the correct position without additional lookups or calculations.

def main():
    x = some_func()  # CALL instruction happens here
    y = x + 1        # The next instruction after the function returns

def some_func():
    return 42

When calling some_func, the virtual machine executes the CALL instruction, and the return_offset is set. After executing the RETURN instruction in some_func, it uses return_offset to determine where to jump back to in the caller (the main function).

The advantage of this mechanism is that it eliminates the need to calculate the return position at runtime since it’s pre-calculated during the call, which is especially useful for handling complex control flows like generators and coroutines.
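return_offset is internal and cannot be inspected from Python code, so the following is only an indirect look at the call site. It uses the dis module to find the call instruction in main; the exact opcode name varies between Python versions, which is why the sketch matches on the "CALL" prefix rather than on one fixed name. main returns y here only so the result can be checked.

```python
import dis

def some_func():
    return 42

def main():
    x = some_func()  # a call instruction is emitted for this line
    y = x + 1        # execution resumes here after some_func returns
    return y

# Collect the call instructions compiled for main.
call_ops = [
    ins for ins in dis.get_instructions(main)
    if ins.opname.startswith("CALL")
]
for ins in call_ops:
    print(ins.offset, ins.opname)

print(main())  # 43
```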

char owner

This indicates the ownership of the frame, used to distinguish whether the frame is on the virtual machine stack or separately allocated.

PyObject *localsplus[1]

This is a flexible array that maintains "local variables + cell variables + free variables + runtime stack." Its size is determined at runtime.
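The pieces that make up localsplus are all described by the code object, which is how its size can be determined when the frame is created. A sketch with a closure (outer and inner are illustrative names) shows the local, cell, and free variable names, plus the stack depth the compiler computed.

```python
def outer(a):
    x = a * 2          # captured by inner, so x is a cell variable of outer
    def inner(b):
        return x + b   # x is a free variable inside inner
    return inner

outer_code = outer.__code__
inner_code = outer(1).__code__

print(outer_code.co_varnames, outer_code.co_cellvars)  # ('a', 'inner') ('x',)
print(inner_code.co_varnames, inner_code.co_freevars)  # ('b',) ('x',)

# Maximum runtime stack depth needed by each code object (version dependent).
print(outer_code.co_stacksize, inner_code.co_stacksize)
```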

The above fields outline the internals of a stack frame. You can take note of them for now, as we’ll dive deeper into their workings when we explore the virtual machine in more detail.

In summary, we can see that the PyCodeObject is not the ultimate target of the virtual machine. The virtual machine executes in the context of stack frames. Each stack frame holds a reference to the PyCodeObject it was built from, and one PyCodeObject can back many stack frames. Additionally, from the f_back field, we can see that during execution, the virtual machine creates multiple stack frame objects that are linked together, forming a chain of execution contexts, or a stack frame chain.

This simulates the relationship between stack frames on x64 machines, where stack frames are connected via the RSP and RBP pointers, allowing the new stack frame to return to the old stack frame after completion. In Python's virtual machine, the f_back field is used to accomplish this.

Moreover, in addition to retrieving the stack frame via the inspect module, it’s also possible to get the stack frame when catching exceptions.

def foo():
    try:
        1 / 0
    except ZeroDivisionError:
        import sys
        # exc_info returns a tuple: the exception type, value, and traceback
        exc_type, exc_value, exc_tb = sys.exc_info()
        print(exc_type)  # <class 'ZeroDivisionError'>
        print(exc_value)  # division by zero
        print(exc_tb)  # <traceback object>

        # Use exc_tb.tb_frame to get the stack frame associated with the exception
        # Alternatively, you can use this method: except ZeroDivisionError as e; e.__traceback__
        print(exc_tb.tb_frame.f_code.co_name)  # foo
        print(exc_tb.tb_frame.f_back.f_code.co_name)  # <module>
        # tb_frame is the stack frame for the current function foo
        # tb_frame.f_back is the stack frame for the entire module
        # And tb_frame.f_back.f_back is obviously None
        print(exc_tb.tb_frame.f_back.f_back)  # None

foo()
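The alternative mentioned in the comment, e.__traceback__, can be sketched as follows. A traceback is itself a linked list: tb_next leads one level deeper toward the place where the exception was raised, and each node carries the frame for that level in tb_frame. The function names here are hypothetical.

```python
def inner():
    return 1 / 0

def outer():
    return inner()

def collect_chain():
    try:
        outer()
    except ZeroDivisionError as exc:
        names = []
        tb = exc.__traceback__
        # Walk from the frame that caught the exception down to the one that raised it.
        while tb is not None:
            names.append(tb.tb_frame.f_code.co_name)
            tb = tb.tb_next
        return names

chain = collect_chain()
print(chain)  # ['collect_chain', 'outer', 'inner']
```

Note the direction: f_back walks toward the caller, while tb_next walks toward the callee.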

This completes our explanation of the meanings of the stack frame's internal fields. If you don’t fully understand some of these fields now, that’s okay — as you continue learning, it will all become clear.

Summary

Since much of the dynamic information cannot be statically stored in the PyCodeObject, after it’s handed over to the virtual machine, the virtual machine dynamically constructs a PyFrameObject (stack frame) on top of it.

Thus, the virtual machine executes bytecode within the stack frame, which contains all the information the virtual machine needs to execute the bytecode.