When performing pagination in database queries, it's essential to return a token that can be used directly to locate the first data item of the next page. It is not recommended to simply use the LIMIT
and OFFSET
as a page token, as this method can lead to inefficient queries with large datasets.
Overview
The core idea is to improve the efficiency of pagination queries on large datasets. When dealing with vast amounts of data, using the LIMIT
and OFFSET
clauses can cause performance issues. This is because the database must skip the number of rows specified by the OFFSET
to retrieve the required data. As the OFFSET
value increases, the query performance can degrade.
To solve this problem, a cursor-based or "Seek Method" pagination approach can be used. Instead of skipping rows, this method tracks the unique key of the last record from the previous page (usually a primary key or unique index) and queries for the next set of sorted records.
Here is a simple example assuming we have a users
table with an auto-incremented id
field as the primary key.
First, we need an API that doesn't accept a page number but instead accepts a last_seen_id
(the ID of the last user from the previous page) as a parameter to fetch the next page of data.
Code Example
Below is an example using Flask to create a pagination API. For simplicity, database connection details are omitted:
from flask import Flask, request, jsonify app = Flask(__name__) # Assuming we have a database query function def query_db(query, args=(), one=False): # This should be the code to execute the query using libraries like pymysql or sqlalchemy # This is just for illustration, actual implementation is omitted pass # Pagination API @app.route('/users', methods=['GET']) def get_users(): # Default page size and maximum limit default_page_size = 10 max_page_size = 100 # Get last_seen_id and page_size from the request last_seen_id = request.args.get('last_seen_id', None) page_size = int(request.args.get('page_size', default_page_size)) # Ensure page_size doesn't exceed the maximum limit if page_size > max_page_size: page_size = max_page_size # Create query using last_seen_id to locate the next page of data if last_seen_id: query = "SELECT * FROM users WHERE id > %s ORDER BY id ASC LIMIT %s" args = (last_seen_id, page_size) else: # If last_seen_id is not provided, return the first page query = "SELECT * FROM users ORDER BY id ASC LIMIT %s" args = (page_size,) # Execute query users = query_db(query, args) # Convert query result to list of dictionaries # Assuming each user is a tuple (id, name) users_list = [{'id': user[0], 'name': user[1]} for user in users] last_seen_id = users[-1][0] if users else None # Return JSON response return jsonify({ 'users': users_list, 'last_seen_id': last_seen_id }) # Start the Flask app if __name__ == '__main__': app.run(debug=True)
With this API, the client can specify the last_seen_id
parameter to get the next page of data. For example, /users?last_seen_id=100&page_size=20
will fetch the next 20 user records with IDs greater than 100.
Benefits of This Method
The advantage of this approach is that, regardless of whether you are fetching the first page or the one-millionth page, the query efficiency remains the same. This is because the database always starts querying from the last ID of the previous page, without needing to skip any rows. This significantly improves query performance, especially when handling large datasets.