I'm testing some DynamoDB access code. In the past, incorrect handling of pagination has caused bugs: developers tend to test manually with small amounts of data, so it's easy to make incorrect assumptions about how pagination works that only come to light once realistic data volumes are being handled.
I've typically unit tested the access code using plain unittest and unittest.mock, and have tested pagination in this way, but I've ended up writing some reasonably complex test code to simulate pagination for different operations (scan, query, batch_get_item).
I'm looking for a simpler way of testing this; moto offers some hope. However, I don't really want to load 1MB+ of data into moto to induce pagination; I want to force it to paginate a tiny amount of data.
So the crux of what I'm asking is: does moto support DynamoDB pagination at all, and can I configure the pagination threshold?
Does moto support DynamoDB pagination at all?
Yes, it does, via the moto.mock_dynamodb2 functionality. I have tried pagination using PynamoDB's query functionality, and it works fine on my mocked DynamoDB environment provided by moto.mock_dynamodb2.
Can I configure the pagination threshold?
When using PynamoDB's query, you can configure it with the limit parameter.
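Pagination has these core concepts:
- hash_key + range_key_condition + filter_condition: the conditions that select which items the query returns
- limit: the maximum number of items to fetch from the database
- scan_index_forward: whether items come back in increasing (lexicographical) order of the sort key or in reverse
- last_evaluated_key: the key of the last processed item, used as the reference point from which the next set of items is fetched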
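To make these concepts concrete, here is a minimal in-memory sketch of how limit, scan_index_forward and last_evaluated_key interact. The function query_page and its data are hypothetical and exist only for illustration; this is not moto's or DynamoDB's implementation, and it simplifies one detail by assuming the resume key is None whenever no items remain.

```python
from typing import List, Optional, Tuple


def query_page(
    sort_keys: List[int],
    limit: int,
    last_evaluated_key: Optional[int] = None,
    scan_index_forward: bool = True,
) -> Tuple[List[int], Optional[int]]:
    """Return one page of keys plus the key to resume from (None when done)."""
    ordered = sorted(sort_keys, reverse=not scan_index_forward)
    if last_evaluated_key is not None:
        # Resume strictly after the last key that was already returned.
        ordered = ordered[ordered.index(last_evaluated_key) + 1:]
    page = ordered[:limit]
    # Simplifying assumption: report a resume key only if more items remain.
    has_more = len(ordered) > limit
    return page, (page[-1] if has_more else None)


keys = [30, 10, 50, 20, 40]
first_page, first_key = query_page(keys, limit=2)
second_page, second_key = query_page(keys, limit=2, last_evaluated_key=first_key)
third_page, third_key = query_page(keys, limit=2, last_evaluated_key=second_key)
reverse_page, _ = query_page(keys, limit=2, scan_index_forward=False)
```

Each call takes the resume key from the previous call, which is the same hand-off a caller performs against a real table.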
Suppose the sort keys of the records, in sorted order, are [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]. If we paginate 4 items from the start, we get [0, 5, 10, 15]. To get the next 4 items, we don't need to iterate all the way from the start (0) to the target (20 onwards). Such an algorithm would have linear O(n) time complexity in the worst case, where n is the count of all records. Instead, we can binary search for the first item that is greater than the last fetched item (which was 15), which finds 20 in logarithmic O(log(n)) time. How?
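The binary search idea can be sketched on a plain sorted list with Python's bisect module. The helper next_page is a hypothetical name used only for illustration, and the sketch shows the concept on in-memory data rather than how DynamoDB or moto locate the next item internally.

```python
from bisect import bisect_right
from typing import List

records = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]


def next_page(sorted_records: List[int], last_key: int, limit: int) -> List[int]:
    """Return up to `limit` records that come strictly after `last_key`."""
    # bisect_right finds the first position whose value is greater than
    # last_key in O(log n), without scanning from the start of the list.
    start = bisect_right(sorted_records, last_key)
    return sorted_records[start:start + limit]


page = next_page(records, last_key=15, limit=4)
```

Because the lookup depends only on the last fetched key, the caller needs to remember nothing except that key between pages.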
See the Python code snippet below.
# Testing date: 2020 September 29
# Versions
# moto==1.3.16
# pynamodb==4.3.3
# pytest==6.1.0
import itertools

from moto import mock_dynamodb2
from pynamodb.attributes import NumberAttribute, UnicodeAttribute
from pynamodb.models import Model
import pytest


# Model
class Location(Model):
    class Meta:
        table_name = 'Location-table'
        region = 'ap-southeast-1'

    continent = UnicodeAttribute(hash_key=True)  # also known as partition_key
    country = UnicodeAttribute(range_key=True)  # also known as sort_key
    capital = UnicodeAttribute()
    gmt = NumberAttribute()

    def __iter__(self):
        for name, attr in self.get_attributes().items():
            yield name, attr.serialize(getattr(self, name))


# Test data
LOCATIONS = [
    {'continent': 'Europe', 'country': 'Spain', 'capital': 'Madrid', 'gmt': 2},
    {'continent': 'Europe', 'country': 'Germany', 'capital': 'Berlin', 'gmt': 2},
    {'continent': 'South America', 'country': 'Venezuela', 'capital': 'Caracas', 'gmt': -4},
    {'continent': 'Europe', 'country': 'Ukraine', 'capital': 'Kyiv', 'gmt': 3},
    {'continent': 'South America', 'country': 'Brazil', 'capital': 'Brasília', 'gmt': -3},
    {'continent': 'Europe', 'country': 'Finland', 'capital': 'Helsinki', 'gmt': 3},
    {'continent': 'Europe', 'country': 'Ireland', 'capital': 'Dublin', 'gmt': 1},
]


# Test algorithms
def _setup_table(locations):
    Location.create_table()
    for location in locations:
        Location(**location).save()


def _get_filter_condition():
    # Put logic here for the filter condition. Uncomment the code below to try.
    # filter_condition = (Location.gmt >= 2) \
    #     & (Location.capital.contains('in') | Location.capital.startswith('A'))
    # return filter_condition
    return None


@mock_dynamodb2
def test_dynamodb_pagination():
    _setup_table(LOCATIONS)
    filter_condition = _get_filter_condition()

    # Expected query order for Europe. This should be sorted by country (which is the sort_key field).
    SORTED_EUROPE_COUNTRIES = [
        'Finland',
        'Germany',
        'Ireland',
        'Spain',
        'Ukraine',
    ]
    country_index = 0

    # This indicates the last processed item (for the key) from the database. This marks that item
    # as the reference point to where the next set of items will be fetched. None means query from
    # the beginning of the sorted records. Otherwise, start the query from the indicated key.
    last_evaluated_key = None

    for query_index in itertools.count(0):
        result = Location.query(
            hash_key='Europe',
            filter_condition=filter_condition,  # Filter the query results
            limit=2,  # Maximum number of items to fetch from the database
            last_evaluated_key=last_evaluated_key,  # The reference starting point of the fetch
            scan_index_forward=True,  # Indicate if in lexicographical order (increasing) or in reverse (decreasing)
        )

        for item in result:
            print(f"Query #{query_index} - Country #{country_index} - {item}")
            assert item.country == SORTED_EUROPE_COUNTRIES[country_index]
            country_index += 1

        print(f"result.last_evaluated_key {result.last_evaluated_key}\n")
        last_evaluated_key = result.last_evaluated_key

        if last_evaluated_key is None:
            print("Reached the last queried item in the database")
            break
Output:
(venv) nponcian 2020_9Sep_10_DynamoDB$ pytest pagination_test.py -rP
====================================================================================== test session starts ======================================================================================
platform linux -- Python 3.8.2, pytest-6.1.0, py-1.9.0, pluggy-0.13.1
rootdir: /home/nponcian/Documents/Program/2020_9Sep_10_DynamoDB
plugins: cov-2.10.1, mock-3.3.1
collected 1 item
pagination_test.py . [100%]
============================================================================================ PASSES =============================================================================================
___________________________________________________________________________________ test_dynamodb_pagination ____________________________________________________________________________________
------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------
Query #0 - Country #0 - Location-table<Europe, Finland>
Query #0 - Country #1 - Location-table<Europe, Germany>
result.last_evaluated_key {'continent': {'S': 'Europe'}, 'country': {'S': 'Germany'}}
Query #1 - Country #2 - Location-table<Europe, Ireland>
Query #1 - Country #3 - Location-table<Europe, Spain>
result.last_evaluated_key {'continent': {'S': 'Europe'}, 'country': {'S': 'Spain'}}
Query #2 - Country #4 - Location-table<Europe, Ukraine>
result.last_evaluated_key None
Reached the last queried item in the database
======================================================================================= 1 passed in 0.40s =======================================================================================
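The loop in the test above can also be factored into a small reusable helper and exercised without any database. The sketch below is one possible shape, assuming a fetch function that returns a pair of (items, last_evaluated_key); iter_pages, fake_fetch and CANNED_PAGES are hypothetical names, and the canned pages simply mirror the page boundaries printed in the output above.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]


def iter_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[List[str]]:
    """Yield pages until the fetcher stops returning a resume key."""
    last_evaluated_key = None
    while True:
        items, last_evaluated_key = fetch_page(last_evaluated_key)
        yield items
        if last_evaluated_key is None:
            break


# Canned pages keyed by the resume key the caller passes in (hypothetical stub).
CANNED_PAGES: Dict[Optional[str], Page] = {
    None: (["Finland", "Germany"], "Germany"),
    "Germany": (["Ireland", "Spain"], "Spain"),
    "Spain": (["Ukraine"], None),
}


def fake_fetch(last_evaluated_key: Optional[str]) -> Page:
    """Stand-in for a real paginated call such as a query with a limit."""
    return CANNED_PAGES[last_evaluated_key]


pages = list(iter_pages(fake_fetch))
countries = [country for page in pages for country in page]
```

Keeping the loop separate from the fetch call means the same helper could be pointed at a moto-backed query in one test and at a stub like this in another.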