How can you simulate DynamoDB pagination using moto?

I'm testing some DynamoDB access code. In the past, incorrect handling of pagination has caused bugs: developers tend to test manually with small amounts of data, so it's easy to make incorrect assumptions about how pagination works that only come to light once realistic data volumes are being handled.

I've typically unit tested the access code using plain unittest and unittest.mock, and I have tested pagination this way too, but I've ended up writing some reasonably complex test code to simulate pagination for different operations (scan, query, batch_get_item).
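For context, a mock-based simulation of the kind described above might look like the following minimal sketch. It assumes a boto3-style client whose scan responses carry Items and an optional LastEvaluatedKey, and whose requests accept ExclusiveStartKey; scan_all, fake_client and the table name are hypothetical names used only for illustration.

```python
from unittest import mock


def scan_all(client, table_name):
    """Collect every item from a table, following LastEvaluatedKey across pages."""
    items = []
    start_key = None
    while True:
        kwargs = {"TableName": table_name}
        if start_key is not None:
            # Resume right after the last item evaluated by the previous request.
            kwargs["ExclusiveStartKey"] = start_key
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items


# Hypothetical stand-in for a DynamoDB client: each call to scan() returns the
# next canned page, so two tiny pages are enough to exercise the loop.
fake_client = mock.Mock()
fake_client.scan.side_effect = [
    {"Items": [{"id": {"S": "a"}}], "LastEvaluatedKey": {"id": {"S": "a"}}},
    {"Items": [{"id": {"S": "b"}}]},
]

all_items = scan_all(fake_client, "example-table")
```

Even this small version needs canned pages per operation, which is the overhead the question is trying to avoid.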

I'm looking for a simpler way of testing this; moto offers some hope.

However, I don't really want to load 1 MB+ of data into moto to induce pagination; I want to force it to paginate a tiny amount of data.

So the crux of what I'm asking is:

  • Does moto support DynamoDB pagination at all?
  • Can I configure the pagination threshold?
  • How?

Does moto support DynamoDB pagination at all?

Yes, it does, via moto.mock_dynamodb2. I tried pagination using PynamoDB's query functionality, and it works fine against the mocked DynamoDB environment provided by moto.mock_dynamodb2 (in the moto version listed in the snippet further down).

Can I configure the pagination threshold?

With PynamoDB's query, you can control it through the limit parameter.

Pagination involves these core concepts:

  1. hash_key + range_key_condition + filter_condition
     • The set of DynamoDB records that you want to paginate.
  2. limit
     • The maximum number of results returned by the query.
  3. scan_index_forward
     • The order of the results: whether the fetched records are sorted by the range_key / sort_key in ascending order (e.g. 1, 2, 3) or descending order (e.g. 3, 2, 1).
  4. last_evaluated_key
     • This indicates the last processed item (by key) from the database. It marks that item as the reference point from which the next set of items will be fetched. None means query from the beginning of the sorted records; otherwise, the query starts from the indicated key.
     • Think of this as a binary search on [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]. If we paginate 4 items from the start, we get [0, 5, 10, 15]. To get the next 4 items, we don't need to iterate all the way from the start (0) to the target (20 onwards); that would take linear O(n) time in the worst case, where n is the count of all records. Instead, we can binary-search for the first item greater than the last fetched item (15), which finds 20 in logarithmic O(log n) time.
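The seek described above can be sketched in plain Python. This only illustrates the analogy, using the standard bisect module on an in-memory sorted list; fetch_page is a hypothetical helper, and nothing here is meant to describe how DynamoDB or moto work internally.

```python
from bisect import bisect_right

SORTED_KEYS = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]


def fetch_page(keys, limit, last_evaluated_key=None):
    """Return (page, new_last_evaluated_key) from a sorted list of keys."""
    if last_evaluated_key is None:
        start = 0  # no reference point: start from the beginning
    else:
        # Binary search for the first key greater than the last fetched one.
        start = bisect_right(keys, last_evaluated_key)
    page = keys[start:start + limit]
    has_more = start + limit < len(keys)
    return page, (page[-1] if has_more else None)


# Drain the list four keys at a time, feeding each returned key into the next call.
pages = []
key = None
while True:
    page, key = fetch_page(SORTED_KEYS, limit=4, last_evaluated_key=key)
    pages.append(page)
    if key is None:
        break
```

Each call resumes from the key returned by the previous one, so no page re-reads items that were already fetched.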

How?

See the Python code snippet below:

# Testing date: 2020-09-29

# Versions
# moto==1.3.16
# pynamodb==4.3.3
# pytest==6.1.0

import itertools

from moto import mock_dynamodb2
from pynamodb.attributes import NumberAttribute, UnicodeAttribute
from pynamodb.models import Model


# Model

class Location(Model):
    class Meta:
        table_name = 'Location-table'
        region = 'ap-southeast-1'

    continent = UnicodeAttribute(hash_key=True)  # also known as partition_key
    country = UnicodeAttribute(range_key=True)  # also known as sort_key
    capital = UnicodeAttribute()
    gmt = NumberAttribute()

    def __iter__(self):
        for name, attr in self.get_attributes().items():
            yield name, attr.serialize(getattr(self, name))


# Test data

LOCATIONS = [
    {
        'continent': 'Europe',
        'country': 'Spain',
        'capital': 'Madrid',
        'gmt': 2,
    },
    {
        'continent': 'Europe',
        'country': 'Germany',
        'capital': 'Berlin',
        'gmt': 2,
    },
    {
        'continent': 'South America',
        'country': 'Venezuela',
        'capital': 'Caracas',
        'gmt': -4,
    },
    {
        'continent': 'Europe',
        'country': 'Ukraine',
        'capital': 'Kyiv',
        'gmt': 3,
    },
    {
        'continent': 'South America',
        'country': 'Brazil',
        'capital': 'Brasília',
        'gmt': -3,
    },
    {
        'continent': 'Europe',
        'country': 'Finland',
        'capital': 'Helsinki',
        'gmt': 3,
    },
    {
        'continent': 'Europe',
        'country': 'Ireland',
        'capital': 'Dublin',
        'gmt': 1,
    },
]


# Test algorithms

def _setup_table(locations):
    Location.create_table()

    for location in locations:
        Location(**location).save()

def _get_filter_condition():
    # Put logic here for the filter condition. Uncomment the code below to try.
    # filter_condition = (Location.gmt >= 2) \
    #                     & (Location.capital.contains('in') | Location.capital.startswith('A'))
    # return filter_condition
    return None


@mock_dynamodb2
def test_dynamodb_pagination():
    _setup_table(LOCATIONS)
    filter_condition = _get_filter_condition()

    # Expected query order for Europe. This should be sorted by country (which is the sort_key field).
    SORTED_EUROPE_COUNTRIES = [
        'Finland',
        'Germany',
        'Ireland',
        'Spain',
        'Ukraine',
    ]
    country_index = 0

    # This indicates the last processed item (for the key) from the database. This marks that item
    # as the reference point to where the next set of items will be fetched. None means query from
    # the beginning of the sorted records. Otherwise, start the query from the indicated key.
    last_evaluated_key = None

    for query_index in itertools.count(0):
        result = Location.query(
            hash_key='Europe',
            filter_condition=filter_condition,  # Filter the query results
            limit=2,  # Maximum number of items to fetch from the database
            last_evaluated_key=last_evaluated_key,  # The reference starting point of the fetch
            scan_index_forward=True,  # Indicate if in lexicographical order (increasing) or in reverse (decreasing)
        )

        for item in result:
            print(f"Query #{query_index} - Country #{country_index} - {item}")

            assert item.country == SORTED_EUROPE_COUNTRIES[country_index]
            country_index += 1

        print(f"result.last_evaluated_key {result.last_evaluated_key}\n")
        last_evaluated_key = result.last_evaluated_key

        if last_evaluated_key is None:
            print("Reached the last queried item in the database")
            break

Output:

(venv) nponcian 2020_9Sep_10_DynamoDB$ pytest pagination_test.py -rP
====================================================================================== test session starts ======================================================================================
platform linux -- Python 3.8.2, pytest-6.1.0, py-1.9.0, pluggy-0.13.1
rootdir: /home/nponcian/Documents/Program/2020_9Sep_10_DynamoDB
plugins: cov-2.10.1, mock-3.3.1
collected 1 item                                                                                                                                                                                

pagination_test.py .                                                                                                                                                                      [100%]

============================================================================================ PASSES =============================================================================================
___________________________________________________________________________________ test_dynamodb_pagination ____________________________________________________________________________________
------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------
Query #0 - Country #0 - Location-table<Europe, Finland>
Query #0 - Country #1 - Location-table<Europe, Germany>
result.last_evaluated_key {'continent': {'S': 'Europe'}, 'country': {'S': 'Germany'}}

Query #1 - Country #2 - Location-table<Europe, Ireland>
Query #1 - Country #3 - Location-table<Europe, Spain>
result.last_evaluated_key {'continent': {'S': 'Europe'}, 'country': {'S': 'Spain'}}

Query #2 - Country #4 - Location-table<Europe, Ukraine>
result.last_evaluated_key None

Reached the last queried item in the database
======================================================================================= 1 passed in 0.40s =======================================================================================
