
How to mock boto3 paginator?

I have the following function that I need to test:

    def function_to_test(host: str, prefix: str, file_reg_ex=None, dir_reg_ex=None):
        s3_client = boto3.client('s3')
        s3_paginator = s3_client.get_paginator('list_objects')

        response_iterator = s3_paginator.paginate(
            Bucket=host,
            Prefix=prefix,
            PaginationConfig={
                'PageSize': 1000
            }
        )

        ret_dict = {}
        for page in response_iterator:
            for s3_object in page['Contents']:
                key = s3_object['Key']
                sections = str(key).rsplit('/', 1)
                key_dir = sections[0]
                file_name = sections[1]
                if (file_reg_ex is None or re.search(file_reg_ex, file_name)) and \
                        (dir_reg_ex is None or re.search(dir_reg_ex, key_dir)):
                    ret_dict[key] = {
                        'ETag': s3_object['ETag'],
                        'Last-Modified': s3_object['LastModified'].timestamp()
                    }

        return ret_dict

It looks like I need to use the botocore Stubber referenced here: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/stubber.html#botocore-stub

The documentation shows how to stub the response of a 'list_objects' S3 request, but that example does not translate directly to a paginator, which returns a botocore.paginate.PageIterator object rather than a single response. How can this functionality be mocked?
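For context on what such a stub would have to do: paginate() ultimately calls the client's list_objects operation, so a Stubber attached to that client can in principle feed the page iterator; the remaining problem is that function_to_test creates its own client. A minimal sketch under that assumption (the test name, bucket/key values and region are illustrative, expected_params is omitted, and boto3.client is patched so the function receives the stubbed client):

    import datetime
    from unittest.mock import patch

    import boto3
    from botocore.stub import Stubber


    def test_with_stubber():
        # Stub a single list_objects page on a client we control.
        stubbed_client = boto3.client('s3', region_name='us-east-1')
        stubber = Stubber(stubbed_client)
        stubber.add_response(
            'list_objects',
            {
                'Contents': [
                    {
                        'Key': 'key/key1',
                        'ETag': 'etag1',
                        'LastModified': datetime.datetime(2020, 8, 14, tzinfo=datetime.timezone.utc)
                    }
                ]
            }
        )
        stubber.activate()

        # function_to_test builds its own client, so patch boto3.client to
        # hand back the stubbed one; the paginator then yields the stubbed page.
        with patch('boto3.client', return_value=stubbed_client):
            ret_dict = function_to_test('test_host', '')

        assert len(ret_dict) == 1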

It was suggested that I look into https://pypi.org/project/boto3-mocking/ and https://github.com/spulec/moto, but due to time constraints I went with a simpler workaround: extract the paginate() call into its own method so that a test can replace just that method.

    @staticmethod
    def get_s3_resp_iterator(host, prefix, s3_client):
        s3_paginator = s3_client.get_paginator('list_objects')
        return s3_paginator.paginate(
            Bucket=host,
            Prefix=prefix,
            PaginationConfig={
                'PageSize': 1000
            }
        )


    def function_to_test(self, host: str, prefix: str, file_reg_ex=None, dir_reg_ex=None):
        s3_client = boto3.client('s3')
        response_iterator = self.get_s3_resp_iterator(host, prefix, s3_client)

        ret_dict = {}
        for page in response_iterator:
            for s3_object in page['Contents']:
                key = s3_object['Key']
                sections = str(key).rsplit('/', 1)
                key_dir = sections[0]
                file_name = sections[1]
                if (file_reg_ex is None or re.search(file_reg_ex, file_name)) and \
                        (dir_reg_ex is None or re.search(dir_reg_ex, key_dir)):
                    ret_dict[key] = {
                        'ETag': s3_object['ETag'],
                        'Last-Modified': s3_object['LastModified'].timestamp()
                    }

        return ret_dict

This allows me to do the following in a pretty straightforward manner:

    def test_s3(self):
        test_resp_iter = [
            {
                'Contents': [
                    {
                        'Key': 'key/key1',
                        'ETag': 'etag1',
                        'LastModified': datetime.datetime(2020, 8, 14, 17, 19, 34, tzinfo=tzutc())
                    },
                    {
                        'Key': 'key/key2',
                        'ETag': 'etag2',
                        'LastModified': datetime.datetime(2020, 8, 14, 17, 19, 34, tzinfo=tzutc())
                    }
                ]
            }
        ]
        tc = TestClass()
        tc.get_s3_resp_iterator = MagicMock(return_value=test_resp_iter)
        ret_dict = tc.function_to_test('test_host', '', file_reg_ex=None, dir_reg_ex=None)
        self.assertEqual(len(ret_dict), 2)
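
For completeness, the moto route mentioned above avoids mocking the paginator at all by running the original, unrefactored function_to_test against an in-memory S3 backend. A sketch, assuming moto < 5 (newer releases expose mock_aws instead of mock_s3) and illustrative bucket/key names:

    import boto3
    from moto import mock_s3  # moto >= 5 renames this decorator to mock_aws


    @mock_s3
    def test_s3_with_moto():
        # Create the fake bucket and objects in moto's in-memory S3 backend.
        s3 = boto3.client('s3', region_name='us-east-1')
        s3.create_bucket(Bucket='test_host')
        s3.put_object(Bucket='test_host', Key='key/key1', Body=b'data')
        s3.put_object(Bucket='test_host', Key='key/key2', Body=b'data')

        # The unmodified function paginates against the fake backend.
        ret_dict = function_to_test('test_host', '')
        assert len(ret_dict) == 2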
