
How to use Boto3 pagination

BACKGROUND:

The AWS operation to list IAM users returns a max of 50 by default.

After reading the docs (links below), I ran the following code and got the complete data set by setting "MaxItems" to 1000.

paginator = client.get_paginator('list_users')
response_iterator = paginator.paginate(
    PaginationConfig={
        'MaxItems': 1000,
        'PageSize': 123})
for page in response_iterator:
    u = page['Users']
    for user in u:
        print(user['UserName'])

http://boto3.readthedocs.io/en/latest/guide/paginators.html
https://boto3.readthedocs.io/en/latest/reference/services/iam.html#IAM.Paginator.ListUsers

QUESTION:

If "MaxItems" was set to 10, for example, what would be the best method to loop through the results?

I tested with the following, but it only completes 2 iterations before 'IsTruncated' == False and then fails with "KeyError: 'Marker'". I'm not sure why this is happening, because I know there are over 200 results.

marker = None

while True:
    paginator = client.get_paginator('list_users')
    response_iterator = paginator.paginate( 
        PaginationConfig={
            'MaxItems': 10,
            'StartingToken': marker})
    #print(response_iterator)
    for page in response_iterator:
        u = page['Users']
        for user in u:
            print(user['UserName'])
            print(page['IsTruncated'])
            marker = page['Marker']
            print(marker)
        else:
            break

(Answer rewrite) NOTE: the paginator appears to contain a bug, in that its behavior doesn't tally with the documentation (or vice versa). With MaxItems, the pages don't carry a Marker or NextToken even when the total number of items exceeds MaxItems. PageSize is the setting that controls whether the Marker/NextToken indicator is returned.

import sys
import boto3

iam = boto3.client("iam")
marker = None
while True:
    paginator = iam.get_paginator('list_users')
    response_iterator = paginator.paginate(
        PaginationConfig={
            'PageSize': 10,
            'StartingToken': marker})
    for page in response_iterator:
        print("Next Page : {} ".format(page['IsTruncated']))
        u = page['Users']
        for user in u:
            print(user['UserName'])
    # The iterator itself is not subscriptable, so read the marker
    # from the last page that was returned.
    try:
        marker = page['Marker']
        print(marker)
    except KeyError:
        # The last page has no Marker: everything has been listed.
        sys.exit()

It is not your mistake that your code doesn't work. MaxItems in the paginator seems to act as a "threshold" indicator. Ironically, MaxItems in the original boto3.iam.list_users still works as documented.

If you check boto3.iam.list_users, you will notice that you must either omit Marker or give it a real value (None is not accepted). Apparently, the paginator is NOT a plain wrapper around the boto3 client's list_* methods.

import sys
import boto3
iam = boto3.client("iam")
marker = None
while True:
    if marker:
        response_iterator = iam.list_users(
            MaxItems=10,
            Marker=marker
        )
    else:
        response_iterator = iam.list_users(
            MaxItems=10
        )
    print("Next Page : {} ".format(response_iterator['IsTruncated']))
    for user in response_iterator['Users']:
        print(user['UserName'])

    try:
        marker = response_iterator['Marker']
        print(marker)
    except KeyError:
        sys.exit()
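
For comparison, here is a minimal sketch that keeps both the paginator and MaxItems. It assumes the continuation token is read from the iterator's resume_token attribute (a botocore PageIterator attribute) after the pages have been consumed, rather than from the page dictionary. The function name list_users_in_chunks and the chunk_size parameter are hypothetical; client is assumed to be an IAM client.

```python
def list_users_in_chunks(client, chunk_size=10):
    """Yield lists of at most chunk_size user names, one list per paginate() call."""
    paginator = client.get_paginator("list_users")
    token = None  # None means "start from the beginning"
    while True:
        pages = paginator.paginate(
            PaginationConfig={"MaxItems": chunk_size, "StartingToken": token}
        )
        # Consume every page of this call before looking at resume_token.
        names = [user["UserName"] for page in pages for user in page["Users"]]
        if names:
            yield names
        # Assumption: resume_token is set only when iteration stopped because
        # MaxItems was reached and more items remain; otherwise it stays None.
        token = pages.resume_token
        if token is None:
            break
```

With a real client this could be driven by something like `for chunk in list_users_in_chunks(boto3.client("iam")): print(chunk)`.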

You can follow up on the issue I filed on the boto3 GitHub. According to a project member, you can call build_full_result() after paginate(), which shows the desired behavior.
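
A minimal sketch of that suggestion, under the assumption that build_full_result() returns the aggregated "Users" list plus a "NextToken" key whenever MaxItems cut the listing short. The function name list_users_with_full_result and the chunk_size parameter are hypothetical; client is assumed to be an IAM client.

```python
def list_users_with_full_result(client, chunk_size=10):
    """Collect all user names, fetching at most chunk_size items per paginate() call."""
    paginator = client.get_paginator("list_users")
    token = None  # None means "start from the beginning"
    names = []
    while True:
        result = paginator.paginate(
            PaginationConfig={"MaxItems": chunk_size, "StartingToken": token}
        ).build_full_result()
        names.extend(user["UserName"] for user in result.get("Users", []))
        # Assumption: "NextToken" is present only when more items remain.
        token = result.get("NextToken")
        if not token:
            return names
```

Called as `list_users_with_full_result(boto3.client("iam"))`, this should return every user name regardless of the chunk size chosen.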

This post is pretty old, but due to the lack of concise documentation I want to share my code for everyone who is struggling with this.

Here are two simple examples of how I solved it using Boto3's paginator. I hope they help you understand how it works.

Boto3 official pagination documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/paginators.html

AWS API specifying that the first token should be $null (None in Python): https://docs.aws.amazon.com/powershell/latest/reference/items/Get-SSMParametersByPath.html

Examples:

First example with little complexity for people like me who struggled to understand how this works:

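import boto3

# Assumed setup: the operation name is inferred from the AWS doc linked above.
ssm = boto3.client('ssm')
paginator = ssm.get_paginator('get_parameters_by_path')
baseSSMPath = 'path_to_the_parameters'

def read_ssm_parameters():
    # First request: no StartingToken
    page_iterator = paginator.paginate(
        Path=baseSSMPath,
        Recursive=True,
        PaginationConfig={
            'MaxItems': 10,
            'PageSize': 10,
        }
    )

    myNextToken = True
    while myNextToken:
        myNextToken = False
        for page in page_iterator:
            print('# This is new page')
            print(page['Parameters'])
            if 'NextToken' in page.keys():
                print(page['NextToken'])
                myNextToken = page['NextToken']

        # Resume from the last token seen, if there was one
        if myNextToken:
            page_iterator = paginator.paginate(
                Path=baseSSMPath,
                Recursive=True,
                PaginationConfig={
                    'MaxItems': 10,
                    'PageSize': 10,
                    'StartingToken': myNextToken
                }
            )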

Second example with reduced code but without the complexity of using recursion

# `paginator` is an SSM paginator, e.g.
# boto3.client('ssm').get_paginator('get_parameters_by_path')
def read_ssm_parameters(myNextToken=None):
    # None (not the string 'None') means "start from the first page"
    while myNextToken is not False:
        page_iterator = paginator.paginate(
            Path='path_to_the_parameters',
            Recursive=True,
            PaginationConfig={
                'MaxItems': 10,
                'PageSize': 10,
                'StartingToken': myNextToken
            }
        )

        for page in page_iterator:
            # Print every page, including the last one (which has no NextToken)
            print('# This is a new page')
            print(page['Parameters'])
            # Exit if there are no more pages to read
            myNextToken = page.get('NextToken', False)

Hope this helps!

This code wasn't working for me. It always drops the remaining items on the last page and doesn't include them in the results: I get 60 accounts when I know I have 68. The last result page never gets appended to my list of account UserNames. I'm concerned that the examples above do the same thing and that people aren't noticing it in their results.

Besides, it seems overly complex to paginate with an arbitrary size. For what purpose?

This should be simple, and it gives you a complete listing.

import boto3
iam = boto3.client("iam")
paginator = iam.get_paginator('list_users')
response_iterator = paginator.paginate()
accounts = []
for page in response_iterator:
    for user in page['Users']:
        accounts.append(user['UserName'])
print(len(accounts))  # 68 for my account
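
The same idea can be wrapped in a small generator, sketched here under the assumption that an IAM client is passed in. The name iter_user_names and the optional page_size parameter are hypothetical. Because every page is processed unconditionally, the last page (the one without a continuation token) is not dropped.

```python
def iter_user_names(client, page_size=None):
    """Yield every IAM user name, letting the paginator follow the markers."""
    paginator = client.get_paginator("list_users")
    # PageSize is optional; an empty config leaves the service default in place.
    config = {"PageSize": page_size} if page_size else {}
    for page in paginator.paginate(PaginationConfig=config):
        # No check for a token here: each page is consumed in full.
        for user in page["Users"]:
            yield user["UserName"]
```

For example, `accounts = list(iter_user_names(boto3.client("iam")))` would build the same list as the loop above.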

I will post my solution here in the hope that it helps other people do their job faster instead of fiddling around with the amazingly written boto3 API calls.

My use case was to list all the Security Hub ControlIds using the SecurityHub.Client.describe_standards_controls function.


import boto3

sh_client = boto3.client('securityhub')
# enabledStandardSubscriptionArn holds the ARN of the enabled standards subscription

controlsResponse = sh_client.describe_standards_controls(
    StandardsSubscriptionArn=enabledStandardSubscriptionArn)

controls = controlsResponse.get('Controls')

# This is the token for the 101st item in the list (None if there is no second page).
nextToken = controlsResponse.get('NextToken')

# Call describe_standards_controls with the token set at item 101 to get the next 100 results
if nextToken:
    controlsResponse1 = sh_client.describe_standards_controls(
        StandardsSubscriptionArn=enabledStandardSubscriptionArn, NextToken=nextToken)

    controls1 = controlsResponse1.get('Controls')

    # And make the two lists into one
    controls.extend(controls1)

Now you have a list of all the SH standards controls for the specified Subscription Standard (e.g., AWS foundational Standard), as long as they fit in the two pages fetched above.

For example, if I want to get all the ControlIds, I can just iterate over the 'controls' list and, for each control, do

controlId=control.get("ControlId")

The same works for the other fields in the response, as described here
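
When the number of pages is not known in advance, the two calls can be generalized into a loop. This is a minimal sketch, assuming sh_client is a Security Hub client and that the response omits NextToken on the last page. The function name list_all_controls is hypothetical.

```python
def list_all_controls(sh_client, subscription_arn):
    """Follow NextToken until describe_standards_controls stops returning one."""
    controls = []
    kwargs = {"StandardsSubscriptionArn": subscription_arn}
    while True:
        response = sh_client.describe_standards_controls(**kwargs)
        controls.extend(response.get("Controls", []))
        next_token = response.get("NextToken")
        if not next_token:
            # No token means this was the last page.
            return controls
        # NextToken is only passed once there is a real value to send.
        kwargs["NextToken"] = next_token
```

The ControlIds can then be collected with `[c.get("ControlId") for c in list_all_controls(sh_client, enabledStandardSubscriptionArn)]`.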
