How can I recursively resolve directory paths in Python?

Question

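I am making REST API calls to fetch the folders of a SharePoint document library.

I want to recursively get every folder path in the entire directory tree.

I have written a function that fetches the list of subfolders of a given folder, but I am not sure how to traverse down to the Nth level and collect all the folder paths.

For example, suppose the current SharePoint document library has the structure shown in the JSON below (fo = folder; f = file):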

{
  "root": [
    {
      "fo1": {
        "fo1": "f1",
        "fo2": ["f1", "f2"]
      },
      "fo2": ["fi1", "fi2"]
    },
    "fi1","fi2"]
}

From the example above, I want a list of the paths of all folders/directories. The output should be:

["/root/fo1/", "/root/fo1/fo1/", "/root/fo1/fo2/", "/root/fo2/"]

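Because this is a REST API call, I do not know the structure in advance: I have to run the query that gets the subfolders, then go into each subfolder to get its own subfolders.

The function I have written so far (below) only gets me data one level down (the subfolders), because it is based on an inner iteration rather than recursion. How can I implement a recursive solution that returns all unique folder paths as a list?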

def print_root_contents(ctx):

    try:
        list_object = ctx.web.lists.get_by_title('Documents')
        folder = list_object.root_folder
        ctx.load(folder)
        ctx.execute_query()

        folders = folder.folders
        ctx.load(folders)
        ctx.execute_query()

        for myfolder in folders:
            print("For Folder : {0}".format(myfolder.properties["Name"]))
            # print_folder_contents returns only the list of subfolder names
            folder_list = print_folder_contents(ctx, myfolder.properties["Name"])
            print("Sub folders - ", folder_list)

    except Exception as e:
        print('Problem printing out library contents: ', e)


def print_folder_contents(ctx, folder_name):

    try:
        folder = ctx.web.get_folder_by_server_relative_url("/sites/abc/Shared Documents/"+folder_name+"/")
        ctx.load(folder)
        ctx.execute_query()

        # Folders
        fold_names = []
        sub_folders = folder.folders
        ctx.load(sub_folders)
        ctx.execute_query()
        for s_folder in sub_folders:
            # folder_name = folder_name+"/"+s_folder.properties["Name"]
            # print("Folder name: {0}".format(folder.properties["Name"]))
            fold_names.append(s_folder.properties["Name"])

        return fold_names

    except Exception as e:
        print('Problem printing out library contents: ', e)

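In the last function above (print_folder_contents), I cannot work out the recursive logic that keeps appending folders and subfolders, stops when the Nth folder has no more folders, and then continues with the next sibling folder one level up.

I am finding it really challenging. Any help?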

Answer 1

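You can use a generator function that iterates over the dict items and yields each dict key, followed by that key joined to every path yielded by a recursive call on its value. If it is given a list, it recursively yields whatever the recursive calls on the list items yield: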

def paths(d):
    def _paths(d):
        if isinstance(d, dict):
            for k, v in d.items():
                yield k + '/'
                for p in _paths(v):
                    yield '/'.join((k, p))
        elif isinstance(d, list):
            for i in d:
                yield from _paths(i)
    return ['/' + p for p in _paths(d)]

So given:

d = {
  "root": [
    {
      "fo1": {
        "fo1": "f1",
        "fo2": ["f1", "f2"]
      },
      "fo2": ["fi1", "fi2"]
    },
    "fi1","fi2"]
}

paths(d) returns:

['/root/', '/root/fo1/', '/root/fo1/fo1/', '/root/fo1/fo2/', '/root/fo2/']

Note that your expected output should include '/root/', since the root folder is also a valid folder.
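The question notes that the real structure is only discovered one REST call at a time, so the full dict is not available up front. As a minimal sketch, the same generator idea can be written against a callback that returns the direct subfolders of one path. The names `walk_folders`, `list_subfolders`, `TREE` and `fake_list_subfolders` are hypothetical and are not part of the original answer; the in-memory `TREE` only stands in for the REST call.

```python
from typing import Callable, Iterator, List


def walk_folders(path: str,
                 list_subfolders: Callable[[str], List[str]]) -> Iterator[str]:
    """Yield `path` and every folder path below it, depth first.

    `list_subfolders` is a hypothetical callable: given a folder path, it
    returns the names of that folder's direct subfolders (one lookup per call).
    """
    yield path
    for name in list_subfolders(path):
        # Recurse into each child. A folder without subfolders yields only
        # itself, so the loop then moves on to the next sibling.
        yield from walk_folders(path + name + '/', list_subfolders)


# In-memory stand-in for the REST call, mirroring the folders in the example.
TREE = {
    '/root/': ['fo1', 'fo2'],
    '/root/fo1/': ['fo1', 'fo2'],
}


def fake_list_subfolders(path: str) -> List[str]:
    # Unknown paths have no subfolders.
    return TREE.get(path, [])


folder_paths = list(walk_folders('/root/', fake_list_subfolders))
print(folder_paths)
```

With a real client, `fake_list_subfolders` would be swapped for a function that performs the subfolder query for the given path.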

Answer 2

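I know this answer is very late to the game, but you can do something like the following to get a flat list of all child SharePoint objects for a given parent directory.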

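This works because we keep extending a single list, rather than creating nested objects as would happen if we used the list.append() method while traversing the directory tree.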
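To make the difference concrete, here is a small sketch on plain nested lists, with no SharePoint involved. The function names and the sample data are illustrative only.

```python
def flatten_with_extend(tree):
    """Collect every leaf of a nested list into one flat list."""
    flat = []
    for item in tree:
        if isinstance(item, list):
            # extend() splices the child's results into the same list
            flat.extend(flatten_with_extend(item))
        else:
            flat.append(item)
    return flat


def collect_with_append(tree):
    """Same traversal, but append() keeps each child's result nested."""
    out = []
    for item in tree:
        if isinstance(item, list):
            # append() adds the child's result as a single nested element
            out.append(collect_with_append(item))
        else:
            out.append(item)
    return out


nested = ['a', ['b', ['c', 'd']], 'e']
flat_result = flatten_with_extend(nested)    # one level, all leaves
nested_result = collect_with_append(nested)  # same shape as the input
print(flat_result)
print(nested_result)
```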

I am sure there is room to improve the snippet below, but I believe it should help you accomplish your goal.

Cheers,

rs311

from office365.sharepoint.client_context import ClientContext


def get_items_in_directory(ctx_client: ClientContext,
                           directory_relative_uri: str,
                           recursive: bool = True):
    """
    This function provides a way to get all items in a directory in SharePoint, with
    the option to traverse nested directories to extract all child objects.
    
    :param ctx_client: office365.sharepoint.client_context.ClientContext object
        SharePoint ClientContext object.
    :param directory_relative_uri: str
        Path to directory in SharePoint. 
    :param recursive: bool
        default = True
        Tells function whether or not to perform a recursive call.
    :return: list
        Returns a flattened array of all child file and/or folder objects
        given some parent directory. All items will be of the following types:
            - office365.sharepoint.file.File
            - office365.sharepoint.folder.Folder
        
    Examples 
    ---------
    All examples assume you've already authenticated with SharePoint per
    documentation found here:
        - https://github.com/vgrem/Office365-REST-Python-Client#examples
        
    Assumed directory structure:
        some_directory/
            my_file.csv
            your_file.xlsx
            sub_directory_one/
                123.docx
                abc.csv
            sub_directory_two/
                xyz.xlsx
    
    directory = 'some_directory'
    # Non-recursive call
    extracted_child_objects = get_items_in_directory(ctx_client, directory, recursive=False)
    # extracted_child_objects would contain (my_file.csv, your_file.xlsx, sub_directory_one/, sub_directory_two/)
    
    
    # Recursive call
    extracted_child_objects = get_items_in_directory(ctx_client, directory, recursive=True)
    # extracted_child_objects would contain (my_file.csv, your_file.xlsx, sub_directory_one/, sub_directory_two/, sub_directory_one/123.docx, sub_directory_one/abc.csv, sub_directory_two/xyz.xlsx)
    
    """
    contents = list()
    folders = ctx_client.web.get_folder_by_server_relative_url(directory_relative_uri).folders
    ctx_client.load(folders)
    ctx_client.execute_query()

    if recursive:
        for folder in folders:
            contents.extend(
                get_items_in_directory(
                    ctx_client=ctx_client,
                    directory_relative_uri=folder.properties['ServerRelativeUrl'],
                    recursive=recursive)
            )

    contents.extend(folders)

    files = ctx_client.web.get_folder_by_server_relative_url(directory_relative_uri).files
    ctx_client.load(files)
    ctx_client.execute_query()

    contents.extend(files)
    return contents
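The question asks only for folder paths, not for file objects. A minimal sketch of that narrower case follows, using only the client calls that already appear in this thread (`get_folder_by_server_relative_url`, `load`, `execute_query`, and the `ServerRelativeUrl` property). The function name `get_folder_paths` is hypothetical, and the sketch assumes the folder collection is fully loaded after `execute_query()`; paging and error handling are not covered.

```python
def get_folder_paths(ctx_client, directory_relative_uri):
    """Return the server-relative URL of every folder below the given one.

    `ctx_client` is expected to behave like the ClientContext used above.
    """
    folders = ctx_client.web.get_folder_by_server_relative_url(
        directory_relative_uri).folders
    ctx_client.load(folders)
    ctx_client.execute_query()

    paths = []
    for folder in folders:
        url = folder.properties['ServerRelativeUrl']
        paths.append(url)
        # Descend before moving on to the next sibling. A folder with no
        # subfolders returns an empty list, which ends that branch.
        paths.extend(get_folder_paths(ctx_client, url))
    return paths
```

A call might look like get_folder_paths(ctx, "/sites/abc/Shared Documents"), reusing the library path from the question. The result lists each folder before its descendants, so the order matches a depth-first walk of the tree.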

Answer 1
0 2019-04-05 06:49:50

Answer 2
0 2020-11-10 10:53:20