How to solve directory path problem recursively in Python?
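I'm making REST API calls to get the folders of a SharePoint document library.
I want to recursively get every folder path in the whole directory tree.
I've written a function that gets the list of subfolders of a given folder, but I'm not sure how to traverse down to the Nth directory level and collect all the folder paths.
For example, suppose the current SharePoint document library has the structure shown in the following JSON (fo=folder; f=file):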
{
    "root": [
        {
            "fo1": {
                "fo1": "f1",
                "fo2": ["f1", "f2"]
            },
            "fo2": ["fi1", "fi2"]
        },
        "fi1", "fi2"]
}
From the example above, I want a list of the paths of all folders/directories. For example, the output should be:
["/root/fo1/", "/root/fo1/fo1/", "/root/fo1/fo2/", "/root/fo2/"]
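Since this is a REST API call, I don't know the structure in advance: I have to run the query that gets the subfolders, then go into each subfolder to get its own subfolders.
The function I've written so far (below) only gets me data one level down (the subfolders), because it is based on an inner iteration rather than recursion. How can I implement a recursive solution that returns all unique folder paths as a list?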
def print_root_contents(ctx):
    try:
        list_object = ctx.web.lists.get_by_title('Documents')
        folder = list_object.root_folder
        ctx.load(folder)
        ctx.execute_query()
        folders = folder.folders
        ctx.load(folders)
        ctx.execute_query()
        for myfolder in folders:
            print("For Folder : {0}".format(myfolder.properties["Name"]))
            folder_list, files_list = print_folder_contents(ctx, myfolder.properties["Name"])
            print("Sub folders - ", folder_list)
            print("Files - ", files_list)
    except Exception as e:
        print('Problem printing out library contents: ', e)

def print_folder_contents(ctx, folder_name):
    try:
        folder = ctx.web.get_folder_by_server_relative_url("/sites/abc/Shared Documents/"+folder_name+"/")
        ctx.load(folder)
        ctx.execute_query()
        # Folders
        fold_names = []
        sub_folders = folder.folders
        ctx.load(sub_folders)
        ctx.execute_query()
        for s_folder in sub_folders:
            # folder_name = folder_name+"/"+s_folder.properties["Name"]
            # print("Folder name: {0}".format(folder.properties["Name"]))
            fold_names.append(s_folder.properties["Name"])
        # Files (the caller unpacks two lists, so the file names are returned as well)
        file_names = []
        files = folder.files
        ctx.load(files)
        ctx.execute_query()
        for s_file in files:
            file_names.append(s_file.properties["Name"])
        return fold_names, file_names
    except Exception as e:
        print('Problem printing out library contents: ', e)
        # Return empty lists so the caller's two-value unpacking still works
        return [], []
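In the last function above (print_folder_contents), I can't work out the recursive logic that keeps appending folders and subfolders, stops when the nth folder has no more subfolders, and then continues with the next sibling folder one level up.
I'm finding this really challenging. Any help?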
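You can use a generator function that iterates over the dict items and yields each dict key, followed by that key joined to every path yielded by a recursive call on its value; if given a list, it recursively yields whatever the recursive calls on the list items yield: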
def paths(d):
    def _paths(d):
        if isinstance(d, dict):
            for k, v in d.items():
                yield k + '/'
                for p in _paths(v):
                    yield '/'.join((k, p))
        elif isinstance(d, list):
            for i in d:
                yield from _paths(i)
    return ['/' + p for p in _paths(d)]
So given:
d = {
    "root": [
        {
            "fo1": {
                "fo1": "f1",
                "fo2": ["f1", "f2"]
            },
            "fo2": ["fi1", "fi2"]
        },
        "fi1", "fi2"]
}
paths(d)
returns:
['/root/', '/root/fo1/', '/root/fo1/fo1/', '/root/fo1/fo2/', '/root/fo2/']
Note that your expected output should include '/root/', because the root folder should also be a valid folder.
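Because the real folder structure is only discovered one REST call at a time, the same recursion can also be driven by a function that returns the child folder names of a given path, rather than by a pre-built dict. The following is a minimal sketch under that assumption: `walk_folders` and `list_subfolders` are hypothetical names (not part of any SharePoint library), and the lookup is faked with an in-memory dict so the example runs on its own. In practice you would replace `fake_list_subfolders` with your own query.

```python
from typing import Callable, Dict, Iterator, List


def walk_folders(path: str, list_subfolders: Callable[[str], List[str]]) -> Iterator[str]:
    """Yield `path` and every folder path below it, depth first."""
    yield path
    for name in list_subfolders(path):
        # Recurse into each child; the recursion stops by itself
        # when a folder reports no subfolders.
        yield from walk_folders(path + name + "/", list_subfolders)


# Hypothetical stand-in for the REST call: folder path -> subfolder names.
FAKE_TREE: Dict[str, List[str]] = {
    "/root/": ["fo1", "fo2"],
    "/root/fo1/": ["fo1", "fo2"],
}


def fake_list_subfolders(path: str) -> List[str]:
    # Folders that are not listed in FAKE_TREE have no subfolders.
    return FAKE_TREE.get(path, [])


folder_paths = list(walk_folders("/root/", fake_list_subfolders))
print(folder_paths)
```

Since each call only asks for the children of one folder, nothing about the tree has to be known up front.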
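I know this answer is very late to the game, but you can do something like the following to get a flat list of all child SharePoint objects under a given parent directory.
This works because we keep extending a single list, rather than creating nested objects as we would by using the list.append() method while traversing a directory tree.
I'm sure there is room to improve the snippet below, but I believe it should help you achieve your goal.
Cheers,
rs311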
from office365.sharepoint.client_context import ClientContext

def get_items_in_directory(ctx_client: ClientContext,
                           directory_relative_uri: str,
                           recursive: bool = True):
    """
    This function provides a way to get all items in a directory in SharePoint, with
    the option to traverse nested directories to extract all child objects.

    :param ctx_client: office365.sharepoint.client_context.ClientContext object
        SharePoint ClientContext object.
    :param directory_relative_uri: str
        Path to directory in SharePoint.
    :param recursive: bool
        default = True
        Tells function whether or not to perform a recursive call.
    :return: list
        Returns a flattened array of all child file and/or folder objects
        given some parent directory. All items will be of the following types:
            - office365.sharepoint.file.File
            - office365.sharepoint.folder.Folder

    Examples
    ---------
    All examples assume you've already authenticated with SharePoint per
    documentation found here:
        - https://github.com/vgrem/Office365-REST-Python-Client#examples

    Assumed directory structure:
        some_directory/
            my_file.csv
            your_file.xlsx
            sub_directory_one/
                123.docx
                abc.csv
            sub_directory_two/
                xyz.xlsx

    directory = 'some_directory'

    # Non-recursive call
    extracted_child_objects = get_items_in_directory(ctx_client, directory, recursive=False)
    # extracted_child_objects would contain (my_file.csv, your_file.xlsx, sub_directory_one/, sub_directory_two/)

    # Recursive call
    extracted_child_objects = get_items_in_directory(ctx_client, directory, recursive=True)
    # extracted_child_objects would contain (my_file.csv, your_file.xlsx, sub_directory_one/, sub_directory_two/, sub_directory_one/123.docx, sub_directory_one/abc.csv, sub_directory_two/xyz.xlsx)
    """
    contents = list()

    # Load the subfolders of the requested directory
    folders = ctx_client.web.get_folder_by_server_relative_url(directory_relative_uri).folders
    ctx_client.load(folders)
    ctx_client.execute_query()

    if recursive:
        for folder in folders:
            # extend() keeps the result flat instead of nesting one list per folder
            contents.extend(
                get_items_in_directory(
                    ctx_client=ctx_client,
                    directory_relative_uri=folder.properties['ServerRelativeUrl'],
                    recursive=recursive)
            )

    contents.extend(folders)

    # Load the files of the requested directory
    files = ctx_client.web.get_folder_by_server_relative_url(directory_relative_uri).files
    ctx_client.load(files)
    ctx_client.execute_query()

    contents.extend(files)
    return contents
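The difference between list.extend() and list.append() that this answer relies on can be seen in isolation. The sketch below is only an illustration: `TREE`, `collect_with_extend` and `collect_with_append` are hypothetical names, and a small in-memory dict stands in for the SharePoint calls.

```python
from typing import Dict, List

# Hypothetical in-memory tree: folder name -> names of its child folders.
TREE: Dict[str, List[str]] = {
    "some_directory": ["sub_directory_one", "sub_directory_two"],
    "sub_directory_one": ["nested"],
    "sub_directory_two": [],
    "nested": [],
}


def collect_with_extend(name: str) -> list:
    contents = []
    for child in TREE[name]:
        # extend() splices the child's results into the same flat list
        contents.extend(collect_with_extend(child))
    contents.extend(TREE[name])
    return contents


def collect_with_append(name: str) -> list:
    contents = []
    for child in TREE[name]:
        # append() adds the child's whole result as ONE nested element
        contents.append(collect_with_append(child))
    contents.extend(TREE[name])
    return contents


flat = collect_with_extend("some_directory")
nested = collect_with_append("some_directory")
print(flat)    # ['nested', 'sub_directory_one', 'sub_directory_two']
print(nested)  # [[[], 'nested'], [], 'sub_directory_one', 'sub_directory_two']
```

With append(), every level of recursion wraps its results in another list, so the caller would have to flatten the structure afterwards.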