Saving JSON to a file in Azure Data Lake Storage Gen2

Question

In Databricks, using Python, I am making a GET request with the requests library, and the response is JSON.

Here is an example of the GET request:

json_data = requests.get("https://prod-noblehire-api-000001.appspot.com/job?").json()

I want to save the json_data variable as a file in Azure Data Lake Storage. I do not want to read it into a PySpark/Pandas DataFrame first and then save it.

If I were saving it to a local folder on my computer, I would use the following code:

import json

j = json.dumps(json_data)
# The with block closes the file on exit, so no explicit f.close() is needed.
with open("MyJsonFile.json", "w") as f:
    f.write(j)

However, since I want to save it in Azure Data Lake Storage, according to Microsoft's documentation I should use something like the following:

def upload_file_to_directory():
    try:

        file_system_client = service_client.get_file_system_client(file_system="my-file-system")

        directory_client = file_system_client.get_directory_client("my-directory")
        
        file_client = directory_client.create_file("uploaded-file.txt")
        local_file = open("C:\\file-to-upload.txt",'r')

        file_contents = local_file.read()

        file_client.append_data(data=file_contents, offset=0, length=len(file_contents))

        file_client.flush_data(len(file_contents))

    except Exception as e:
        print(e)

How can I combine these two pieces of code to save the variable as a file in ADLS? Also, is there a better way to do this?

Answer 1

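You don't actually need to save it locally. Instead, you can mount your ADLS storage account and then write the JSON content directly to it. Below is the code that worked for me.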

import requests
import json

json_data = requests.get("<YOUR_URL>").json()
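j = json.dumps(json_data)
# Write straight to the mounted ADLS path. On Databricks, local file APIs
# typically reach a mount through the /dbfs prefix (e.g. /dbfs/mnt/<name>).
# The with block closes the file on exit, so no explicit f.close() is needed.
with open("/<YOUR_MOUNT_POINT>/<FILE_NAME>.json", "w") as f:
    f.write(j)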
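
If you would rather not mount the storage account, the two snippets from the question can probably be combined by serializing the variable in memory and handing the bytes to the Data Lake client, with no local file in between. The following is only a sketch under assumptions: `upload_json` is a hypothetical helper name, `service_client` is assumed to be an already authenticated `DataLakeServiceClient` from the `azure-storage-file-datalake` package, and the file system, directory, and file names are placeholders taken from the question.

```python
import json


def upload_json(service_client, file_system, directory, file_name, payload):
    """Serialize payload to JSON and upload it as a single ADLS file.

    Returns the number of bytes sent. `service_client` is assumed to be
    an authenticated DataLakeServiceClient (or an object with the same methods).
    """
    # Encode in memory; nothing is written to the local disk.
    data = json.dumps(payload).encode("utf-8")

    file_system_client = service_client.get_file_system_client(file_system=file_system)
    directory_client = file_system_client.get_directory_client(directory)
    file_client = directory_client.get_file_client(file_name)

    # upload_data replaces the create_file / append_data / flush_data sequence;
    # overwrite=True replaces an existing file with the same name.
    file_client.upload_data(data, overwrite=True)
    return len(data)


# Example wiring (placeholders are assumptions, not values from the question):
# from azure.storage.filedatalake import DataLakeServiceClient
# service_client = DataLakeServiceClient(
#     account_url="https://<STORAGE_ACCOUNT>.dfs.core.windows.net",
#     credential="<ACCOUNT_KEY>",
# )
# upload_json(service_client, "my-file-system", "my-directory",
#             "MyJsonFile.json", json_data)
```

Passing the client in as a parameter keeps the authentication choice (account key, SAS token, service principal) outside the helper, and makes it easy to try the function with a stand-in object before pointing it at a real storage account.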

Solution 1
0 2022-09-15 08:08:29
