Consuming events from Azure Event Hubs by sequence number in Python

Question

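I have published multiple files to an Event Hub, and for another purpose I want to download one specific file from it. I have the file name and the sequence number.

I used this method: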

await client.receive(on_event=on_event,  starting_position="12856854")

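This downloads every file from position 12856854 onward.

But I only want to download one specific file.

For example, I published sample_data.xml, and its sequence number is 567890.

What I need is to download only the sample_data.xml file from the Event Hub.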

Answer 1

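In the line of code you mentioned, starting_position sets the point in the partition from which receiving starts. To start from the beginning of the partition, pass "-1" as follows: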

await client.receive(
    on_event=on_event,
    starting_position="-1",
)
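
To stop after a single event instead of reading everything from a position onward, one option is to start at the known sequence number and close the client as soon as that event arrives. The sketch below is not from the original answer. It assumes the azure-eventhub v5 async client, where an int starting_position is treated as a sequence number and a str as an offset; it also assumes the partition is known (sequence numbers are per partition) and that no checkpoint store overrides the starting position. The names make_on_event and fetch_event, the "$Default" consumer group, and the timeout are illustrative.

```python
import asyncio


def make_on_event(target_sequence_number, matches, done):
    """Build an on_event callback that keeps only the event with the target sequence number."""
    async def on_event(partition_context, event):
        if event is None:  # may be passed when max_wait_time is set
            return
        if event.sequence_number == target_sequence_number:
            matches.append(event)
            done.set()
        elif event.sequence_number > target_sequence_number:
            done.set()  # already past the target, stop waiting
    return on_event


async def fetch_event(conn_str, eventhub_name, partition_id, sequence_number, timeout=30):
    # Imported here so the callback above can be exercised without the SDK installed.
    from azure.eventhub.aio import EventHubConsumerClient

    matches, done = [], asyncio.Event()
    client = EventHubConsumerClient.from_connection_string(
        conn_str, consumer_group="$Default", eventhub_name=eventhub_name  # assumed group
    )
    async with client:
        task = asyncio.ensure_future(client.receive(
            on_event=make_on_event(sequence_number, matches, done),
            partition_id=partition_id,
            starting_position=sequence_number,  # int, so interpreted as a sequence number
            starting_position_inclusive=True,   # include the event at that number
        ))
        try:
            await asyncio.wait_for(done.wait(), timeout)
        except asyncio.TimeoutError:
            pass  # event not seen in time; fall through and return None
    await task  # leaving the async with block closes the client, which ends receive()
    return matches[0] if matches else None
```

With real connection details, something like asyncio.run(fetch_event(conn_str, "my-hub", "0", 567890)) should return the matching event (or None), whose payload can then be read with event.body_as_str().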

The script below reads the captured data files from your Azure Storage account and generates CSV files that you can easily open and view.

import os
import json

from azure.storage.blob import ContainerClient
from avro.datafile import DataFileReader
from avro.io import DatumReader


def processBlob2(filename):
    # Read every Avro record in the captured file and group the JSON bodies by 'id'.
    reader = DataFileReader(open(filename, 'rb'), DatumReader())
    records_by_id = {}
    try:
        for reading in reader:
            parsed_json = json.loads(reading["Body"])
            if 'id' not in parsed_json:
                return  # Stop if a record has no 'id' field.
            # Append every record, including the first one seen for each id.
            records_by_id.setdefault(parsed_json['id'], []).append(parsed_json)
    finally:
        reader.close()
    # Append one CSV line per record to <id>.csv in the working directory.
    for device, records in records_by_id.items():
        csv_name = os.path.join(os.getcwd(), str(device) + '.csv')
        with open(csv_name, "a") as deviceFile:
            for r in records:
                deviceFile.write(", ".join(str(v) for v in r.values()) + '\n')

def startProcessing():
    print('Processor started using path: ' + os.getcwd())
    # Create a blob container client.
    container = ContainerClient.from_connection_string("AZURE STORAGE CONNECTION STRING", container_name="BLOB CONTAINER NAME")
    blob_list = container.list_blobs() # List all the blobs in the container.
    for blob in blob_list:
        # Content_length == 508 is an empty file, so process only content_length > 508 (skip empty files).        
        if blob.size > 508:
            print('Downloaded a non empty blob: ' + blob.name)
            # Create a blob client for the blob.
            blob_client = container.get_blob_client(blob.name)
            # Construct a file name based on the blob name.
            cleanName = str.replace(blob.name, '/', '_')
            cleanName = os.getcwd() + '\\' + cleanName 
            with open(cleanName, "wb+") as my_file: # Open the file to write. Create it if it doesn't exist. 
                my_file.write(blob_client.download_blob().readall()) # Write blob contents into the file.
            processBlob2(cleanName) # Convert the file into a CSV file.
            os.remove(cleanName) # Remove the original downloaded file.
            # Delete the blob from the container after it's read.
            container.delete_blob(blob.name)

startProcessing()

For the procedure and more information, see the MS Docs.
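
If the goal is one specific event rather than a CSV of everything, the captured records can also be filtered by sequence number. The following is a minimal sketch, assuming the files were written by Event Hubs Capture, whose Avro records carry a SequenceNumber field next to Body. The helper names find_by_sequence_number and find_in_capture_file are hypothetical, and because sequence numbers are per partition, the file should come from the partition that holds the event.

```python
def find_by_sequence_number(records, sequence_number):
    """Return the first record whose SequenceNumber matches, or None if there is none."""
    for record in records:
        if record.get("SequenceNumber") == sequence_number:
            return record
    return None


def find_in_capture_file(path, sequence_number):
    # Imported here so the filter above can be used on plain dicts without avro installed.
    from avro.datafile import DataFileReader
    from avro.io import DatumReader

    reader = DataFileReader(open(path, "rb"), DatumReader())
    try:
        return find_by_sequence_number(reader, sequence_number)
    finally:
        reader.close()
```

For the example in the question, find_in_capture_file(cleanName, 567890) would return the captured record for sample_data.xml if that file contains it, and its "Body" field holds the published payload.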

Solution 1
0 2021-11-23 07:08:56
