How to receive the recent data only in event hub

In Event Hub, I have two scripts, a "sender" and a "receiver", that communicate with each other.

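The problem is that I seem to receive the data set I sent yesterday together with the one I just sent. I would like to control how much data I get back, either by time period or by number of events.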

The basic code is as follows (the post calls it sender.py, although it is the receiving side):


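import time
from azure.eventhub import EventHubClient, Offset

# ADDRESS, USER and KEY (Event Hub address and credentials) are not shown in the question
CONSUMER_GROUP = "$default"
# "-1" starts reading at the beginning of the partition, so older events are returned too
OFFSET = Offset("-1")
PARTITION = "0"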

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=0, offset=OFFSET)
    client.run()
    start_time = time.time()
    batch = receiver.receive(timeout=100)

    for event_data in batch[-10:]:
        print("Received: {}".format(event_data.body_as_str(encoding='UTF-8')))
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()

I just found a solution that uses the offset to control which event data is read.

The first step is to get the offset of each event.

The code is as follows:

import logging
import time
from azure.eventhub import EventHubClient, Offset

logger = logging.getLogger("azure")

ADDRESS = "amqps://xxx.servicebus.windows.net/xxx"
USER = "RootManageSharedAccessKey"
KEY = "xxx"

CONSUMER_GROUP = "$default"

#first, set offset to -1 to read all the event data
OFFSET = Offset("-1")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=5000, offset=OFFSET)
    client.run()
    start_time = time.time()
    print("**begin receive**")
    for event_data in receiver.receive(timeout=100):
        last_offset = event_data.offset.value
        last_sn = event_data.sequence_number
        # Print the offset and sequence number of each event
        print("Received: {}, last_offset: {}, last_sn: {}".format(
            event_data.body_as_str(encoding='UTF-8'), last_offset, last_sn))
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()

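Running it prints the offset and sequence number of every event in the partition. (The original post showed a screenshot of this output here.)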

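Now you know the offset of each event. Suppose you want the data from number 40 to number 53. The offset of number 40 is 237080, and the receiver returns the events that come after the offset you give it, so set the offset to a value just below 237080. Here it is set to 237079 in this line of code: OFFSET = Offset("237079")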
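The arithmetic above can be sketched in a few lines. This is only an illustration, not part of the original answer: `start_offset_for` and `in_range` are hypothetical helpers, and every pair in `recorded` except (40, "237080") is made up. It assumes the first pass was used to note down (number, offset) pairs.

```python
# Hypothetical (number, offset) pairs noted down from the first pass.
# Only (40, "237080") comes from the post; the other values are invented.
recorded = [(39, "237000"), (40, "237080"), (41, "237160"),
            (53, "238120"), (54, "238200")]


def start_offset_for(recorded, first_sn):
    """Return an offset string just below the offset of event `first_sn`."""
    offsets = dict(recorded)
    return str(int(offsets[first_sn]) - 1)


def in_range(sequence_number, first_sn, last_sn):
    """True if the event number lies in the wanted range (inclusive)."""
    return first_sn <= sequence_number <= last_sn


start = start_offset_for(recorded, 40)
print(start)  # 237079

# The receiver keeps returning events past the upper bound, so the
# upper limit (53 here) has to be enforced on the client side.
wanted = [sn for sn, _ in recorded if in_range(sn, 40, 53)]
print(wanted)
```

The resulting string is what would be passed to `Offset(...)`; the upper bound is checked per event inside the receive loop.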

The full receiver with the new offset:

import logging
import time
from azure.eventhub import EventHubClient, Offset

logger = logging.getLogger("azure")

ADDRESS = "amqps://xxx.servicebus.windows.net/xx"
USER = "RootManageSharedAccessKey"
KEY = "xxx"

CONSUMER_GROUP = "$default"

# Start just after offset 237079, i.e. at the event whose offset is 237080
OFFSET = Offset("237079")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=5000, offset=OFFSET)
    client.run()
    start_time = time.time()
    print("**begin receive**")
    for event_data in receiver.receive(timeout=100):
        last_offset = event_data.offset.value
        last_sn = event_data.sequence_number
        print("Received: {}, last_offset: {}, last_sn: {}".format(
            event_data.body_as_str(encoding='UTF-8'), last_offset, last_sn))
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()

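After running this code, only the events from the specified offset onward are returned. (The original post showed a screenshot of the output here.)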
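The question also asks about limiting by time period or by number of events. A client-side filter is one possible way to do that, sketched below. It is not from the original answer. The `(enqueued_time, body)` tuples are hypothetical stand-ins for received events, and `recent_only` and `last_n` are made-up helper names. With the real SDK the enqueue time would have to be read from each event. I believe the SDK version used above exposes it as `enqueued_time` and also accepts `Offset("@latest")` or a `datetime` as a starting point, but treat both as assumptions and check the documentation for your version.

```python
from datetime import datetime, timedelta


def recent_only(events, now, window):
    """Keep (enqueued_time, body) pairs enqueued within `window` before `now`."""
    cutoff = now - window
    return [(t, body) for t, body in events if t >= cutoff]


def last_n(events, n):
    """Keep only the last `n` events of a batch."""
    return events[-n:] if n > 0 else []


# Made-up sample data: one event from yesterday, one from a minute ago.
now = datetime(2019, 1, 2, 12, 0, 0)
events = [
    (datetime(2019, 1, 1, 12, 0, 0), "sent yesterday"),
    (datetime(2019, 1, 2, 11, 59, 0), "sent just now"),
]

recent = recent_only(events, now, timedelta(minutes=10))
for enqueued, body in recent:
    print(enqueued.isoformat(), body)
```

Filtering on the client still downloads the older events first, so starting the receiver at a later offset, as in the answer, remains the cheaper option when the partition holds a lot of data.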
