Azure Python SDK data tables

I need help getting through this workflow. I have two storage accounts, which I'll call storage1 and storage2.

storage1 contains a list of tables with some data in them, and I would like to loop through all those tables and copy their content into storage2. I tried azCopy but had no luck, as this feature is available only in azCopy v7.3 and I couldn't find that version for macOS M1. The other option is Data Factory, but it is too complex for what I want to achieve. So I decided to go with the Azure Python SDK.

As a library I am using TableServiceClient from azure.data.tables.

The code I wrote looks like this:

from azure.data.tables import TableServiceClient

# Connection string for the source account (storage1)
my_conn_str_out = 'storage1-Conn-Str'

table_service_client_out = TableServiceClient.from_connection_string(my_conn_str_out)

# Collect the names of all tables in storage1
list_table = []
for table in table_service_client_out.list_tables():
    list_table.append(table.table_name)

# Connection string for the destination account (storage2)
my_conn_str_in = 'Storage2-Conn-str'

table_service_client_in = TableServiceClient.from_connection_string(my_conn_str_in)

# Recreate each source table in storage2
for new_tables in table_service_client_out.list_tables():
    table_service_client_in.create_table_if_not_exists(new_tables.table_name)
    print(f'tables created successfully {new_tables.table_name}')

This is how I structured my code:

for table in table_service_client_out.list_tables():
    list_table.append(table.table_name)

I loop through all the tables in the storage account and append their names to a list.

Then:

for new_tables in table_service_client_out.list_tables():
    table_service_client_in.create_table_if_not_exists(new_tables.table_name)
    print(f'tables created successfully {new_tables.table_name}')

I create the same tables in storage2.

So far everything works just fine.

What I would like to achieve now is to query all the data in each table in storage1 and pass it to the respective table in storage2.

According to the Microsoft documentation, I can query tables using this:

query = table_service_client_out.query_tables(filter=table)

So I integrated it into my loop like this:

for table in table_service_client_out.list_tables():
    query = table_service_client_out.query_tables(filter=table)
    list_table.append(table.table_name)
    print(query)

When I run my Python code, I get back the iterator object's memory address rather than the data in the tables:

<iterator object azure.core.paging.ItemPaged at 0x7fcd90c8fbb0>
<iterator object azure.core.paging.ItemPaged at 0x7fcd90c8f7f0>
<iterator object azure.core.paging.ItemPaged at 0x7fcd90c8fd60>

I was wondering if there is a way to query all the data in my tables and pass it to storage2.
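What is being printed is a lazy iterator: azure.core.paging.ItemPaged only fetches rows once you loop over it, so printing the object shows the iterator itself. Note also that query_tables filters table names, not the entities inside a table; entities are read through a TableClient. A minimal sketch with the same azure-data-tables package (the connection string is the same placeholder as above):

from azure.data.tables import TableServiceClient

service_out = TableServiceClient.from_connection_string('storage1-Conn-Str')

# list_tables / query_tables return a lazy ItemPaged of TableItem objects;
# printing the ItemPaged shows only the iterator, looping fetches the pages
for table_item in service_out.list_tables():
    # entities live behind a TableClient, one per table
    table_client = service_out.get_table_client(table_item.table_name)
    for entity in table_client.list_entities():
        print(entity)  # TableEntity, a dict of PartitionKey/RowKey/properties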

Try this:

from azure.cosmosdb.table.tableservice import TableService, ListGenerator

table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')

# Query 100 items per request to avoid loading all data into memory at once
query_size = 100

# Save a page of data to storage2; if the current table has more data,
# recurse with the continuation marker
def queryAndSaveAllDataBySize(tb_name, resp_data: ListGenerator, table_out: TableService, table_in: TableService, query_size: int):
    for item in resp_data:
        # Remove etag and Timestamp appended by the table service
        del item.etag
        del item.Timestamp
        print("inserting data: " + str(item) + " into table: " + tb_name)
        table_in.insert_entity(tb_name, item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name, num_results=query_size, marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name, data, table_out, table_in, query_size)


tbs_out = table_service_out.list_tables()

for tb in tbs_out:
    # Create a table with the same name in storage2
    table_service_in.create_table(tb.name)
    # First query
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(tb.name, data, table_service_out, table_service_in, query_size)

Of course, this is a simple demo for your requirement. For more efficiency, you can also query table data by partition key and commit the entities in batches, as sketched below.
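For the batch idea, here is a minimal sketch with the same azure-cosmosdb-table SDK, assuming the entities have already had etag and Timestamp stripped as above. The insert_by_batch helper is illustrative, not part of the SDK; Entity Group Transactions accept at most 100 operations and a single PartitionKey per batch:

from itertools import groupby

from azure.cosmosdb.table.tableservice import TableService

def insert_by_batch(table_in: TableService, tb_name: str, entities):
    # Group by PartitionKey: a batch may only target one partition
    by_pk = lambda e: e.PartitionKey
    for _, group in groupby(sorted(entities, key=by_pk), key=by_pk):
        chunk = []
        for entity in group:
            chunk.append(entity)
            if len(chunk) == 100:  # a batch holds at most 100 operations
                with table_in.batch(tb_name) as batch:
                    for e in chunk:
                        batch.insert_entity(e)
                chunk = []
        if chunk:  # flush the remaining partial batch
            with table_in.batch(tb_name) as batch:
                for e in chunk:
                    batch.insert_entity(e)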

Let me know if you have any more questions.
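The snippet above uses the older azure-cosmosdb-table package. The same copy can also be written end to end with the azure-data-tables SDK the question already imports; a minimal sketch, assuming the dict-like entities can be upserted as-is (in this SDK the Timestamp and etag are kept in entity.metadata rather than in the entity itself, so they do not need to be deleted):

from azure.data.tables import TableServiceClient

service_out = TableServiceClient.from_connection_string('storage1-Conn-Str')
service_in = TableServiceClient.from_connection_string('Storage2-Conn-str')

for table_item in service_out.list_tables():
    name = table_item.table_name
    service_in.create_table_if_not_exists(name)
    src = service_out.get_table_client(name)
    dst = service_in.get_table_client(name)
    # ItemPaged follows continuation tokens transparently while iterating
    for entity in src.list_entities():
        dst.upsert_entity(entity)
    print(f'copied table: {name}')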
