Azure Python SDK storage table backup: return more than 1000 rows
Hope I can get some hints about this issue.
I have written the following code to back up a table from one storage account to another.
query_size = 100

#save data to storage2 and check if there is leftover data in the current table; if yes, recurse
def queryAndSaveAllDataBySize(tb_name, resp_data: ListGenerator, table_out: TableService, table_in: TableService, query_size: int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:" + tb_name)
        table_in.insert_or_replace_entity(tb_name, item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name, num_results=query_size, marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name, data, table_out, table_in, query_size)

tbs_out = table_service_out.list_tables()
print(tbs_out)
for tb in tbs_out:
    table = tb.name + today
    print(target_connection_string)
    #create table with same name in storage2
    table_service_in.create_table(table_name=table, fail_on_exist=False)
    #first query
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(table, data, table_service_out, table_service_in, query_size)
This should be a simple script that loops over the items in a table and copies them into another storage account. I have this exact code already running in an Azure Function and it works just fine.
Today I tried to run it against several storage accounts; it runs fine for a bit, but then it stops and throws this error:
Traceback (most recent call last):
File "/Users/users/Desktop/AzCopy/blob.py", line 205, in <module>
queryAndSaveAllDataBySize(table,data,table_service_out,table_service_in,query_size)
File "/Users/users/Desktop/AzCopy/blob.py", line 191, in queryAndSaveAllDataBySize
data = table_out.query_entities(table_name=tb_name,num_results=query_size,marker=resp_data.next_marker)
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/tableservice.py", line 738, in query_entities
resp = self._query_entities(*args, **kwargs)
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/tableservice.py", line 801, in _query_entities
return self._perform_request(request, _convert_json_response_to_entities,
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/tableservice.py", line 1106, in _perform_request
return super(TableService, self)._perform_request(request, parser, parser_args, operation_context)
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/common/storageclient.py", line 430, in _perform_request
raise ex
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/common/storageclient.py", line 358, in _perform_request
raise ex
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/common/storageclient.py", line 343, in _perform_request
_http_error_handler(
File "/Users/users/miniforge3/lib/python3.9/site-packages/azure/cosmosdb/table/common/_error.py", line 115, in _http_error_handler
raise ex
azure.common.AzureMissingResourceHttpError: Not Found
{"odata.error":{"code":"TableNotFound","message":{"lang":"en-US","value":"The table specified does not exist.\nRequestId:bbdb\nTime:2021-09-29T16:42:17.6078186Z"}}}
I do not understand exactly why this is happening, because all it has to do is copy from one side to the other.
Please, if anyone can help fix this; I am totally burned out and can't think anymore :(
UPDATE: Reading my code again, I figured out I have this limitation here:
#query 100 items per request, to avoid consuming too much memory by loading all data at once
query_size = 100
When I check my storage table, in fact I have only 100 rows. But I couldn't find anywhere how I can set the query size to load all the data at once.
As far as I can understand, after I reach the query_size limit I need to look for the next x_ms_continuation token to get the next batch.
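That continuation pattern can be sketched with a plain loop over a simulated paged source; the names `fetch_page` and `copy_all` below are hypothetical stand-ins, not part of the Azure SDK, and the integer marker plays the role of the x_ms_continuation token / `next_marker`:

```python
# Minimal sketch of continuation-token paging, under the assumption that
# the service hands back one page plus a marker for the next page.
def fetch_page(items, marker, page_size):
    """Return one page of items and the marker for the next page (None = done)."""
    page = items[marker:marker + page_size]
    next_marker = marker + page_size if marker + page_size < len(items) else None
    return page, next_marker

def copy_all(items, page_size):
    """Drain every page, following the marker until it is exhausted."""
    copied = []
    marker = 0
    while marker is not None:
        page, marker = fetch_page(items, marker, page_size)
        copied.extend(page)  # stand-in for insert_or_replace_entity per item
    return copied

# 250 rows with page_size 100 takes three round-trips but copies everything
assert copy_all(list(range(250)), 100) == list(range(250))
```

The real SDK wraps this loop for you: the `ListGenerator` returned by `query_entities` exposes `next_marker`, which you pass back in as `marker` to get the following page.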
I have this code right now:
query_size = 100

#save data to storage2 and check if there is leftover data in the current table; if yes, recurse
def queryAndSaveAllDataBySize(tb_name, resp_data: ListGenerator, table_out: TableService, table_in: TableService, query_size: int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:" + tb_name)
        table_in.insert_or_replace_entity(tb_name, item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name, num_results=query_size, marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name, data, table_out, table_in, query_size)

tbs_out = table_service_out.list_tables()
print(tbs_out)
for tb in tbs_out:
    table = tb.name + today
    print(target_connection_string)
    #create table with same name in storage2
    table_service_in.create_table(table_name=table, fail_on_exist=False)
    #first query
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(table, data, table_service_out, table_service_in, query_size)
According to the Microsoft documentation, the marker check should detect whether there is a continuation token and, if so, rerun the query. But this is not happening in my case: once I reach the query_size, the code throws the error.
Can anyone help, please?
Try replacing the for block with the code below, which creates the table with the same name in storage 2. Note that in your original loop the recursive call passes `table` (which is `tb.name + today`) as `tb_name`, so the continuation query runs against the source account with a table name that only exists in the destination, which is why you get TableNotFound:
for tb in tbs_out:
    #create table with same name in storage2
    table_service_in.create_table(tb.name)
    #first query
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(tb.name, data, table_service_out, table_service_in, query_size)
Below is the full sample code:
from azure.cosmosdb.table.tableservice import TableService, ListGenerator

table_service_out = TableService(account_name='', account_key='')
table_service_in = TableService(account_name='', account_key='')

#query 100 items per request, to avoid consuming too much memory by loading all data at once
query_size = 100

#save data to storage2 and check if there is leftover data in the current table; if yes, recurse
def queryAndSaveAllDataBySize(tb_name, resp_data: ListGenerator, table_out: TableService, table_in: TableService, query_size: int):
    for item in resp_data:
        #remove etag and Timestamp appended by table service
        del item.etag
        del item.Timestamp
        print("insert data:" + str(item) + " into table:" + tb_name)
        table_in.insert_entity(tb_name, item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=tb_name, num_results=query_size, marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(tb_name, data, table_out, table_in, query_size)

tbs_out = table_service_out.list_tables()
for tb in tbs_out:
    #create table with same name in storage2
    table_service_in.create_table(tb.name)
    #first query
    data = table_service_out.query_entities(tb.name, num_results=query_size)
    queryAndSaveAllDataBySize(tb.name, data, table_service_out, table_service_in, query_size)
Up to here this should work properly. If you still have issues with query_size, you can instead pull the whole table's data at once: rather than setting query_size = 100, query without a page size, materialize the generator into a list, and index it, for example to get the 100th record:
tasks = table_service.query_entities('tasktable')
lst = list(tasks)
print(lst[99])
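The reason `list(tasks)` returns every row is that iterating the generator transparently fetches page after page behind the scenes. The generator below simulates that behavior; it is a sketch, not the SDK's actual `ListGenerator`:

```python
# Sketch of a paged generator: each chunk stands in for one service
# round-trip, but the caller just iterates and never sees the pages.
def paged_entities(rows, page_size):
    for start in range(0, len(rows), page_size):
        yield from rows[start:start + page_size]

tasks = paged_entities(list(range(150)), 100)
lst = list(tasks)  # drains all pages, like list(table_service.query_entities(...))
print(len(lst))    # 150, i.e. more than a single 100-row page
```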
Also check the sample below from azure-sdk-for-python:
def sample_query_entities_values(self):
    from azure.data.tables import TableClient
    from azure.core.exceptions import HttpResponseError

    print("Entities with 25 < Value < 50")
    # [START query_entities]
    with TableClient.from_connection_string(self.connection_string, self.table_name) as table_client:
        try:
            parameters = {u"lower": 25, u"upper": 50}
            name_filter = u"Value gt @lower and Value lt @upper"
            queried_entities = table_client.query_entities(
                query_filter=name_filter, select=[u"Value"], parameters=parameters
            )

            for entity_chosen in queried_entities:
                print(entity_chosen)

        except HttpResponseError as e:
            print(e.message)
    # [END query_entities]