
Python - loop through N records at a time and then start again

I'm trying to write a script that calls the Google Translation API to translate each line of an Excel file that has 1000 lines.

I'm using pandas to load the file and read the values from a specific column. I then append those values to a list and use the Google API to translate them:

import os
from google.cloud import translate_v2 as translate
import pandas as pd
from datetime import datetime

# Variable for GCP service account credentials

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'path to credentials json'

# Path to the file

filepath = r'../file.xlsx'

# Instantiate the Google Translation API Client

translate_client = translate.Client()

# Read all the information from the Excel file within 'test' sheet name

df = pd.read_excel(filepath, sheet_name='test')

# Define an empty list

elements = []

# Loop over the data frame and append each value of the 'EN' column to the list

for i in df.index:
    elements.append(df['EN'][i])

# Translate the whole list in a single call
# (with 1000 items this is the request that fails with the error below)
output = translate_client.translate(
    elements,
    target_language='fr'
)

result = [
    element['translatedText'] for element in output
]

print("The values corresponding to key : " + str(result))

After appending, the list contains 1000 elements. The problem is that the Google Translation API rejects a request that contains too many "segments", as they call them, and returns the error below:

400 POST https://translation.googleapis.com/language/translate/v2: Too many text segments

I've investigated it and found that sending 100 lines at a time (in my case) would be a solution. Now I am a bit stuck.

How would I write the loop to iterate over 100 lines at a time, translate those 100 lines, do something with the result, and then proceed with the next 100, and so on until it reaches the end?

Assuming you are able to pass a list into a single translate call, perhaps you could do something like this:

# Define a helper to step through the list in chunks
def chunker(seq, size):
    return (seq[pos : pos + size] for pos in range(0, len(seq), size))

# Then iterate over the chunks and collect the results
output = []
for chunk in chunker(elements, 100):
    temp = translate_client.translate(
        chunk,
        target_language='fr'
    )
    output.extend(temp)
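
If you also want to pull out the translated strings as you go, here is a minimal sketch of the same idea that can be run without credentials. The names `translate_in_chunks` and `fake_translate` are my own and not part of the Google client; `fake_translate` only stands in for `translate_client.translate`, assuming (as the question's code does) that the call returns one dict with a `'translatedText'` key per input string.

```python
def chunker(seq, size):
    """Yield consecutive slices of seq, each at most `size` items long."""
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))


def translate_in_chunks(translate_fn, texts, size=100):
    """Call translate_fn once per chunk and return the translated strings in order."""
    translated = []
    for chunk in chunker(texts, size):
        # Assumption: translate_fn returns one dict per input string
        response = translate_fn(chunk)
        translated.extend(item['translatedText'] for item in response)
    return translated


# Hypothetical stand-in for translate_client.translate, used only to show the flow
def fake_translate(chunk):
    return [{'translatedText': text.upper(), 'input': text} for text in chunk]


sample = [f"line {n}" for n in range(250)]
result = translate_in_chunks(fake_translate, sample, size=100)
print(len(result))  # 250
```

With the real client you would pass something like `lambda chunk: translate_client.translate(chunk, target_language='fr')` as `translate_fn`. Since the results come back in the same order as `elements`, they should line up with the rows of the data frame if you want to write them into a new column.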
