Python - Loop over N records at a time, then start again

Question

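I am trying to write a script that calls the Google Translation API to translate every row of an Excel file that has 1000 rows.

I am using pandas to load the file and read the values from a specific column, then I append the values from the data frame to a list, and then I use the Google API to translate them: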

import os
from google.cloud import translate_v2 as translate
import pandas as pd
from datetime import datetime

# Variable for GCP service account credentials

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'path to credentials json'

# Path to the file

filepath = r'../file.xlsx'

# Instantiate the Google Translation API Client

translate_client = translate.Client()

# Read all the information from the Excel file within 'test' sheet name

df = pd.read_excel(filepath, sheet_name='test')

# Define an empty list

elements = []

# Loop the data frame and append the list


for i in df.index:
    elements.append(df['EN'][i])

# Translate the whole list in a single request
# (this is the call that fails when the list holds 1000 items)
output = translate_client.translate(
    elements,
    target_language='fr'
)


result = [
    element['translatedText'] for element in output
]

print("The values corresponding to key : " + str(result))

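After I append to the list, the total number of elements is 1000. The problem with the Google Translation API is that if you send too many segments, as they call them, it returns the following error:

400 POST https://translation.googleapis.com/language/translate/v2: Too many text segments

I have looked into it and found that sending 100 rows at a time (in my case) would be a solution. Now I am a bit stuck.

How would I write the loop so that it iterates over 100 rows at a time, translates those 100 rows, does something with the result, then moves on to the next 100 rows, and so on until it reaches the end?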

Answer 1

Assuming you can pass a list into a single translate call, perhaps you could do something like this:

# Define a helper to step through the list in chunks of at most `size` items
def chunker(seq, size):
    return (seq[pos : pos + size] for pos in range(0, len(seq), size))

# Then iterate over the chunks and handle each one accordingly
output = []
for chunk in chunker(elements, 100):
    temp = translate_client.translate(
        chunk,
        target_language='fr'
    )
    output.extend(temp)
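
To check the batching logic without calling the API, here is a minimal sketch that runs on its own. Note that `fake_translate` and `translate_in_batches` are hypothetical names introduced only for this illustration; `fake_translate` stands in for `translate_client.translate(chunk, target_language='fr')` and only mimics the list-of-dicts shape (with a `translatedText` key) that the question's code relies on.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunker(seq: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of seq, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for pos in range(0, len(seq), size):
        yield seq[pos:pos + size]


def translate_in_batches(
    texts: Sequence[str],
    translate_batch: Callable[[Sequence[str]], List[Dict[str, str]]],
    batch_size: int = 100,
) -> List[Dict[str, str]]:
    """Call translate_batch once per chunk; results keep the input order."""
    results: List[Dict[str, str]] = []
    for chunk in chunker(texts, batch_size):
        results.extend(translate_batch(chunk))
    return results


def fake_translate(batch: Sequence[str]) -> List[Dict[str, str]]:
    # Hypothetical stand-in for the real client call (an assumption for
    # this sketch); it just prefixes each input so the order is visible.
    return [{"input": text, "translatedText": f"fr:{text}"} for text in batch]


# 250 sample rows, so the last chunk is shorter than the batch size
elements = [f"line {i}" for i in range(250)]

batch_sizes = [len(chunk) for chunk in chunker(elements, 100)]
output = translate_in_batches(elements, fake_translate, batch_size=100)
result = [item["translatedText"] for item in output]

print(batch_sizes)
print(len(result), result[0], result[-1])
```

With the real client you would presumably pass something like `lambda batch: translate_client.translate(batch, target_language='fr')` as `translate_batch`. Because the results come back in the same order as the input, you could build the list with `elements = df['EN'].tolist()` instead of the append loop and then write the translations back with `df['FR'] = result` (the `FR` column name is just an example).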

Solution 1
1 accepted 2019-11-16 19:28:13