I have the following code that goes row by row through my dataframe and translates a specific column into English, but when I run it, each value in the resulting new column 'translatedv4' holds the entire row instead of just the translated text. I am new to looping through entire dataframes rather than lists, so that may be the issue.
Example of a single value (I just want the column to show "I'm thinking this...")
Comments            Ich glaube das...
Translations        DE
Race / Ethnicity    White
Count2              91
translated          I'm thinking this because I'm nearing retireme...
Current code:
from googletrans import Translator
import pandas as pd
import xlsxwriter
import xlrd
import copy
##################TRANSLATION
translator = Translator()
file = r"xxxx"
#dt2 = translator.detect(text2)
df = pd.read_excel(file, sheet_name = 'Sheet1', converters={'Comments':str}).fillna(0)
df = df[df['Comments'] != 0]
translatedList = []
for index, row in df.iterrows():
    # REINITIALIZE THE API
    translator = Translator()
    newrow = copy.deepcopy(row)
    try:
        # translate the 'text' column
        translated = translator.translate(row['Comments'], dest='en')
        newrow['translated'] = translated.text
    except Exception as e:
        print(str(e))
        continue
    translatedList.append(newrow)
df = df.assign(translatedv4 = translatedList)
I'm not quite sure about your problem, so I hope this is what you're looking for. I do think you're not approaching it in the best way, however. Generally with pandas, you'll want to vectorize your solution or write a function that you pass to df.apply. Here are three solutions of increasing complexity. The first one uses a lambda function, which works but doesn't handle exceptions. The second one defines a normal function, which lets us handle them easily. The last solution uses ratelimit and tqdm, which are nice when working with APIs and dataframes.
from googletrans import Translator
import pandas as pd
df = pd.DataFrame({
    'German': ['ich glaube das', 'schadenfreude', 'schnappsidee']
})
translator = Translator()
df['English'] = df['German'].apply(
    lambda sent: translator.translate(sent, dest='en', src='de').text
)
print(df)
           German         English
0  ich glaube das  I believe that
1   schadenfreude   malicious joy
2    schnappsidee   snapping idea
from googletrans import Translator
import pandas as pd
import numpy as np  # needed for np.nan below

def get_trans(sent):
    try:
        return translator.translate(sent, dest='en', src='de').text
    except Exception as e:
        # report the error and leave the cell empty
        print(e)
        return np.nan

df = pd.DataFrame({
    'German': ['ich glaube das', 'schadenfreude', 'schnappsidee', np.nan]
})
translator = Translator()
df['English'] = df['German'].apply(get_trans)
print(df)
'float' object is not iterable
           German         English
0  ich glaube das  I believe that
1   schadenfreude   malicious joy
2    schnappsidee   snapping idea
3             NaN             NaN
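The 'float' object is not iterable message printed above appears to come from the NaN row: NaN is a float, and the translator expects a string. Here is a minimal sketch of a guard that skips non-string values before any request is made. StubTranslator is a hypothetical offline stand-in for googletrans's Translator (it only mimics the .translate(...).text shape), so the snippet runs without network access; the isinstance check is the only point being illustrated.

```python
from types import SimpleNamespace


class StubTranslator:
    """Hypothetical offline stand-in for googletrans.Translator."""

    def __init__(self):
        self.calls = 0

    def translate(self, text, dest="en", src="de"):
        self.calls += 1
        # Mimic a result object that exposes the translation as .text
        return SimpleNamespace(text=f"[{src}->{dest}] {text}")


translator = StubTranslator()


def get_trans(sent):
    # NaN cells arrive as floats; skip anything that is not text
    if not isinstance(sent, str):
        return None
    try:
        return translator.translate(sent, dest="en", src="de").text
    except Exception as e:
        print(e)
        return None


results = [get_trans(s) for s in ["ich glaube das", float("nan")]]
print(results)
```

With the DataFrame above it would be applied the same way, df['English'] = df['German'].apply(get_trans). pandas treats the returned None as a missing value, and no request is spent on empty cells.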
When working with APIs, I can really recommend the fantastic ratelimit library. It keeps you from sending too many requests in a given period, and with sleep_and_retry it waits and retries when the limit is hit instead of failing. I also added tqdm for a progress bar, which is nice if you have a lot of data.
from googletrans import Translator
import pandas as pd
import numpy as np  # needed for np.nan below
from ratelimit import limits, sleep_and_retry
from tqdm.autonotebook import tqdm
# from tqdm import tqdm <- use this instead if you're not using jupyter

FIFTEEN_MINUTES = 900
tqdm.pandas()  # registers progress_apply on pandas objects

# allow at most 15 calls per 15 minutes; sleep until the window resets
@sleep_and_retry
@limits(calls=15, period=FIFTEEN_MINUTES)
def get_trans(sent):
    try:
        return translator.translate(sent, dest='en', src='de').text
    except Exception as e:
        print(e)
        return np.nan

df = pd.DataFrame({
    'German': ['ich glaube das', 'schadenfreude', 'schnappsidee', np.nan]
})
translator = Translator()
df['English'] = df['German'].progress_apply(get_trans)
print(df)
           German         English
0  ich glaube das  I believe that
1   schadenfreude   malicious joy
2    schnappsidee   snapping idea
3             NaN             NaN
I think you have a small mistake in your code, here:
translatedList.append(newrow)
You append the full row to your list, while you want to append only the new value, i.e.
translatedList.append(translated.text)
But be careful: if any exception occurs, translatedList will end up shorter than your DataFrame index, so it can no longer be assigned as a column. You should probably do something like this:
    try:
        # translate the 'text' column
        translated = translator.translate(row['Comments'], dest='en')
        translatedList.append(translated.text)
    except Exception as e:
        print(str(e))
        # keep the list aligned with the rows by adding a placeholder
        translatedList.append('ERROR')
        continue
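Put together, the corrected loop might look like the sketch below. It appends exactly one entry per input value, so the resulting list has the same length as the column it came from. translate_all and StubTranslator are hypothetical names introduced for illustration; the stub replaces googletrans's Translator so the example runs offline, and it raises on a non-string input to exercise the error path.

```python
from types import SimpleNamespace


class StubTranslator:
    """Hypothetical offline stand-in for googletrans.Translator."""

    def translate(self, text, dest="en"):
        if not isinstance(text, str):
            raise TypeError("text must be a string")
        # Upper-casing stands in for a real translation
        return SimpleNamespace(text=text.upper())


def translate_all(comments, translator, placeholder="ERROR"):
    """Return one translation (or the placeholder) per input comment."""
    translated_list = []
    for comment in comments:
        try:
            result = translator.translate(comment, dest="en")
            translated_list.append(result.text)
        except Exception as e:
            print(e)
            # Append a placeholder so the list stays aligned with the rows
            translated_list.append(placeholder)
    return translated_list


comments = ["Ich glaube das", 0, "Danke"]
translated = translate_all(comments, StubTranslator())
print(translated)
```

With the DataFrame from the question, the call would be df['translatedv4'] = translate_all(df['Comments'], Translator()), assuming the real Translator is imported from googletrans.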