
Storing processed text in pandas dataframe

I've used gensim for text summarization in Python. I want to store the summarized output in a new column of the same dataframe.

I've used this code:

for n, row in df_data_1.iterrows():
    text = df_data_1['Event Description (SAP)']
    print(text)
    df_data_1['Summary'] = summarize(text)
print(df_data_1['Summary'])

The error is raised on line 4 of this code: TypeError: expected string or bytes-like object.

How can I store the processed text in the pandas dataframe?

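If it's not a string or bytes-like object, what is it? Inside the loop, df_data_1['Event Description (SAP)'] selects the whole column (a pandas Series), not the text of the current row, so summarize receives a Series instead of a string. You could check what summarize returns for a single value and move forward from there.

test_text = df_data_1['Event Description (SAP)'].iloc[0]
print(type(summarize(test_text)))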
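
To make the type difference concrete, here is a minimal sketch. The sample rows are made up for illustration and are not the asker's data. It shows that selecting the column gives a Series, while .iloc[0] (or the row yielded by iterrows()) gives a single string.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe.
df_data_1 = pd.DataFrame(
    {
        "Event Description (SAP)": [
            "Pump failed. Seal replaced.",
            "Valve stuck. Cleaned and reset.",
        ]
    }
)

column = df_data_1["Event Description (SAP)"]
first_cell = column.iloc[0]

print(type(column))      # the whole column: a pandas Series
print(type(first_cell))  # one row's text: a plain str

# Inside iterrows(), the per-row value comes from `row`, not from the dataframe.
row_values = [row["Event Description (SAP)"] for n, row in df_data_1.iterrows()]
print(row_values)
```

So if the loop is kept, the text to summarize would be row['Event Description (SAP)'] rather than the full column.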

Another remark: typically you'd want to avoid looping over a dataframe. If you want to apply a function to every value in a column, use apply() on that column as follows:

df_data_1['Summary'] = df_data_1['Event Description (SAP)'].apply(summarize)
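
As a rough end-to-end sketch of this approach: the summarize function below is a hypothetical stand-in (it only keeps the first sentence) so that the example runs without gensim, and the sample rows and the safe_summarize helper are likewise assumptions. The helper is there because a missing cell (None/NaN) is not a string either and would likely trigger the same TypeError.

```python
import pandas as pd


def summarize(text: str) -> str:
    """Hypothetical stand-in for the real summarizer: keeps the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


# Made-up sample data, including one missing description.
df_data_1 = pd.DataFrame(
    {
        "Event Description (SAP)": [
            "Pump failed. Seal replaced.",
            None,
            "Valve stuck. Cleaned and reset.",
        ]
    }
)


def safe_summarize(value):
    # Only strings are passed on; missing or non-string cells get an empty summary.
    return summarize(value) if isinstance(value, str) else ""


# apply() calls the function once per cell and returns a new Series,
# which is then stored as a new column in the same dataframe.
df_data_1["Summary"] = df_data_1["Event Description (SAP)"].apply(safe_summarize)
print(df_data_1["Summary"])
```

Replacing the stand-in with the real summarize should leave the rest unchanged, provided it accepts one string and returns one string.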
