Storing processed text in pandas dataframe

I've used gensim for text summarization in Python. I want the summarized output to be stored in a separate column of the same dataframe.

I've used this code:

for n, row in df_data_1.iterrows():
    text = df_data_1['Event Description (SAP)']
    print(text)
    df_data_1['Summary'] = summarize(text)
print(df_data_1['Summary'])

The error occurs on line 4 of this code: TypeError: expected string or bytes-like object.

How can I store the processed text in the pandas dataframe?

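If it's not a string or bytes-like object, what is it? Note that inside the loop, df_data_1['Event Description (SAP)'] selects the entire column (a pandas Series), not the text of the current row. You could check the type that your summarize function returns for a single cell and move forward from there.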

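from gensim.summarization import summarize  # assumes gensim < 4.0
# Take the text of the first row only, then inspect what summarize returns
test_text = df_data_1['Event Description (SAP)'].iloc[0]
print(type(summarize(test_text)))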
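To see what the loop in the question actually hands to summarize, a small sketch can compare the two lookups side by side. The dataframe contents below are invented for illustration; only the column name is taken from the question, and summarize itself is not called.

```python
import pandas as pd

# Toy data, invented for illustration; only the column name matches the question.
df_data_1 = pd.DataFrame({
    "Event Description (SAP)": [
        "Pump failed. Replaced seal. Restarted unit.",
        "Valve leaking. Tightened flange.",
    ]
})

for n, row in df_data_1.iterrows():
    # What the question's loop reads: the whole column, on every iteration.
    whole_column = df_data_1["Event Description (SAP)"]
    # The text of the current row only.
    single_cell = row["Event Description (SAP)"]
    print(type(whole_column).__name__, type(single_cell).__name__)
```

The first type is a Series and the second a plain string, which would be consistent with a function that expects a string raising the TypeError above.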

Another remark: typically you'd want to avoid looping over a dataframe. If you want to apply a function to every value in a column, use apply() on that column as follows:

df_data_1['Summary'] = df_data_1['Event Description (SAP)'].apply(summarize)
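As a self-contained sketch of the apply() approach, the example below runs without gensim installed. The summarize function here is a hypothetical stand-in that just keeps the first sentence (it is not gensim's algorithm), and the sample rows are made up; in real use you would pass gensim's summarize instead.

```python
import pandas as pd


def summarize(text: str) -> str:
    """Hypothetical stand-in for gensim's summarize: keeps the first sentence only."""
    return text.split(". ")[0].rstrip(".") + "."


# Toy data, invented for illustration; only the column names match the question.
df_data_1 = pd.DataFrame({
    "Event Description (SAP)": [
        "Pump failed. Replaced seal. Restarted unit.",
        "Valve leaking. Tightened flange.",
    ]
})

# apply() calls summarize once per cell, so each call receives a single string.
df_data_1["Summary"] = df_data_1["Event Description (SAP)"].apply(summarize)

print(df_data_1["Summary"].tolist())
# ['Pump failed.', 'Valve leaking.']
```

Because apply() passes one cell at a time, the function never sees the whole Series, and the result lines up row by row with the original column.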
