Python - Pandas - Import Excel file, iterate through each row, add new value, and add to dataframe
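I have an Excel file with project codes and an abstract field that I need to import, so that I can run a simple text summarizer on each abstract and then add the result to the dataframe.
My Excel dataset looks like this: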
[Proj_Number] | [Abstract]
JJF-123 | Diabetes is a serious chronic condition.
JFR-223 | Cardiovascular disease is also a chronic condition.
JF3-334 | Don't forget about asthma and how much it sucks.
After importing the data, I want to apply my text summarizer and get the following:
[Proj_Number] | [Abstract] | [Ab_keywords]
JJF-123 | Diabetes is a chronic condition. | Diabetes, chronic condition
JFR-223 | COPD is a also chronic condition. | COPD, chronic condition
JF3-334 | Don't forget about asthma too. | asthma, forgot
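I know my code is wrong, but I just don't know how to iterate through each row, get the keywords from each abstract, add them to the dataframe, and export it.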
from gensim.summarization.summarizer import summarize
from gensim.summarization import keywords
import pandas as pd

dataset = pd.read_excel('abstracts.xlsx', encoding="ISO-8859-1")
df = pd.DataFrame(dataset)
cols = [1, 2]
df = df[df.columns[cols]]
for d in df:
    d = keywords(d, ratio=0.15, split=True)
    print(d)
You don't want to go through the rows of df with for d in df:
Iterating over a dataframe that way yields its column labels, not its rows.
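To illustrate the point, here is a minimal sketch of what a plain for loop over a dataframe produces. The small dataframe is built by hand from the sample data in the question instead of being read from the Excel file, which is an assumption made only to keep the example self-contained.

```python
import pandas as pd

# Hand-built stand-in for the imported Excel data (assumption for illustration).
df = pd.DataFrame({
    "Proj_Number": ["JJF-123", "JFR-223"],
    "Abstract": [
        "Diabetes is a serious chronic condition.",
        "Cardiovascular disease is also a chronic condition.",
    ],
})

# Looping over the dataframe itself walks the column labels, not the rows.
labels = [d for d in df]
print(labels)  # ['Proj_Number', 'Abstract']
```

So in the question's loop, keywords() would receive the column names as strings, never the abstract text.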
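pandas has a way to apply a function to each element of a column (or to each row of a dataframe) and return a Series of the results: the apply method.
Renaming the columns of your dataframe as appropriate,
df['Ab_keywords'] = df['Abstract'].apply(lambda text: keywords(text, ratio=0.15, split=True))
should work.
Here, the lambda function is applied to each row of df['Abstract'],
and it receives the value in that row as its argument.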
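The apply pattern described above can be sketched end to end as follows. Since gensim.summarization is not available in gensim 4.0 and later, this sketch swaps in simple_keywords, a hypothetical stand-in that just keeps words longer than five letters; it is not gensim's algorithm. The dataframe is again built by hand from the question's sample data.

```python
import pandas as pd

def simple_keywords(text):
    """Hypothetical stand-in for gensim's keywords(): keep words longer than 5 letters."""
    words = [w.strip(".,").lower() for w in text.split()]
    return [w for w in words if len(w) > 5]

# Hand-built stand-in for pd.read_excel('abstracts.xlsx') (assumption for illustration).
df = pd.DataFrame({
    "Proj_Number": ["JJF-123", "JFR-223", "JF3-334"],
    "Abstract": [
        "Diabetes is a serious chronic condition.",
        "Cardiovascular disease is also a chronic condition.",
        "Don't forget about asthma and how much it sucks.",
    ],
})

# apply calls the lambda once per abstract and collects the results in a new column.
df["Ab_keywords"] = df["Abstract"].apply(lambda text: simple_keywords(text))

print(df[["Proj_Number", "Ab_keywords"]])
```

With the real data, the result could then be written back out with something like df.to_excel('abstracts_with_keywords.xlsx', index=False), where the output file name is only a placeholder.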