pandas: append a column from a DataFrame to a list in Python
I have a DataFrame as below:
id | text |
---|---|
1 | aaaa |
2 | bbbb |
I read the above into a DataFrame, and I need to convert the text column to a list in order to perform NER extraction:
tags = []
for i in df['text'].tolist():
    tdoc = nlp(i)
    for tags in tdoc.ents:
        tags.append((df.id,tags.text,tags.label_))
The above works and I get the NER tags, which I would like to export to a DataFrame along with the 'id' column from the original DataFrame:
df_tag = pd.DataFrame.from_records(tags, columns = ['id', 'name', 'type'])
The problem here is that my id column repeats, as below:
id | name | type |
---|---|---|
1 2 | NER A | Type A |
1 2 | NER B | Type B |
Desired output:
id | name | type |
---|---|---|
1 | NER A | Type A |
2 | NER B | Type B |
The problem comes from the fact that df.id returns the whole Series, so every tuple you append carries the entire column (index included) rather than a single id value.
Also, on lines 4 and 5 of your code, the loop variable should be tag, not tags; otherwise it overwrites the tags list you are appending to.
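To make the diagnosis concrete, here is a small sketch using the two-row frame from the question (no NER involved; the tuple contents are placeholders). It only illustrates that `df.id` is the whole column rather than the id of the row being processed:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "text": ["aaaa", "bbbb"]})

# df.id is the entire column, regardless of which text is being processed.
whole_column = df.id
print(type(whole_column).__name__)  # Series
print(len(whole_column))            # 2

# Putting it into a tuple stores the whole Series in that single field.
record = (df.id, "NER A", "Type A")
first_field_is_series = isinstance(record[0], pd.Series)

# .values drops the index, but the array still holds every id.
all_ids = df.id.values.tolist()
```

Every tuple built this way carries both ids, which is why the id cell shows `1 2` on each output row.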
Try it like this:
tags = []
for i in df['text'].tolist():
    tdoc = nlp(i)
    for tag in tdoc.ents:
        tags.append((df.id.values,tag.text,tag.label_))
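Note that `df.id.values` is still the full array of ids, so the snippet above may keep placing every id in each row. One possible way to get a single id per entity is to iterate over the id and text columns together. The following is a minimal sketch under stated assumptions: `nlp` here is a hypothetical stand-in (it treats each whitespace-separated token as an entity labelled `TOKEN`) so the example runs without a spaCy model, and the sample text is invented for illustration. With a real pipeline, only the loop matters.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for a spaCy pipeline, not the real API:
# each whitespace-separated token becomes one "entity" labelled TOKEN.
Ent = namedtuple("Ent", ["text", "label_"])
Doc = namedtuple("Doc", ["ents"])

def nlp(text):
    return Doc(ents=[Ent(tok, "TOKEN") for tok in text.split()])

# Illustrative data; the second row yields two entities.
df = pd.DataFrame({"id": [1, 2], "text": ["aaaa", "bbbb cccc"]})

tags = []
# zip pairs each id with the text from the same row.
for row_id, text in zip(df["id"], df["text"]):
    tdoc = nlp(text)
    for tag in tdoc.ents:
        tags.append((row_id, tag.text, tag.label_))

df_tag = pd.DataFrame.from_records(tags, columns=["id", "name", "type"])
print(df_tag)
```

Because the id comes from the same row as the text, a row that produces several entities simply repeats its own id once per entity.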