Pandas: append a column from a DataFrame to a list in Python

I have a DataFrame as below:

id text
1 aaaa
2 bbbb

I read the above into a DataFrame, and I need to convert the text column to a list in order to perform NER extraction:

tags = []
for i in df['text'].tolist():
  tdoc = nlp(i)
  for tags in tdoc.ents:
   tags.append((df.id,tags.text,tags.label_))

The above works and I get the NER tags, which I would like to export to a DataFrame along with the 'id' column from the original DataFrame:

df_tag = pd.DataFrame.from_records(tags, columns=['id', 'name', 'type'])

The problem is that the whole id column is repeated on every row, as below:

id name type
1 2 NER A Type A
1 2 NER B Type B

Desired output

id name type
1 NER A Type A
2 NER B Type B

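The problem comes from the fact that df.id is the whole id column (a Series), not the id of the row you are currently processing. Every tuple you append therefore carries all of the ids, which is why you see 1 2 on each row.

Also, on lines 4 and 5 of your snippet, the loop variable should be tag, not tags. Reusing the name tags rebinds it from your list to a spaCy entity, so tags.append no longer refers to the list.

Try iterating over the id and text columns together, so that each entity is paired with the id of its own row:

tags = []
for doc_id, text in zip(df['id'], df['text']):
    tdoc = nlp(text)
    for tag in tdoc.ents:
        tags.append((doc_id, tag.text, tag.label_))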
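
For something that runs without a spaCy model installed, here is a minimal sketch of the full flow, from the text column to the tag DataFrame, with each entity paired with the id of its own row. Note that fake_ner is a hypothetical stand-in for nlp(text).ents (it just labels every capitalised word as PERSON), and the sample rows are made up for illustration. With a real pipeline you would read tag.text and tag.label_ from tdoc.ents instead.

```python
import pandas as pd

# Made-up sample data with the same two columns as in the question.
df = pd.DataFrame({"id": [1, 2], "text": ["Alice met Bob", "Carol left"]})


def fake_ner(text):
    """Hypothetical stand-in for nlp(text).ents: returns (entity_text, label) pairs."""
    return [(word, "PERSON") for word in text.split() if word[0].isupper()]


tags = []
# zip pairs each text with the id from the same row, so doc_id is a single value,
# not the whole id column.
for doc_id, text in zip(df["id"], df["text"]):
    for name, label in fake_ner(text):
        tags.append((doc_id, name, label))

df_tag = pd.DataFrame.from_records(tags, columns=["id", "name", "type"])
print(df_tag)
```

A text with several entities produces several rows that share the same id, and a text with no entities produces none.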

