
How can I create a new column in a pandas data frame by extracting words from sentences in another column?

I have a pandas DataFrame like this:

import pandas as pd

student_id = ['001', '002', '003', '004']
names = ['Jane', 'Mary', 'Andrew', 'Paul']
address = ['7 karumu st Ikeja Lagos', '8 logo street Umuahia Abia',
           '10 jege close PH Rivers', '9 Lekki gate Lagos']

test_1 = {'Student_ID': student_id,
          'Name': names,
          'Address': address}
df = pd.DataFrame(test_1)
df

[Output: image of the DataFrame]

and a list like this:

List = ['Imo', 'Lagos', 'Abia', 'Ebonyi', 'Rivers']

I am trying to iterate through the Address column and extract the state from each address, where the state is one of the names in the list. If a state from the list is found, I would like to extract it and put it in a new column called state.

I tried to use the iterrows() method, but I am a bit lost.

You can filter like this:

df = df[df['Address'].str.contains('|'.join(List))]
  • Get the 'Address' column
  • Convert 'List' to a DataFrame
  • After that, I think you should use a merge
  • Store the result in the final DataFrame and add it as another column

I think this will solve your problem.
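
The steps in the list above are not spelled out in detail, so here is one possible reading of them as a minimal sketch. It assumes each address is split into words and that the words are merged against the list of states; the names `words`, `states_df` and `matches` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Student_ID": ["001", "002", "003", "004"],
    "Name": ["Jane", "Mary", "Andrew", "Paul"],
    "Address": ["7 karumu st Ikeja Lagos", "8 logo street Umuahia Abia",
                "10 jege close PH Rivers", "9 Lekki gate Lagos"],
})
List = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]

# Step 1: take the 'Address' column and split it into one word per row.
# reset_index() keeps the original row label in a column named "index".
words = df["Address"].str.split().explode().rename("State").reset_index()

# Step 2: convert the list of states to a DataFrame (hypothetical name).
states_df = pd.DataFrame({"State": List})

# Step 3: merge; an inner join keeps only the words that are known states.
matches = words.merge(states_df, on="State", how="inner")

# Step 4: add the result as a new column, aligned on the original row label.
# If an address contains several states, only the first match is kept.
df["State"] = matches.drop_duplicates("index").set_index("index")["State"]

print(df)
```

Rows whose address contains no state from the list end up with NaN in the new column, because the assignment aligns on the index.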

This assumes that the state is always the last word in the address:

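import numpy as np

states = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]
# Take the last word of each address; keep it if it is a known state, else NaN.
# The := (walrus) operator requires Python 3.8+.
df["State"] = df["Address"].map(lambda x: state if (state := x.split()[-1]) in states else np.nan)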
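
If the state is not guaranteed to be the last word, a regular expression built from the list might be an option instead. This is only a sketch of an alternative, not the approach from the answer above: it assumes a whole-word match anywhere in the address is good enough, and `pattern` is a name introduced here for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "Address": ["7 karumu st Ikeja Lagos", "8 logo street Umuahia Abia",
                "10 jege close PH Rivers", "9 Lekki gate Lagos"],
})
states = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]

# Build one alternation pattern, e.g. \b(Imo|Lagos|...)\b.
# re.escape guards against regex metacharacters in the names, and \b
# makes sure only whole words match.
pattern = r"\b(" + "|".join(map(re.escape, states)) + r")\b"

# str.extract returns the first match per row; rows without a match get NaN.
# expand=False returns a Series because the pattern has a single group.
df["State"] = df["Address"].str.extract(pattern, expand=False)

print(df)
```

Note that this picks up the first state name found in the address, so a street that happens to be named after a state would also match.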
