Pandas New Column based on found string in a DataFrame

I am trying to match an ID value in one DataFrame against a string column in another DataFrame to create a new ID field.

I have two DataFrames. The first has only a text ID column:

DF1

ID
elf
orc
panda

The second has a different ID, plus a text column that contains an ID value from the first DataFrame (DF1):

DF2

AltID Text
1     The orc killed the dwarf
2     The elf lives in the woods
3     The panda eats bamboo

I want to create a NewID column in the second DataFrame (DF2) that would look like this when the text is found:

NewID
orc
elf
panda

Should I use a lambda function or np.where()?

Thanks in advance.

EDIT:

What if it needs to be an exact match? For instance, I have this row of text but don't want it to match 'orc':

AltID  Text
4      The orchestra played too long

For that row I want NewID to be None, N/A, or something of that nature.

This is straightforward using str.extract:

df2['New ID'] = df2.Text.str.extract('({})'.format('|'.join(df1.ID)), expand=False)

df2

   AltID                        Text New ID
0      1    The orc killed the dwarf    orc
1      2  The elf lives in the woods    elf
2      3       The panda eats bamboo  panda
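
For the exact-match case raised in the EDIT, one possible variation is to wrap the alternation in word boundaries (\b) so that 'orc' no longer matches inside 'orchestra'. This is only a sketch: it assumes that "exact" means a whole-word match, it rebuilds the sample frames from the question, and the re.escape call is an added precaution in case an ID contains regex metacharacters.

```python
import re

import pandas as pd

# Sample data rebuilt from the question, including the 'orchestra' row.
df1 = pd.DataFrame({"ID": ["elf", "orc", "panda"]})
df2 = pd.DataFrame({
    "AltID": [1, 2, 3, 4],
    "Text": [
        "The orc killed the dwarf",
        "The elf lives in the woods",
        "The panda eats bamboo",
        "The orchestra played too long",
    ],
})

# Escape each ID so it is matched literally, then require a word
# boundary on both sides of the captured alternative.
pattern = r"\b({})\b".format("|".join(map(re.escape, df1["ID"])))

# Rows with no whole-word match get a missing value (NaN).
df2["New ID"] = df2["Text"].str.extract(pattern, expand=False)
print(df2)
```

Rows without a match come back as missing values, which can then be replaced with a placeholder such as 'N/A' via fillna if a visible label is preferred.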

A small trick: replace each text with the index of the matching ID in df1, then map that index back to the ID.

df2.Text.replace(dict(zip(df1.ID,df1.index)),regex=True).map(df1.ID)
Out[1004]: 
0      orc
1      elf
2    panda
Name: Text, dtype: object
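
Since the question asks about a lambda-style approach, here is a rough sketch using Series.apply with a plain function. The helper name first_id is hypothetical, and the sketch assumes IDs appear as whitespace-separated words with no punctuation attached, so a token like 'orc.' would not match.

```python
import pandas as pd

# Sample data rebuilt from the question, including the 'orchestra' row.
df1 = pd.DataFrame({"ID": ["elf", "orc", "panda"]})
df2 = pd.DataFrame({
    "AltID": [1, 2, 3, 4],
    "Text": [
        "The orc killed the dwarf",
        "The elf lives in the woods",
        "The panda eats bamboo",
        "The orchestra played too long",
    ],
})

known_ids = set(df1["ID"])


def first_id(text):
    """Return the first whitespace-separated token that is a known ID, else None."""
    for token in text.split():
        if token in known_ids:
            return token
    return None


# apply calls first_id once per row; unmatched rows get a missing value.
df2["New ID"] = df2["Text"].apply(first_id)
print(df2)
```

Because it compares whole tokens, this variant also handles the exact-match requirement, but it runs in Python row by row and is likely slower than the vectorized string methods on large frames.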

