
How to create a column in a Pandas dataframe based on a conditional substring search of one or more OTHER columns

I have the following data frame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Manufacturer':['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8', 'Gulfstream', 'Bombardier', '23 - Louis Vuitton', 'Louis Vuitton 20'],
                   'System':['None', 'None', '14 Platinum', 'Gold', 'None', 'Platinum 905', 'None']
                  })

I would like to create another column in the data frame named Pricing, which contains the value "East Coast" if both of the following conditions hold:

a) a substring in the Manufacturer column matches "Louis",

AND

b) a substring in the System column matches "Platinum".

The following code operates on a single column:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None')

I tried to chain this together using AND:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None') and np.where(df['Manufacturer'].str.contains('Platimum'), 'East Coast', 'None')

But I get the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use `a.any()` or `a.all()`

Can anyone help with how I would implement a.any() or a.all() given the two conditions "a" and "b" above? Or perhaps there is a more efficient way to create this column without using np.where?

Thanks in advance!

Use .loc to select the rows that satisfy both conditions and assign the value to them:

df.loc[(df['Manufacturer'].str.contains('Louis')) & 
       (df['System'].str.contains('Platinum')),
      'Pricing'] = 'East Coast'
df

    Manufacturer        System        Pricing
0   Allen Edmonds       None          NaN
1   Louis Vuitton 23    None          NaN
2   Louis Vuitton 8     14 Platinum   East Coast
3   Gulfstream          Gold          NaN
4   Bombardier          None          NaN
5   23 - Louis Vuitton  Platinum 905  East Coast
6   Louis Vuitton 20    None          NaN
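
For completeness, the np.where attempt from the question can probably be kept if the two conditions are combined before calling np.where. Python's `and` asks for a single truth value for a whole array, which is what raises the ValueError; the element-wise `&` operator does not. The sketch below assumes the same sample data as the question; the names `is_louis` and `is_platinum` are illustrative only.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Manufacturer': ['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8', 'Gulfstream',
                     'Bombardier', '23 - Louis Vuitton', 'Louis Vuitton 20'],
    'System': ['None', 'None', '14 Platinum', 'Gold', 'None', 'Platinum 905', 'None'],
})

# One boolean Series per condition (hypothetical helper names).
is_louis = df['Manufacturer'].str.contains('Louis')
is_platinum = df['System'].str.contains('Platinum')

# Combine element-wise with &, then pick a value per row.
df['Pricing'] = np.where(is_louis & is_platinum, 'East Coast', 'None')
```

Unlike the .loc version, this fills non-matching rows with the string 'None' rather than NaN, which matches the default used in the question's single-column example.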
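
Another option is to apply a row-wise function:

def contain(x):
    # Rows that fail either test get None (a missing value).
    if 'Louis' in x.Manufacturer and 'Platinum' in x.System:
        return "East Coast"
    return None

df['Pricing'] = df.apply(contain, axis=1)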
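
One caveat, offered as a sketch rather than a definitive fix: the sample data stores the literal string 'None', so every cell is a string. If a real data set contained actual missing values (NaN), the `in` test in the row-wise function would likely fail with a TypeError, and `str.contains` could yield missing values in the mask. Passing `na=False` to `str.contains` treats missing cells as non-matches. The frame `df_missing` below is made-up data for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with real missing values instead of the string 'None'.
df_missing = pd.DataFrame({
    'Manufacturer': ['Louis Vuitton 8', np.nan, 'Louis Vuitton 20'],
    'System': ['14 Platinum', 'Platinum 905', np.nan],
})

# na=False makes missing cells count as "no match" in each condition.
mask = (df_missing['Manufacturer'].str.contains('Louis', na=False)
        & df_missing['System'].str.contains('Platinum', na=False))

# Only rows where both conditions hold receive the value.
df_missing.loc[mask, 'Pricing'] = 'East Coast'
```

Here only the first row satisfies both conditions; the other two rows are left as missing in Pricing.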
