How to create a column in a Pandas DataFrame based on a conditional substring search of one or more OTHER columns

Question

I have the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Manufacturer':['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8', 'Gulfstream', 'Bombardier', '23 - Louis Vuitton', 'Louis Vuitton 20'],
                   'System':['None', 'None', '14 Platinum', 'Gold', 'None', 'Platinum 905', 'None']
                  })

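I want to create another column in the DataFrame, named Pricing, that contains the value "East Coast" when both of the following conditions are met:

a) a substring in the Manufacturer column matches "Louis",

and

b) a substring in the System column matches "Platinum".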

The following code works on a single column:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None')

I tried chaining the two conditions together with AND:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None') and np.where(df['Manufacturer'].str.contains('Platimum'), 'East Coast', 'None')

However, I get the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use `a.any()` or `a.all()`

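Given the two conditions "a" and "b" above, can anyone help me implement a.any() or a.all()? Or is there perhaps a more efficient way to create this column without using np.where?

Thanks in advance!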

Answer 1

Use .loc to slice the DataFrame on your conditions:

df.loc[(df['Manufacturer'].str.contains('Louis')) & 
       (df['System'].str.contains('Platinum')),
      'Pricing'] = 'East Coast'
df

    Manufacturer        System        Pricing
0   Allen Edmonds       None          NaN
1   Louis Vuitton 23    None          NaN
2   Louis Vuitton 8     14 Platinum   East Coast
3   Gulfstream          Gold          NaN
4   Bombardier          None          NaN
5   23 - Louis Vuitton  Platinum 905  East Coast
6   Louis Vuitton 20    None          NaN
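
The np.where attempt in the question can also be made to work. Python's `and` tries to reduce each whole array to a single True/False, which is what raises the ValueError; the element-wise `&` operator combines the two boolean Series row by row instead. A minimal sketch, assuming the same data as in the question (the variable name `mask` is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Manufacturer': ['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8',
                     'Gulfstream', 'Bombardier', '23 - Louis Vuitton',
                     'Louis Vuitton 20'],
    'System': ['None', 'None', '14 Platinum', 'Gold', 'None',
               'Platinum 905', 'None'],
})

# Combine both substring tests element-wise; each side is a boolean Series.
mask = (df['Manufacturer'].str.contains('Louis')
        & df['System'].str.contains('Platinum'))

# A single np.where call fills matching rows with 'East Coast'
# and every other row with the string 'None'.
df['Pricing'] = np.where(mask, 'East Coast', 'None')
print(df)
```

Unlike the .loc assignment, this fills the non-matching rows with the string 'None' rather than NaN, which is closer to what the question's single-column example produces.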

Answer 2

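def contain(x):
    # Return 'East Coast' only when both substrings appear in the row;
    # otherwise the function falls through and returns None.
    if 'Louis' in x.Manufacturer and 'Platinum' in x.System:
        return "East Coast"

# Apply the check row by row (axis=1) and store the result as Pricing.
df['Pricing'] = df.apply(contain, axis=1)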
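
If more than two columns need to be checked, the same idea extends without writing each condition by hand. The following is only a sketch under assumptions: the `conditions` dictionary (column name mapped to required substring) is a hypothetical helper, not something from the question, and the data is the question's DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Manufacturer': ['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8',
                     'Gulfstream', 'Bombardier', '23 - Louis Vuitton',
                     'Louis Vuitton 20'],
    'System': ['None', 'None', '14 Platinum', 'Gold', 'None',
               'Platinum 905', 'None'],
})

# Hypothetical mapping: column name -> substring that must be present.
conditions = {'Manufacturer': 'Louis', 'System': 'Platinum'}

# One boolean Series per column; regex=False treats the text literally.
masks = [df[col].str.contains(sub, regex=False)
         for col, sub in conditions.items()]

# AND all the masks together element-wise into one boolean array.
mask = np.logical_and.reduce(masks)

df['Pricing'] = np.where(mask, 'East Coast', 'None')
print(df)
```

Adding another entry to `conditions` adds another required match, so the vectorized string methods are reused for every column instead of a row-by-row apply.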
