Create new Python DataFrame column based on conditions of multiple other columns
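I'm trying to create a new DataFrame column (Column C) based on the values of two other columns. My two criteria are: if Column A > 0 OR Column B contains the string "Apple"*, then Column C should be "Yes"; otherwise it should be "No".
*Bonus points if the answer is case-insensitive (i.e. it would pick up the "apple" in "Pineapple" as well as in "Apple Juice").
The data might look like this (along with what Column C should produce):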
Column_A  Column_B            Column_C
23        Orange Juice        Yes
2         Banana Smoothie     Yes
8         Pineapple Juice     Yes
0         Pineapple Smoothie  Yes
0         Apple Juice         Yes
0         Lemonade            No
34        Coconut Water       Yes
I've tried a few things, including:
df['Keep6']= np.where((df['Column_A'] >0) | (df['Column_B'].find('Apple')>0) , 'Yes','No')
but I get the error message: "AttributeError: 'Series' object has no attribute 'find'"
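The error arises because `find` is a method of Python strings, not of a pandas Series; element-wise string methods are reached through the `.str` accessor. The minimal sketch below (the Series `s` is a made-up sample, not the asker's full data) also shows why comparing the result with `> 0` would still go wrong: `find` returns -1 when there is no match and 0 when the match is at the start of the string, and it is case-sensitive.

```python
import pandas as pd

# Hypothetical sample values taken from the question's Column_B
s = pd.Series(["Orange Juice", "Pineapple Juice", "Apple Juice", "Lemonade"])

# .str.find works element-wise; -1 means "not found", 0 means "found at index 0"
positions = s.str.find("Apple")
print(positions.tolist())

# "> 0" misses "Apple Juice" (position 0); ">= 0" catches it,
# but "Pineapple Juice" is still missed because find is case-sensitive
print((positions > 0).tolist())
print((positions >= 0).tolist())
```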
Use Series.str.contains with case=False to make the match case-insensitive:
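import numpy as np

# "Yes" when Column_A > 0 or Column_B contains "apple" in any case, otherwise "No"
df['Column_C'] = np.where((df['Column_A'] > 0) | df['Column_B'].str.contains('apple', case=False), 'Yes', 'No')
print(df)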
   Column_A            Column_B  Column_C
0        23        Orange_Juice       Yes
1         2     Banana_Smoothie       Yes
2         8     Pineapple_Juice       Yes
3         0  Pineapple_Smoothie       Yes
4         0         Apple_Juice       Yes
5         0            Lemonade        No
6        34       Coconut_Water       Yes
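If Column_B can contain missing values, `str.contains` returns a missing value for those rows rather than True or False, which may not be what you want inside `np.where`. A minimal sketch, assuming a shortened sample frame with one null row added purely for illustration (that row is not in the question's data): passing `na=False` treats missing strings as "no match".

```python
import numpy as np
import pandas as pd

# Shortened sample; the last row with a missing Column_B is a hypothetical addition
df = pd.DataFrame({
    "Column_A": [23, 0, 0, 0],
    "Column_B": ["Orange Juice", "Pineapple Smoothie", "Lemonade", None],
})

# na=False makes rows with a missing Column_B count as "does not contain apple"
mask = (df["Column_A"] > 0) | df["Column_B"].str.contains("apple", case=False, na=False)
df["Column_C"] = np.where(mask, "Yes", "No")
print(df)
```

With this choice, a row whose Column_B is missing is labelled "Yes" only when Column_A > 0.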
Try this code, which uses the pandas.DataFrame.apply function:
df['Column_C'] = df.apply(lambda row: 'Yes' if row['Column_A'] > 0 or 'apple' in row['Column_B'].lower() else 'No', axis=1)
which gives:
   Column_A            Column_B  Column_C
0        23        Orange Juice       Yes
1         2     Banana Smoothie       Yes
2         8     Pineapple Juice       Yes
3         0  Pineapple Smoothie       Yes
4         0         Apple Juice       Yes
5         0            Lemonade        No
6        34       Coconut Water       Yes
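For anyone wanting to check that the two answers agree, here is a self-contained sketch that rebuilds the question's sample data and compares the vectorized `np.where` approach with the row-wise `apply` approach. The variable names `vectorized`, `row_wise` and `same` are illustrative only. On larger frames the vectorized form is usually the faster of the two, since `apply` with `axis=1` calls a Python function once per row.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Column_A": [23, 2, 8, 0, 0, 0, 34],
    "Column_B": ["Orange Juice", "Banana Smoothie", "Pineapple Juice",
                 "Pineapple Smoothie", "Apple Juice", "Lemonade", "Coconut Water"],
})

# Vectorized: boolean mask combined with |, then mapped to labels
vectorized = np.where(
    (df["Column_A"] > 0) | df["Column_B"].str.contains("apple", case=False),
    "Yes", "No",
)

# Row-wise: plain Python logic evaluated once per row
row_wise = df.apply(
    lambda row: "Yes" if row["Column_A"] > 0 or "apple" in row["Column_B"].lower() else "No",
    axis=1,
)

# Both approaches should label every row identically
same = vectorized.tolist() == row_wise.tolist()
print(same)
```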