Add a new column with condition if a string 'contains' substring?

I want to add a new column 'check' with the following condition:

  • 'Suppression total' and 'Sup-SDM'.

OR

  • 'Suppression partiel' and 'Franc SUP - Geisi'

Dataframe:

Type                           Info
Sup_EF - SUP - SDM             2021-12-08 16:47:51.0-Suppression totale
Modif_EF - SUP - SDM           2021-12-08 16:47:51.0-Creation
Sup_EF - SUP - Geisi           2021-12-08 16:47:51.0-Suppression totale
Modif_EF - Franc SUP - Geisi   2021-12-17 10:50:40.0-Suppression partiel

Desired output:

Type                           Info                                        Check
Sup_EF - SUP - SDM             2021-12-08 16:47:51.0-Suppression total     Correct
Modif_EF - SUP - SDM           2021-12-08 16:47:51.0-Creation              Fail
Sup_EF - SUP - Geisi           2021-12-08 16:47:51.0-Suppression total     Fail
Modif_EF - Franc SUP - Geisi   2021-12-17 10:50:40.0-Suppression partiel   Correct

Code:

if ('SUP - SDM' in df["Type"].values) and ('Suppression total' in df['Info'].values):
    df['Check'] = "Correct"
elif ('Franc SUP - Geisi' in df["Type"].values) and ('Suppression partiel' in df['Info'].values):
    df['Check'] = "Correct"
else:
    df['Check'] = "Fail"

But my output looks like this:

Type                           Info                                        Check
Sup_EF - SUP - SDM             2021-12-08 16:47:51.0-Suppression total     Fail
Modif_EF - SUP - SDM           2021-12-08 16:47:51.0-Creation              Fail
Sup_EF - SUP - Geisi           2021-12-08 16:47:51.0-Suppression total     Fail
Modif_EF - Franc SUP - Geisi   2021-12-17 10:50:40.0-Suppression partiel   Fail

Or, when I use this code, it raises KeyError: 'Info':

df['Check'] = df.apply(lambda x: 'Correct' if ('Suppression total' in x['Info'] and 'Sup-SDM' in x['Type']) or ('Suppression partiel' in x['Info'] and 'Franc SUP - Geisi' in x['Type']) else 'Fail')

You might want to use numpy's np.where, since it can easily be extended to more than two conditions and results if needed:

import numpy as np

# Each rule is wrapped in its own parentheses so the grouping of & and | is explicit.
df['check'] = np.where(
    (df.Type.str.contains('SUP - SDM') & df.Info.str.contains('Suppression total'))
    | (df.Type.str.contains('Franc SUP - Geisi') & df.Info.str.contains('Suppression partiel')),
    'correct', 'fail')
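
If more than two outcomes are ever needed, one possible extension is np.select, which takes a list of conditions and a matching list of results. The sketch below is only an illustration and not part of the original answer: the 'partial' label and the three-way split are assumptions invented for the example, and the sample rows are copied from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Type': ['Sup_EF - SUP - SDM', 'Modif_EF - SUP - SDM',
             'Sup_EF - SUP - Geisi', 'Modif_EF - Franc SUP - Geisi'],
    'Info': ['2021-12-08 16:47:51.0-Suppression totale',
             '2021-12-08 16:47:51.0-Creation',
             '2021-12-08 16:47:51.0-Suppression totale',
             '2021-12-17 10:50:40.0-Suppression partiel'],
})

# One boolean mask per rule; regex=False treats the patterns as literal text.
total = (df['Type'].str.contains('SUP - SDM', regex=False)
         & df['Info'].str.contains('Suppression total', regex=False))
partial = (df['Type'].str.contains('Franc SUP - Geisi', regex=False)
           & df['Info'].str.contains('Suppression partiel', regex=False))

# Hypothetical labels: np.select uses the result of the first condition
# that is True for a row, and the default when none is.
df['check'] = np.select([total, partial], ['correct', 'partial'], default='fail')
```

With np.where the two masks would be combined with | into a single yes/no decision; np.select keeps them separate so each can map to its own label.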

You need to add axis=1 so that apply works row by row, and change 'Sup-SDM' to 'SUP - SDM'. Without axis=1, apply passes each column to the lambda, and a column has no 'Info' label, hence the KeyError:

df['Check'] = df.apply(lambda x: 'Correct' if ('Suppression total' in x['Info'] and 'SUP - SDM' in x['Type']) or ('Suppression partiel' in x['Info'] and 'Franc SUP - Geisi' in x['Type']) else 'Fail', axis=1)
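
To see where the KeyError comes from, it can help to look at what apply hands to the function under each axis setting. A minimal sketch, assuming a two-row frame shaped like the one in the question (the variable names labels_axis0, labels_axis1 and raised are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Sup_EF - SUP - SDM', 'Modif_EF - Franc SUP - Geisi'],
    'Info': ['2021-12-08 16:47:51.0-Suppression totale',
             '2021-12-17 10:50:40.0-Suppression partiel'],
})

# axis=0 (the default): each call receives one column,
# whose index holds the row labels 0 and 1.
labels_axis0 = df.apply(lambda s: ', '.join(map(str, s.index)))

# axis=1: each call receives one row, whose index holds the column names.
labels_axis1 = df.apply(lambda s: ', '.join(s.index), axis=1)

# Looking up 'Info' inside a column therefore fails.
try:
    df.apply(lambda x: x['Info'])
    raised = False
except KeyError:
    raised = True
```

Here labels_axis0 lists the row labels for every column, while labels_axis1 lists 'Type, Info' for every row, which is the shape the lambda in the question expects.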

A better option is np.where:

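import numpy as np

# One boolean mask per rule, combined with | (element-wise OR).
m1 = df['Info'].str.contains('Suppression total') & df['Type'].str.contains('SUP - SDM')
m2 = df['Info'].str.contains('Suppression partiel') & df['Type'].str.contains('Franc SUP - Geisi')
df['Check'] = np.where(m1 | m2, 'Correct', 'Fail')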
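
Two optional arguments of str.contains may be worth knowing when building such masks: regex=False matches the text literally, and na=False counts a missing value as "no match". The following is a sketch under assumptions: the third row with a missing Info value is invented for illustration and does not appear in the question, and has is a hypothetical helper.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Type': ['Sup_EF - SUP - SDM', 'Modif_EF - Franc SUP - Geisi', 'Sup_EF - SUP - SDM'],
    'Info': ['2021-12-08 16:47:51.0-Suppression totale',
             '2021-12-17 10:50:40.0-Suppression partiel',
             None],  # hypothetical missing value, not in the original data
})

def has(col, text):
    # Hypothetical helper: literal substring test that yields False for missing cells.
    return df[col].str.contains(text, regex=False, na=False)

m1 = has('Info', 'Suppression total') & has('Type', 'SUP - SDM')
m2 = has('Info', 'Suppression partiel') & has('Type', 'Franc SUP - Geisi')
df['Check'] = np.where(m1 | m2, 'Correct', 'Fail')
```

The row with the missing Info value ends up as 'Fail' because neither mask is True for it.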

You can apply a function row-wise to the dataframe that checks whether the strings are in the columns:

import pandas as pd

df = pd.DataFrame({'Type': {0: 'Sup_EF - SUP - SDM',
  1: 'Modif_EF - SUP - SDM',
  2: 'Sup_EF - SUP - Geisi',
  3: 'Modif_EF - Franc SUP - Geisi'},
 'Info': {0: '2021-12-08 16:47:51.0-Suppression totale',
  1: '2021-12-08 16:47:51.0-Creation',
  2: '2021-12-08 16:47:51.0-Suppression totale',
  3: '2021-12-17 10:50:40.0-Suppression partiel'},
 'Check': {0: 'good', 1: 'not good', 2: 'not good', 3: 'good'}})

def f(s):
    if ("SUP - SDM" in s['Type'] and "Suppression total" in s['Info']) or ("Franc SUP - Geisi" in s['Type'] and "Suppression partiel" in s['Info']):
        return "Correct"
    else:
        return "Fail"
    
df['Check'] = df.apply(f, axis=1)
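
For comparison, the first attempt in the question tests 'SUP - SDM' in df["Type"].values, which asks whether some cell equals that string exactly, and then assigns a single value to the whole column. The sketch below contrasts the three kinds of test; the two-element Series and the variable names are made up for illustration.

```python
import pandas as pd

types = pd.Series(['Sup_EF - SUP - SDM', 'Modif_EF - Franc SUP - Geisi'])

# `in` on the values array: is some cell exactly equal to this string?
whole_cell_match = 'SUP - SDM' in types.values

# `in` on a single Python string: is this a substring of that string?
substring_match = 'SUP - SDM' in types.iloc[0]

# str.contains runs the substring test once per row.
per_row = types.str.contains('SUP - SDM', regex=False).tolist()
```

Only the last two are substring tests, which is why the row-wise function above and the str.contains masks give per-row results while the original if/elif does not.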


