How to check if all substrings in a pandas column are the same?

Question

I have this column and I want to check whether all of its strings contain the substring anr12. How can I check this? And if every string contains the same substring, how can I drop that substring?

Answer 1

I think you need str.contains combined with all to check that every value is True, followed by str.replace:

import pandas as pd

df = pd.DataFrame({'A':['123anr12', '345anr12']})
print (df)
          A
0  123anr12
1  345anr12

# remove the substring only if every row contains it
if df['A'].str.contains('anr12').all():
    df['A'] = df['A'].str.replace('anr12','')
print (df)
     A
0  123
1  345
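
One caveat with the check above: str.contains treats its pattern as a regular expression by default and returns a missing value for rows that are themselves missing. A minimal sketch, assuming the substring should be matched literally and that missing rows should count as non-matches. The helper name strip_common_substring and the sample value 'a.r12' are hypothetical and not part of the original answer:

```python
import pandas as pd

def strip_common_substring(series, sub):
    """Remove `sub` from every string in `series`, but only if all rows contain it."""
    # regex=False matches `sub` literally; na=False treats missing values as non-matches
    has_sub = series.str.contains(sub, regex=False, na=False)
    if has_sub.all():
        return series.str.replace(sub, '', regex=False)
    # at least one row lacks the substring, so leave the column untouched
    return series

s = pd.Series(['123a.r12', '345a.r12'])
cleaned = strip_common_substring(s, 'a.r12')
print(cleaned.tolist())
```

With regex=False the dot in 'a.r12' is matched as a literal dot rather than as "any character", which matters when the substring contains regex metacharacters.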

EDIT1: You can use a dictionary for the lookup:

train_df = pd.DataFrame({'477':['123nbf12', '34nbf12'], 
                         '479':['tt1', '32'], 
                         '482':['anr1234', '345anr12a12']})

obj_features = ['477', '479', '482'] # column names
substring = ['nbf', 'tt1', 'anr12'] # substring to remove from each column, in the same order
d = dict(zip(obj_features, substring))
print (d)
{'477': 'nbf', '479': 'tt1', '482': 'anr12'}

for k, v in d.items():
    if train_df[k].str.contains(v).all(): 
        train_df[k] = train_df[k].str.replace(v,'')         
print (train_df)
     477  479     482
0  12312  tt1      34
1   3412   32  345a12
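
Column 479 is left unchanged in the output above because '32' does not contain 'tt1', so the all() check fails for that column. A small sketch that makes this visible by computing the check per column; the variable name all_match is an assumption for illustration:

```python
import pandas as pd

train_df = pd.DataFrame({'477': ['123nbf12', '34nbf12'],
                         '479': ['tt1', '32'],
                         '482': ['anr1234', '345anr12a12']})
d = {'477': 'nbf', '479': 'tt1', '482': 'anr12'}

# True where every row of the column contains its substring (literal match)
all_match = {k: bool(train_df[k].str.contains(v, regex=False).all())
             for k, v in d.items()}
print(all_match)
# {'477': True, '479': False, '482': True}
```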

0 · Accepted · 2018-03-03 13:14:06
