Remove entries starting with a letter and two digits from a Pandas DataFrame

Question

How can I find the string entries in a Pandas DataFrame that begin with a letter followed by two digits and replace them with NaN?

A        B         C          D
Apple    Pear      N45 82f    John 
Cat      P48 hH2   Mary       Sponge 
Hat      P67 De1   Bed        S90 GGGF

I would like to replace every entry in the DataFrame that begins with a letter followed by two digits with NaN.

I have tried something along these lines:

for columns in df.columns[1:]:
    for i in columns:
        if i[0].isalpha() and i[1].isdigit() and i[2].isdigit():
            i.replace(i, None)

Unfortunately, this does not seem to work. Any help would be appreciated.

Answer 1

You can try this:

# ^ anchors the pattern to the start of each entry; mask() turns flagged cells into NaN
df.mask(df.apply(lambda r: r.str.contains(r'^[a-zA-Z]\d{2}')))

Output:

       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
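
Since the question asks about entries that begin with the pattern, it may help to see how str.contains and str.match differ. The following is a minimal sketch, not part of the original answer: the DataFrame is rebuilt from the question's sample, and the Series `s` with the value "room B12" is a made-up example. str.contains searches anywhere in the string unless the pattern is anchored, while str.match only tests the start.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["Apple", "Cat", "Hat"],
    "B": ["Pear", "P48 hH2", "P67 De1"],
    "C": ["N45 82f", "Mary", "Bed"],
    "D": ["John", "Sponge", "S90 GGGF"],
})

pattern = r"[A-Za-z]\d{2}"  # one letter followed by two digits

# Hypothetical values, used only to compare the two string methods.
s = pd.Series(["N45 82f", "room B12", "Apple"])
print(s.str.contains(pattern).tolist())  # searches anywhere in the string
print(s.str.match(pattern).tolist())     # tests the start of the string only

# Apply the start-of-string test column by column, then blank out the hits.
result = df.mask(df.apply(lambda col: col.str.match(pattern)))
print(result)
```

On the sample data both methods flag the same four cells, so the choice only matters when the pattern can also appear later in a string.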

I like @coldspeed's stack approach too:

# stack() gives one long Series, so the anchored test runs once; unstack() restores the shape
df[~df.stack().str.contains(r'^[a-zA-Z]\d{2}').unstack()]

Output:

       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
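
The stack one-liner packs three steps into a single expression. As a rough sketch, here they are spelled out on the question's sample data; the intermediate variable names (`stacked`, `flagged`, `flags`) are only illustrative, and str.match is used here as one way to test the start of each string.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["Apple", "Cat", "Hat"],
    "B": ["Pear", "P48 hH2", "P67 De1"],
    "C": ["N45 82f", "Mary", "Bed"],
    "D": ["John", "Sponge", "S90 GGGF"],
})

# 1. stack() reshapes the frame into one long Series indexed by (row, column).
stacked = df.stack()

# 2. A single vectorized test: does the entry start with a letter and two digits?
flagged = stacked.str.match(r"[A-Za-z]\d{2}")

# 3. unstack() turns the boolean Series back into a frame shaped like df.
flags = flagged.unstack()

# Boolean-frame indexing keeps unflagged entries and fills the rest with NaN.
result = df[~flags]
print(result)
```

Note that stack() has traditionally dropped missing values, so on data that already contains NaN the unstacked flags may need extra care.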

Answer 2

Use stack and str.extract with a pattern that matches only the entries you want to keep. Entries that fail to match come back as NaN.

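# (?!...) rejects entries that start with a letter and two digits; (.*) captures everything else
df.stack().str.extract(r'^(?![a-zA-Z]\d{2})(.*)').unstack()[0]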

       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
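
For comparison, here is a sketch of a third route that neither answer uses: DataFrame.replace with regex=True, which replaces every string cell matching the pattern. The pattern below is written to cover the whole entry so that the full cell is swapped for NaN; the sample frame is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["Apple", "Cat", "Hat"],
    "B": ["Pear", "P48 hH2", "P67 De1"],
    "C": ["N45 82f", "Mary", "Bed"],
    "D": ["John", "Sponge", "S90 GGGF"],
})

# ^ anchors to the start, [A-Za-z]\d{2} is the letter plus two digits,
# and .*$ extends the match over the rest of the entry.
result = df.replace(r"^[A-Za-z]\d{2}.*$", np.nan, regex=True)
print(result)
```

This avoids the stack/unstack round trip, at the cost of being less explicit about which cells were flagged.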

Answer 1
1 accepted 2019-03-06 20:46:53

Answer 2
1 2019-03-06 20:47:22
