How to replace an entire cell with NaN on a pandas DataFrame

Question

I want to replace the entire cell that contains the word circled in the picture with blanks or NaN. However, when I try to replace, for example, '1.25 Dividend', it turns out as '1.25 NaN'. I want the whole cell to become NaN. Any idea how to do this?

My DataFrame

Answer 1

Option 1
Use a regular expression in your replace

df.replace('^.*Dividend.*$', np.nan, regex=True)

From comments

Using regex=True means that replace will interpret the pattern as a regular expression. You still need an appropriate pattern. '^' anchors the match at the beginning of the string, so '^.*' matches any run of characters from the beginning. '$' anchors the match at the end of the string, so '.*$' matches any run of characters up to the end. Finally, '^.*Dividend.*$' matches the whole string whenever 'Dividend' appears somewhere in it: any characters from the beginning, then 'Dividend', then any characters after it. The whole match is then replaced with np.nan.
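To make the difference between the two patterns concrete, here is a minimal sketch using Python's built-in re module instead of pandas. The string "NaN" stands in for np.nan only because re.sub works on strings; the variable names are illustrative.

```python
import re

partial = "Dividend"             # matches only the word itself
whole_cell = r"^.*Dividend.*$"   # matches the entire string if it contains the word

# Substituting the partial pattern leaves the rest of the cell in place.
partial_result = re.sub(partial, "NaN", "1.25 Dividend")

# Substituting the anchored pattern replaces the whole cell.
whole_result = re.sub(whole_cell, "NaN", "1.25 Dividend")

# A string that does not contain the word is left untouched.
untouched = re.sub(whole_cell, "NaN", "4.00")

print(partial_result)  # 1.25 NaN
print(whole_result)    # NaN
print(untouched)       # 4.00
```

The first result reproduces the '1.25 NaN' outcome described in the question, which is why the pattern has to cover the whole cell.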

Consider the dataframe df

df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, '6 Dividend']])
df

   0           1
0  1  2 Dividend
1  3           4
2  5  6 Dividend

Then the proposed solution replaces every cell containing 'Dividend' with NaN.
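As a quick check, Option 1 can be run end to end on the example frame. This is a sketch with the imports spelled out; the name result is illustrative, and the exact dtype of the resulting column may differ between pandas versions, so only the positions of the missing values are inspected.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, '6 Dividend']])

# Replace any cell whose full text matches the anchored pattern with NaN.
result = df.replace('^.*Dividend.*$', np.nan, regex=True)

# Show which cells are now missing.
print(result.isna())
```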

Option 2
Another alternative is to use pd.DataFrame.mask in conjunction with applymap.
I pass a lambda to applymap that identifies whether each cell contains 'Dividend'; mask then replaces the flagged cells with NaN.

df.mask(df.applymap(lambda s: 'Dividend' in s if isinstance(s, str) else False))

   0    1
0  1  NaN
1  3    4
2  5  NaN
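One version note: DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map. The sketch below is one way to write Option 2 so that it runs on either side of that change; the helper name has_dividend and the variable flags are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, '6 Dividend']])

def has_dividend(cell):
    # Only strings can contain the word; everything else is not flagged.
    return isinstance(cell, str) and 'Dividend' in cell

# DataFrame.map is available from pandas 2.1; fall back to applymap otherwise.
if hasattr(pd.DataFrame, "map"):
    flags = df.map(has_dividend)
else:
    flags = df.applymap(has_dividend)

# mask() puts NaN wherever the boolean frame is True.
result = df.mask(flags)
print(result)
```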

Option 3
Similar in concept, but using stack / unstack + pd.Series.str.contains

df.mask(df.stack().astype(str).str.contains('Dividend').unstack())

   0    1
0  1  NaN
1  3    4
2  5  NaN
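Splitting Option 3 into its intermediate steps may make it easier to follow. In this sketch the names stacked and flags are illustrative, and regex=False is an addition not in the original answer; it asks str.contains for a plain substring test, which is enough here because 'Dividend' has no special characters.

```python
import pandas as pd

df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, '6 Dividend']])

# stack() turns the frame into one long Series indexed by (row, column);
# astype(str) lets the .str accessor work on the numeric cells too.
stacked = df.stack().astype(str)

# Flag each cell that contains the word, then restore the original shape.
flags = stacked.str.contains('Dividend', regex=False).unstack()

# mask() puts NaN wherever the boolean frame is True.
result = df.mask(flags)
print(flags)
```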

Answer 2

Replace all strings:

df.apply(lambda x: pd.to_numeric(x, errors='coerce'))
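Note that this approach does not look for 'Dividend' at all: errors='coerce' turns every value that cannot be parsed as a number into NaN. The sketch below illustrates this with a hypothetical 'n/a' cell that is not in the original data.

```python
import pandas as pd

# The 'n/a' cell is a made-up example of a non-numeric string without 'Dividend'.
df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, 'n/a']])

# Convert each column to numbers; anything unparseable becomes NaN.
result = df.apply(lambda col: pd.to_numeric(col, errors='coerce'))
print(result)
```

This is convenient when every non-numeric cell should be dropped, but it is broader than the regex-based options, which touch only cells containing the word.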

Answer 3

I would use applymap like this

df.applymap(lambda x: 'NaN' if (type(x) is str and 'Dividend' in x) else x)
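One caveat worth checking: the line above inserts the string 'NaN', which pandas does not treat as a missing value. The sketch below compares it with a variant that returns np.nan. The helper elementwise is hypothetical and exists only so the example runs both on pandas versions that have DataFrame.map and on older ones that only have applymap.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, '2 Dividend'], [3, 4], [5, '6 Dividend']])

def elementwise(frame, func):
    # Hypothetical helper: prefer DataFrame.map where it exists.
    method = frame.map if hasattr(pd.DataFrame, "map") else frame.applymap
    return method(func)

def is_dividend(x):
    return isinstance(x, str) and 'Dividend' in x

# Variant from the answer: the cell becomes the text 'NaN'.
with_string = elementwise(df, lambda x: 'NaN' if is_dividend(x) else x)

# Variant with a real missing value.
with_nan = elementwise(df, lambda x: np.nan if is_dividend(x) else x)

print(with_string[1].isna().tolist())  # [False, False, False]
print(with_nan[1].isna().tolist())     # [True, False, True]
```

If later steps rely on isna, dropna, or fillna, the np.nan variant is likely the one you want.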

3 solutions

Solution 1
4 2017-07-06 16:28:06

Solution 2
0 2017-07-06 16:40:52

Solution 3
0 2017-07-06 17:43:31
