Pandas: selecting and modifying a DataFrame based on more complex criteria

Question

I was looking at this thread and this one, and although my question is similar, it differs in a few ways. I have a DataFrame full of floats that I want to replace with strings. Say:

      A     B       C
 A    0     1.5     13
 B    0.5   100.2   7.3
 C    1.3   34      0.01

I want to make replacements in this table based on several criteria, but only the first replacement works:

df[df<1]='N' # Works
df[(df>1)&(df<10)]='L'  # Doesn't work
df[(df>10)&(df<50)]='M'  # Doesn't work
df[df>50]='H'  # Doesn't work

If I instead build the selection for the second line based on the float type, it still doesn't work:

((df.applymap(type)==float) & (df<10) & (df>1)) #Doesn't work

I was wondering how to apply pd.DataFrame().mask here, or whether there is another way. How should I solve this?

Alternatively, I know I could go column by column and apply the substitutions to each Series, but that seems counterproductive.

Edit: Could anyone explain why the 4 simple assignments above do not work?

Answer 1

Use numpy.select with the DataFrame constructor:

m1 = df < 1
m2 = (df>1)&(df<10)
m3 = (df>10)&(df<50)
m4 = df>50

vals = list('NLMH')

df = pd.DataFrame(np.select([m1,m2,m3,m4], vals), index=df.index, columns=df.columns)
print (df)
   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N
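
A self-contained variant of the same idea is sketched below. Two details are assumptions that are not part of the answer above: the conditions use `>=` so that values exactly equal to 1, 10 or 50 land in the upper bin instead of falling through, and `default` is set to a placeholder string (`"?"`), because recent NumPy versions may refuse to combine string choices with the numeric default of 0.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    [[0, 1.5, 13], [0.5, 100.2, 7.3], [1.3, 34, 0.01]],
    index=list("ABC"), columns=list("ABC"),
)

# Build every mask from the numeric values before anything is replaced.
values = df.to_numpy()
conditions = [
    values < 1,
    (values >= 1) & (values < 10),
    (values >= 10) & (values < 50),
    values >= 50,
]
labels = list("NLMH")

# np.select picks the label of the first condition that is True;
# the string default keeps all choices on a common dtype.
result = pd.DataFrame(
    np.select(conditions, labels, default="?"),
    index=df.index, columns=df.columns,
)
print(result)
```

With the strict inequalities of the original answer, a value sitting exactly on a boundary matches no condition and receives the default.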

Answer 2

By using pd.cut:

pd.cut(df.stack(),[-1,1,10,50,np.inf],labels=list('NLMH')).unstack()
Out[309]: 
   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N
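
Note that `pd.cut` bins are closed on the right by default, so with the bins above a value of exactly 1 is labelled N, and anything at or below -1 would become NaN. The sketch below is one possible variation, not the answer's own code: it assumes an open lower bound of `-np.inf` and `right=False` (bins closed on the left), and it flattens the values with NumPy instead of using `stack`/`unstack`.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    [[0, 1.5, 13], [0.5, 100.2, 7.3], [1.3, 34, 0.01]],
    index=list("ABC"), columns=list("ABC"),
)

bins = [-np.inf, 1, 10, 50, np.inf]
labels = list("NLMH")

# right=False makes each bin closed on the left: [-inf, 1), [1, 10), ...
flat = pd.cut(df.to_numpy().ravel(), bins=bins, labels=labels, right=False)

# Convert the Categorical back to a plain array and restore the 2-D shape.
result = pd.DataFrame(
    np.asarray(flat).reshape(df.shape),
    index=df.index, columns=df.columns,
)
print(result)
```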

Answer 3

You can use searchsorted:

Copy

labels = np.array(list('NLMH'))
breaks = np.array([1, 10, 50])
pd.DataFrame(
    labels[breaks.searchsorted(df.values)].reshape(df.shape),
    df.index, df.columns)

   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N

In Place

labels = np.array(list('NLMH'))
breaks = np.array([1, 10, 50])
df[:] = labels[breaks.searchsorted(df.values)].reshape(df.shape)
df

   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N
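
How `searchsorted` treats values that sit exactly on a break depends on its `side` argument. The short sketch below illustrates this with the same `labels` and `breaks`; the `edge_values` array is a hypothetical input added only for demonstration.

```python
import numpy as np

labels = np.array(list("NLMH"))
breaks = np.array([1, 10, 50])

# Hypothetical inputs that fall exactly on the breaks.
edge_values = np.array([1, 10, 50])

# side="left" (the default) puts a boundary value in the lower bin.
left = labels[breaks.searchsorted(edge_values, side="left")]

# side="right" puts a boundary value in the upper bin.
right = labels[breaks.searchsorted(edge_values, side="right")]

print(left)   # ['N' 'L' 'M']
print(right)  # ['L' 'M' 'H']
```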

Chained pure Pandas approach with `pandas.DataFrame.mask`


df.mask(df.lt(1), 'N').mask(df.gt(1) & df.lt(10), 'L') \
  .mask(df.gt(10) & df.lt(50), 'M').mask(df.gt(50), 'H')

   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N
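
A likely explanation for why this chain succeeds while the step-by-step assignments in the question fail: every condition here is evaluated on the original numeric `df`, whereas in the question the frame already contains strings after the first assignment, so the next comparison raises a `TypeError`. The sketch below reproduces that under stated assumptions; it works on an explicit object-dtype copy (`mixed`, `out` and `masks` are illustrative names) so the behaviour does not depend on how a given pandas version handles writing strings into a float column.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    [[0, 1.5, 13], [0.5, 100.2, 7.3], [1.3, 34, 0.01]],
    index=list("ABC"), columns=list("ABC"),
)

# Work on an object-dtype copy so strings can be stored explicitly.
mixed = df.astype(object)
mixed[df < 1] = "N"  # the first replacement succeeds

# The frame now mixes strings and numbers, so a fresh comparison fails.
try:
    mixed > 1
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)

# Workaround: evaluate every mask on the numeric frame first, then assign.
masks = {
    "N": df < 1,
    "L": (df > 1) & (df < 10),
    "M": (df > 10) & (df < 50),
    "H": df > 50,
}
out = df.astype(object)
for label, mask in masks.items():
    out[mask] = label
print(out)
```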

Answer 1: score 12, 2018-05-30 13:09:58
Answer 2: score 7, 2018-05-30 13:34:38
Answer 3: score 6, accepted, 2018-05-30 13:27:45
