Filter a pandas pivot table by the values in a row

Question

I created a (large) sparse matrix from a pivot table.

UserId                                                               ...   
1         5.0   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
2         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
3         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
4         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
5         NaN   NaN   NaN   NaN   NaN   2.0   NaN   NaN   NaN   NaN  ...   
...       ...   ...   ...   ...   ...   ...   ...   ...   ...   ...  ...   
6036      NaN   NaN   NaN   2.0   NaN   3.0   NaN   NaN   NaN   NaN  ...   
6037      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
6038      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
6039      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   
6040      3.0   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  ...   

MovieId  3943  3944  3945  3946  3947  3948  3949  3950  3951  3952  
UserId                                                               
1         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
2         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
3         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
4         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
5         NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
...       ...   ...   ...   ...   ...   ...   ...   ...   ...   ...  
6036      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
6037      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
6038      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
6039      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN  
6040      NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN

Now, given a row index (e.g. 1), I am looking for a way to select all column indices whose values are > 4.0. Is there a simple way to do so? I tried the following:

df.loc[1] >= 4.0

However, what I get is:

MovieId
1        True
2       False
3       False
4       False
5       False
        ...  
3948    False
3949    False
3950    False
3951    False
3952    False
Name: 1, Length: 3706, dtype: bool

meaning I am almost there, but not quite. How do I extract the indices corresponding to True?

Answer 1

You can chain two loc selections: the first selects the row by label, and the second uses a function to subset the columns based on your condition. Alternatively, use a single loc call whose column indexer is a boolean mask that is itself built with .loc.

import numpy as np
import pandas as pd

np.random.seed(42)
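# np.nan is the spelling that works across NumPy versions (the np.NaN alias was removed in NumPy 2.0)
df = pd.DataFrame(np.random.choice([1, np.nan, 5], p=[.2, .7, .1], size=(2, 40)))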

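# Chain two .loc calls: select row 1, then keep the entries >= 4
df.loc[1].loc[lambda x: x >= 4]
# or use a single .loc with a boolean mask over the columns
df.loc[1, df.loc[1] >= 4]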

#3     5.0
#10    5.0
#12    5.0
#15    5.0
#29    5.0
#Name: 1, dtype: float64
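
The selections above return the matching values together with their labels. Since the question asks for the indices themselves, one possible follow-up is to take `.index` of the filtered row. The sketch below uses a small made-up frame (`ratings` and its values are hypothetical stand-ins, not the asker's data) and relies on the fact that comparisons against NaN evaluate to False, so missing entries drop out on their own.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pivot table: rows are UserId, columns are MovieId.
ratings = pd.DataFrame(
    [[5.0, np.nan, 4.0, np.nan],
     [np.nan, 2.0, np.nan, 3.0]],
    index=pd.Index([1, 2], name="UserId"),
    columns=pd.Index([1, 2, 3, 4], name="MovieId"),
)

row = ratings.loc[1]   # one user's row as a Series indexed by MovieId
mask = row >= 4.0      # NaN >= 4.0 is False, so missing ratings are excluded

# Filter the row with the mask, then keep only the labels of what survives.
movie_ids = row[mask].index.tolist()
print(movie_ids)  # [1, 3]
```

If an Index object is more convenient than a list, dropping `.tolist()` leaves `row[mask].index` as is.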

Solution 1
2 Accepted 2020-04-29 16:17:13
