How to select the rows that contain a specific value in at least one of the elements in a row?
I have a DataFrame DF and a list, say List1. List1 is created from DF and contains the elements present in DF, without repetitions. I need to do the following:
1. Select the rows of DF that contain a specific element from List1 (for instance, while iterating over all the elements in List1).
2. Re-index them from 0 up to the number of selected rows, because the selected rows may have non-contiguous indices.
SAMPLE INPUT:
List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']
Sample DF
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Banana Potato NaN
2 Pear Tomato Pineapple
3 Apple Tomato Pear
4 Tomato Potato Banana
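For anyone who wants to reproduce this, the sample input could be built roughly as follows. This is only a sketch: the values come from the table above, but the construction itself, and the way List1 is derived from DF, are assumptions, since the actual code that built them is not shown.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; NaN marks the empty cells.
DF = pd.DataFrame({
    "EQ1": ["Apple", "Banana", "Pear", "Apple", "Tomato"],
    "EQ2": ["Orange", "Potato", "Tomato", "Tomato", "Potato"],
    "EQ3": [np.nan, np.nan, "Pineapple", "Pear", "Banana"],
})

# One possible way to collect the distinct non-missing values
# (assumption: the question does not say how List1 was really created).
# The order differs from the List1 shown above, but the elements are the same.
List1 = [v for v in pd.unique(DF.to_numpy().ravel()) if pd.notna(v)]
print(List1)
```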
Now, if I want the rows that contain Apple, those would be rows 0 and 3, but I'd like them re-indexed as 0 and 1. After Apple has been searched, the next element from List1 should be taken and the same steps carried out. I have other operations to perform after this, so I need to loop the whole process over List1. I hope I have explained it well. Here is my code snippet, which is not working:
for eq in List1:
    MCS = DF.loc[MCS_Simp_green[:] == eq]  # Indentation was missing
    MCS = MCS.reset_index(drop=True)
    <Remaining operations>
I think you need isin with any:
List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']
for eq in List1:
    # print(df.isin([eq]).any(axis=1))
    # print(df[df.isin([eq]).any(axis=1)])
    df1 = df[df.isin([eq]).any(axis=1)].reset_index(drop=True)
    print(df1)
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Apple Tomato Pear
EQ1 EQ2 EQ3
0 Apple Orange NaN
EQ1 EQ2 EQ3
0 Banana Potato NaN
1 Tomato Potato Banana
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
1 Apple Tomato Pear
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
1 Apple Tomato Pear
2 Tomato Potato Banana
EQ1 EQ2 EQ3
0 Banana Potato NaN
1 Tomato Potato Banana
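To see why this works, it may help to split the one-liner into its steps for a single value. The sketch below rebuilds the sample frame (the construction is an assumption; only the values come from the question) and uses hypothetical names cell_hits and row_mask for the intermediate results.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (construction assumed).
df = pd.DataFrame({
    "EQ1": ["Apple", "Banana", "Pear", "Apple", "Tomato"],
    "EQ2": ["Orange", "Potato", "Tomato", "Tomato", "Potato"],
    "EQ3": [np.nan, np.nan, "Pineapple", "Pear", "Banana"],
})

# Step 1: boolean frame with the same shape as df, True where a cell is "Apple".
cell_hits = df.isin(["Apple"])

# Step 2: collapse each row to a single boolean, True if any cell matched.
row_mask = cell_hits.any(axis=1)

# Step 3: keep the matching rows and renumber them from 0.
selected = df[row_mask].reset_index(drop=True)

print(row_mask.tolist())        # [True, False, False, True, False]
print(selected.index.tolist())  # [0, 1]
```

NaN cells simply compare as non-matching here, so the missing values in EQ3 do not need special handling.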
For storing the results you can use a dict comprehension:
dfs = {eq: df[df.isin([eq]).any(axis=1)].reset_index(drop=True) for eq in List1}
print(dfs['Apple'])
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Apple Tomato Pear
print(dfs['Orange'])
EQ1 EQ2 EQ3
0 Apple Orange NaN
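Since you mention further operations for every element, one option is to loop over the stored frames afterwards. The following is a minimal sketch under assumptions: the sample frame is rebuilt by hand, and counting the rows (row_counts, a hypothetical name) merely stands in for whatever the remaining operations are.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (construction assumed).
df = pd.DataFrame({
    "EQ1": ["Apple", "Banana", "Pear", "Apple", "Tomato"],
    "EQ2": ["Orange", "Potato", "Tomato", "Tomato", "Potato"],
    "EQ3": [np.nan, np.nan, "Pineapple", "Pear", "Banana"],
})
List1 = ['Apple', 'Orange', 'Banana', 'Pineapple', 'Pear', 'Tomato', 'Potato']

# One re-indexed sub-frame per element of List1.
dfs = {eq: df[df.isin([eq]).any(axis=1)].reset_index(drop=True) for eq in List1}

row_counts = {}
for eq, sub in dfs.items():
    # Placeholder for the remaining operations: here we only count the rows.
    row_counts[eq] = len(sub)

print(row_counts)
```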
You can look up each item of the list and collect the resulting new DataFrames like so:
data_frames = {}
for l in List1:
    data_frames[l] = df[df.isin([l]).any(axis=1)].reset_index(drop=True)
    print(l, data_frames[l].index.tolist())
to get:
Apple [0, 1]
Orange [0]
Banana [0, 1]
Pineapple [0]
Pear [0, 1]
Tomato [0, 1, 2]
Potato [0, 1]
The new DataFrame objects are stored in the dictionary data_frames:
data_frames['Apple']
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Apple Tomato Pear