How to select the rows that contain a specific value in at least one of the elements in a row?

I have a DataFrame DF and a list, say List1. List1 is created from DF and contains the elements present in DF, without repetitions. I need to do the following:
1. Select the rows of DF that contain a specific element from List1 (for instance, while iterating over all the elements in List1).
2. Re-index them from 0 up to the number of selected rows, because the selected rows may have non-contiguous indices.

SAMPLE INPUT:

List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']
Sample DF
  EQ1      EQ2      EQ3
0 Apple    Orange   NaN
1 Banana   Potato   NaN
2 Pear     Tomato   Pineapple
3 Apple    Tomato   Pear
4 Tomato   Potato   Banana

Now if I want access to the rows that contain Apple, those would be 0 and 3, but I'd like them re-indexed as 0 and 1. After Apple is searched, the next element from List1 should be taken and the same steps carried out. I have other operations to perform after this, so I need to loop the whole process over List1. I hope I have explained it well. Here is my code snippet for this, which is not working:

for eq in List1:
    MCS=DF.loc[MCS_Simp_green[:] ==eq] #Indentation was missing
    MCS= MCS.reset_index(drop=True)
    <Remaining operations>

I think you need isin with any:

List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']

for eq in List1:
    #print(df.isin([eq]).any(axis=1))
    #print(df[df.isin([eq]).any(axis=1)])
    # keep rows where any column equals eq, then renumber from 0
    df1 = df[df.isin([eq]).any(axis=1)].reset_index(drop=True)
    print(df1)

     EQ1     EQ2   EQ3
0  Apple  Orange   NaN
1  Apple  Tomato  Pear
     EQ1     EQ2  EQ3
0  Apple  Orange  NaN
      EQ1     EQ2     EQ3
0  Banana  Potato     NaN
1  Tomato  Potato  Banana
    EQ1     EQ2        EQ3
0  Pear  Tomato  Pineapple
     EQ1     EQ2        EQ3
0   Pear  Tomato  Pineapple
1  Apple  Tomato       Pear
      EQ1     EQ2        EQ3
0    Pear  Tomato  Pineapple
1   Apple  Tomato       Pear
2  Tomato  Potato     Banana
      EQ1     EQ2     EQ3
0  Banana  Potato     NaN
1  Tomato  Potato  Banana
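To make the loop above reproducible, here is a minimal sketch that rebuilds the sample frame from the question and shows the intermediate steps for a single value. The constructor call is my assumption about how df was created, and the names mask, apple_rows and apple_reset are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (the construction itself is an assumption).
df = pd.DataFrame({
    'EQ1': ['Apple', 'Banana', 'Pear', 'Apple', 'Tomato'],
    'EQ2': ['Orange', 'Potato', 'Tomato', 'Tomato', 'Potato'],
    'EQ3': [np.nan, np.nan, 'Pineapple', 'Pear', 'Banana'],
})

# isin gives a True/False frame; any(axis=1) collapses it to one flag per row.
mask = df.isin(['Apple']).any(axis=1)

# Boolean indexing keeps the original labels ...
apple_rows = df[mask]

# ... and reset_index(drop=True) renumbers them from 0.
apple_reset = apple_rows.reset_index(drop=True)

print(mask.tolist())               # [True, False, False, True, False]
print(apple_rows.index.tolist())   # [0, 3]
print(apple_reset.index.tolist())  # [0, 1]
```

Passing drop=True matters here: without it the old labels 0 and 3 would be kept as an extra column named index.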

For storing values you can use a dict comprehension:

dfs = {eq: df[df.isin([eq]).any(axis=1)].reset_index(drop=True) for eq in List1}

print(dfs['Apple'])
     EQ1     EQ2   EQ3
0  Apple  Orange   NaN
1  Apple  Tomato  Pear

print(dfs['Orange'])
     EQ1     EQ2  EQ3
0  Apple  Orange  NaN
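If you would rather keep the selection in one place for the remaining operations, a small helper is one option. This is only a sketch: rows_with is a hypothetical name, and it uses DataFrame.eq instead of isin, which should behave the same way when you compare against a single value. The sample frame is rebuilt here so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (the construction itself is an assumption).
df = pd.DataFrame({
    'EQ1': ['Apple', 'Banana', 'Pear', 'Apple', 'Tomato'],
    'EQ2': ['Orange', 'Potato', 'Tomato', 'Tomato', 'Potato'],
    'EQ3': [np.nan, np.nan, 'Pineapple', 'Pear', 'Banana'],
})


def rows_with(frame, value):
    """Hypothetical helper: rows of frame where any cell equals value, renumbered from 0."""
    # eq compares every cell with the scalar; NaN cells compare as False.
    mask = frame.eq(value).any(axis=1)
    return frame[mask].reset_index(drop=True)


tomato = rows_with(df, 'Tomato')
print(tomato.index.tolist())   # [0, 1, 2]
print(tomato['EQ1'].tolist())  # ['Pear', 'Apple', 'Tomato']
```

For matching several values at once, isin with a longer list is still the more direct choice.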

You can identify the items in the list and collect the resulting new DataFrames like so:

data_frames = {}
for l in List1:
    data_frames[l] = df[df.isin([l]).any(axis=1)].reset_index(drop=True)
    print(l, data_frames[l].index.tolist())

to get:

Apple [0, 1]
Orange [0]
Banana [0, 1]
Pineapple [0]
Pear [0, 1]
Tomato [0, 1, 2]
Potato [0, 1]

The new DataFrame objects are contained in the dictionary data_frames:

data_frames['Apple']

     EQ1     EQ2   EQ3
0  Apple  Orange   NaN
1  Apple  Tomato  Pear
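The question mentions that List1 is built from the values in DF without repetitions. In case it helps, here is a rough sketch of deriving such a list directly from the frame and then filling the dictionary from it. The names flat and values are mine, the order of values follows first appearance in the frame (so it may differ from the hand-written List1), and the frame construction is again an assumption.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (the construction itself is an assumption).
df = pd.DataFrame({
    'EQ1': ['Apple', 'Banana', 'Pear', 'Apple', 'Tomato'],
    'EQ2': ['Orange', 'Potato', 'Tomato', 'Tomato', 'Potato'],
    'EQ3': [np.nan, np.nan, 'Pineapple', 'Pear', 'Banana'],
})

# Flatten all cells row by row, drop the NaN cells, keep first occurrences only.
flat = df.to_numpy().ravel()
values = pd.unique(flat[pd.notna(flat)]).tolist()

# One re-indexed frame per distinct value, as in the dictionary above.
data_frames = {
    v: df[df.isin([v]).any(axis=1)].reset_index(drop=True)
    for v in values
}

print(len(values))                             # 7
print(data_frames['Potato'].index.tolist())    # [0, 1]
```

Because every entry of values comes from the frame itself, none of the resulting DataFrames can be empty.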
