Get the index of rows that match a certain value across the whole DataFrame?

I have a pandas DataFrame of floats.

Sample data:

       -100   -99   -98   ...     0     1     2     3
-100   0.00  0.00  7.21   ...    99  0.00    99    99
-99    0.00  0.00  7.21   ...    99  0.00    99    99
-98    3.55  3.55  7.21   ...    99  0.00    99    99
...
0      6.55  7.21  7.21   ...    14  0.00    99  0.00
1      6.55  7.21  7.21   ...    14  0.00  0.00  0.00
2      6.55  7.21  7.21   ...    14  0.00  0.00  0.00
3      6.55  7.21  7.21   ...    14  0.00  0.00  0.00

The column names are integers:

       df.columns 
[out]: Int64Index([-100,  -99,  -98,  -97,  -96,  -95,  -94,  -93,  -92,  -91,
        ...
          91,   92,   93,   94,   95,   96,   97,   98,   99,  100],
       dtype='int64', length=201)

The same goes for the index:

       df.index 
[out]: Int64Index([-100,  -99,  -98,  -97,  -96,  -95,  -94,  -93,  -92,  -91,
        ...
          91,   92,   93,   94,   95,   96,   97,   98,   99],
       dtype='int64', length=200)

I'm trying to get the columns and index labels where the highest value (99) occurs in this DataFrame. For the columns I used:

       columns_with_value = df.columns[(df == df.max().max()).iloc[0]]
       list(columns_with_value)
[out]: [0, 2, 3]

and it works correctly (I checked manually in the DataFrame).

I would like to get the same kind of output for the index.

I tried:

index = df[df == df.max().max()].index.values.astype(int)

But it returns every index label from -100 to 99, which is not correct: there are rows that do not contain the maximum value.

I also tried specifying the columns, as in the most typical examples:

df.loc[df[columns_with_value] == df.max().max()]

And it raises ValueError: Cannot index with multidimensional key

The correct output for the sample data would be:

[out]: [-100, -99, -98, 0] 

You can use stack (note that the expression below yields a single row label, not all of them):

idx = df[df == df.max().max()].stack().index[0][0]
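
To collect every matching row label rather than one, a possible variation is to stack the boolean mask itself and keep the True entries. This is only a sketch: the small frame below is a made-up stand-in for the question's data, and the names mask, hits, rows and cols are chosen here for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame (values invented for illustration)
df = pd.DataFrame(
    [[0.00, 99.0, 0.00, 99.0],
     [3.55, 99.0, 0.00, 99.0],
     [6.55, 14.0, 0.00, 0.00],
     [6.55, 14.0, 99.0, 0.00]],
    index=[-100, -99, 0, 1],
    columns=[-100, 0, 1, 2],
)

mask = df == df.max().max()   # boolean frame: True where a cell equals the global maximum
stacked = mask.stack()        # Series indexed by (row label, column label)
hits = stacked[stacked]       # keep only the True cells

# Level 0 holds row labels, level 1 holds column labels
rows = hits.index.get_level_values(0).unique().tolist()
cols = sorted(hits.index.get_level_values(1).unique().tolist())
```

Stacking the boolean mask instead of the masked frame avoids relying on stack dropping NaN cells.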

You can use np.where, if I understand you correctly:

import numpy as np

r, c = np.where(df == df.to_numpy().max())

This returns the positional row and column indices of every cell in the DataFrame that equals the maximum (99 here).

Then use

indx = df.index[r]
cols = df.columns[c]

to get the corresponding labels. You can also zip them to get (row label, column label) pairs.

coords = list(zip(indx, cols))
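
Because np.where reports one entry per matching cell, a row with several maxima appears several times in r. If only the distinct row labels are wanted, as in the question's expected output, one option is to deduplicate the positions first. A minimal sketch, again on a made-up frame standing in for the real data (row_labels and col_labels are names introduced here):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame (values invented for illustration)
df = pd.DataFrame(
    [[0.00, 99.0, 0.00, 99.0],
     [3.55, 99.0, 0.00, 99.0],
     [6.55, 14.0, 0.00, 0.00],
     [6.55, 14.0, 99.0, 0.00]],
    index=[-100, -99, 0, 1],
    columns=[-100, 0, 1, 2],
)

values = df.to_numpy()
r, c = np.where(values == values.max())   # positions of every cell equal to the maximum

# np.unique sorts and deduplicates the positions before mapping them back to labels
row_labels = df.index[np.unique(r)].tolist()
col_labels = df.columns[np.unique(c)].tolist()
```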
