
How to extract index/column/data from a Pandas DataFrame based on a logical condition?

I have the following dataframe:

import numpy as np
import pandas as pd
data = np.random.randn(5, 5)  # standard normal draws, so some values are negative
df = pd.DataFrame(data, index=list('abcde'), columns=list('ABCDE'))
df = df[df > 0]  # non-positive values become NaN
df
          A         B         C         D   E
a       NaN  2.038740  1.371158       NaN NaN
b  0.575567       NaN  0.462007       NaN NaN
c  0.984802  0.049818  0.129836       NaN NaN
d       NaN       NaN       NaN       NaN NaN
e  0.789563  1.846402       NaN  0.340902 NaN

I want to get all the (index, col_name, value) triples for the non-NaN data. How do I do it?

My expected result is:

[('b','A', 0.575567), ('c', 'A', 0.984802), ('e', 'A', 0.789563),...]

You can stack the DataFrame, which by default drops NaN values, and then reset the index so that the row and column labels become regular columns. From there it is easy to convert the rows to a list of tuples:

[tuple(r) for r in df.stack().reset_index().values]

# [('a', 'B', 2.03874),
#  ('a', 'C', 1.371158),
#  ('b', 'A', 0.575567),
#  ('b', 'C', 0.46200699999999995),
#  ('c', 'A', 0.9848020000000001),
#  ('c', 'B', 0.049818),
#  ('c', 'C', 0.12983599999999998),
#  ('e', 'A', 0.789563),
#  ('e', 'B', 1.846402),
#  ('e', 'D', 0.340902)]
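
For a reproducible check, here is a minimal sketch of the same idea on a small hand-built frame (the sample values are made up for illustration and are not the data from the question). It calls dropna() explicitly after stack(), on the assumption that some newer pandas versions may keep NaN entries when stacking, and it uses itertuples(index=False, name=None) to get plain Python tuples.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the frame from the question
df = pd.DataFrame(
    {"A": [np.nan, 0.5], "B": [2.0, np.nan]},
    index=["a", "b"],
)

# stack() moves the column labels into the index; the explicit dropna()
# guards against pandas versions whose stack() keeps NaN entries
stacked = df.stack().dropna()

# reset_index() turns both index levels into ordinary columns;
# itertuples(index=False, name=None) yields plain tuples
result = list(stacked.reset_index().itertuples(index=False, name=None))
print(result)
# [('a', 'B', 2.0), ('b', 'A', 0.5)]
```

The explicit dropna() is harmless when stack() has already dropped the missing values, so the sketch should behave the same either way.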

Or use the DataFrame's to_records() method, which returns a NumPy record array:

list(df.stack().reset_index().to_records(index = False))
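
Note that wrapping the record array in list() gives NumPy record objects rather than plain tuples. If plain tuples are needed, one option is to call tolist() on the record array. The sketch below assumes a small made-up frame, not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame(
    {"A": [np.nan, 0.5], "B": [2.0, np.nan]},
    index=["a", "b"],
)

# Build the record array; dropna() is explicit in case stack() keeps NaN
records = df.stack().dropna().reset_index().to_records(index=False)

# tolist() converts each record into a plain Python tuple
as_tuples = records.tolist()
print(as_tuples)
```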

