I have the following dataframe:
import numpy as np
import pandas as pd
data = np.random.randn(5, 5)  # standard normal draws, so some values are negative
df = pd.DataFrame(data, index = list('abcde'), columns = list('ABCDE'))
df = df[df>0]
df
          A         B         C         D   E
a       NaN  2.038740  1.371158       NaN NaN
b  0.575567       NaN  0.462007       NaN NaN
c  0.984802  0.049818  0.129836       NaN NaN
d       NaN       NaN       NaN       NaN NaN
e  0.789563  1.846402       NaN  0.340902 NaN
I want to get all the (index, col_name, value) tuples for the non-NaN data. How do I do it?
My expected result is:
[('b','A', 0.575567), ('c', 'A', 0.984802), ('e', 'A', 0.789563),...]
You can stack the DataFrame, which drops NaN values by default, and then reset the index so that the row and column labels become ordinary columns. From there it is easy to convert to a list of tuples:
[tuple(r) for r in df.stack().reset_index().values]
# [('a', 'B', 2.03874),
# ('a', 'C', 1.371158),
# ('b', 'A', 0.575567),
# ('b', 'C', 0.46200699999999995),
# ('c', 'A', 0.9848020000000001),
# ('c', 'B', 0.049818),
# ('c', 'C', 0.12983599999999998),
# ('e', 'A', 0.789563),
# ('e', 'B', 1.846402),
# ('e', 'D', 0.340902)]
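Note that the expected result in the question lists the tuples column by column, while stack() walks the frame row by row. The following is a minimal sketch of one way to get the column-first order, assuming the sample values shown in the question (the names `stacked` and `result` are illustrative only). The explicit dropna() is a precaution: newer pandas releases offer a reworked stack (the future_stack=True option) that keeps NaN entries present in the input, so dropping them explicitly should give the same result either way.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the values shown in the question.
df = pd.DataFrame(
    [
        [np.nan, 2.038740, 1.371158, np.nan, np.nan],
        [0.575567, np.nan, 0.462007, np.nan, np.nan],
        [0.984802, 0.049818, 0.129836, np.nan, np.nan],
        [np.nan, np.nan, np.nan, np.nan, np.nan],
        [0.789563, 1.846402, np.nan, 0.340902, np.nan],
    ],
    index=list("abcde"),
    columns=list("ABCDE"),
)

# Transpose so that stacking walks column by column; dropna() makes the
# NaN removal explicit instead of relying on stack's default behaviour.
stacked = df.T.stack().dropna()

# After the transpose the index entries are (column, row), so swap them
# back to get (row, column, value) tuples.
result = [(row, col, value) for (col, row), value in stacked.items()]
```

Here `result` starts with the 'A' column entries for rows 'b', 'c' and 'e', matching the order asked for in the question.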
Or use the DataFrame's to_records() method:
list(df.stack().reset_index().to_records(index = False))
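One caveat: the elements produced by to_records() are NumPy record objects rather than plain Python tuples. If plain tuples are needed, a small sketch like the one below may help. It uses a tiny made-up 2x2 frame (not the data from the question) and shows two options: calling tolist() on the record array, or skipping records entirely with itertuples(index=False, name=None).

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; the values are made up for this example.
df = pd.DataFrame(
    [[np.nan, 2.0], [0.5, np.nan]],
    index=list("ab"),
    columns=list("AB"),
)

# Stack, drop NaN explicitly, and turn the index levels into columns.
flat = df.stack().dropna().reset_index()

# Option 1: record array -> list of plain Python tuples.
as_tuples = flat.to_records(index=False).tolist()

# Option 2: iterate over rows as plain tuples, no record array involved.
via_itertuples = list(flat.itertuples(index=False, name=None))
```

Both lists should hold the same (index, col_name, value) tuples; which one to prefer is mostly a matter of taste.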