How do I get values from a dataframe based on a list of index labels and column headers?
These are the dataframes I have:
a = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns=['a','b','c'])
referencingDf = pd.DataFrame(['c','c','b'])
Matching on the same index, I am trying to get the following dataframe as output:
outputDf = pd.DataFrame([3,6,8])
So far I have tried the following, but it returns a full 3x3 selection, so I would still need to take the diagonal values. I am pretty sure there is a better way of doing this:
a.loc[referencingDf.index.values, referencingDf[:][0].values]
You need DataFrame.lookup (available in older pandas only; it was deprecated in 1.2 and removed in 2.0):
b = a.lookup(a.index, referencingDf[0])
print (b)
[3 6 8]
df1 = pd.DataFrame({'vals':b}, index=a.index)
print (df1)
   vals
0     3
1     6
2     8
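For pandas versions without lookup, a rough equivalent can be sketched with Index.get_indexer and NumPy integer-array indexing. This is a minimal sketch, assuming every label in referencingDf exists in a (get_indexer returns -1 for a missing label, which would silently select the last row or column); the names row_pos and col_pos are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'])
referencingDf = pd.DataFrame(['c', 'c', 'b'])

# Translate row labels and column labels into integer positions.
row_pos = a.index.get_indexer(referencingDf.index)
col_pos = a.columns.get_indexer(referencingDf[0])

# Assumption: all labels are present, so no position is -1.
assert (row_pos >= 0).all() and (col_pos >= 0).all()

# Pick one value per (row, column) pair.
b = a.to_numpy()[row_pos, col_pos]
df1 = pd.DataFrame({'vals': b}, index=a.index)
print(df1)
```

Converting to a single NumPy array works well here because all columns share one dtype; with mixed dtypes the array would be upcast to object.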
IIUC, you can use df.get_value in a list comprehension (get_value was deprecated in pandas 0.21 and removed in 1.0, so this needs an older version).
vals = [a.get_value(*x) for x in referencingDf.reset_index().values]
# a simplification would be [ ... for x in enumerate(referencingDf[0])] - DYZ
print(vals)
[3, 6, 8]
And then, construct a dataframe.
df = pd.DataFrame(vals)
print(df)
   0
0  3
1  6
2  8
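On pandas versions where get_value is no longer available, the same idea can probably be expressed with the .at scalar accessor. A minimal sketch, assuming referencingDf shares its index with a:

```python
import pandas as pd

a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'])
referencingDf = pd.DataFrame(['c', 'c', 'b'])

# Series.items() yields (index label, column name) pairs;
# .at fetches a single scalar by row label and column label.
vals = [a.at[i, col] for i, col in referencingDf[0].items()]
df = pd.DataFrame(vals)
print(df)
```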
Another way to use list comprehension:
vals = [a.loc[i,j] for i,j in enumerate(referencingDf[0])]
# [3, 6, 8]
Here's one vectorized approach that uses column_index (a helper function, not a pandas built-in) to get the column positions, and then NumPy's advanced indexing to extract the matching value from each row of the dataframe:
In [177]: col_idx = column_index(a, referencingDf.values.ravel())
In [178]: a.values[np.arange(len(col_idx)), col_idx]
Out[178]: array([3, 6, 8])
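The answer does not show how column_index is defined. One possible definition, given here purely as an assumption, maps column names to integer positions with np.searchsorted; it expects every queried name to be present in the dataframe's columns.

```python
import numpy as np
import pandas as pd

def column_index(df, query_cols):
    """Hypothetical helper: integer positions of query_cols in df.columns."""
    cols = df.columns.values
    sidx = np.argsort(cols)  # order that sorts the column names
    # Find each queried name in the sorted view, then map back to
    # positions in the original column order.
    return sidx[np.searchsorted(cols, query_cols, sorter=sidx)]

a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'])
referencingDf = pd.DataFrame(['c', 'c', 'b'])

col_idx = column_index(a, referencingDf.values.ravel())
# Pair row i with column col_idx[i].
out = a.values[np.arange(len(col_idx)), col_idx]
print(out)
```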