
Python: How do I write a function to determine which variable in a dataframe has the highest absolute correlation with a specified column?

I would like to write a function to determine which variable in a dataframe has the highest absolute correlation with a specific column. However, I am having difficulty getting the column name from the correlation matrix.

Say that my data, df, is as follows:

address     size  rent_price  number_of_bathrooms  number_of_rooms
East          12        3400                    2                4
North East    99        4200                    4                4
South         99        4000                    5                5

I use ab_col_matrix = abs(df.corr()) to generate the absolute correlation matrix, which looks something like the following, with the column names along the top and the left-hand side:

1 value value value 
value 1 value value 
value value 1 value 
value value value 1 

Say that I am interested in the column most highly correlated with the size column. My idea is to sort the matrix by that column, take the first row, and return the name of the column with the highest value.

So I tried sorted = ab_col_matrix.sort_values('size', ascending=False)

Then I tried to pick the highest one with sorted['size'][1], but that only returns the value itself, not the column name, and I am puzzled as to how to access the name. I used [1] because [0] would return 1, which is the correlation of the column with itself.

I would very much appreciate any help where I could gain more knowledge as to how to achieve this.

You can simply select the column for the variable you want and then sort the rows:

ab_col_matrix['size'].sort_values(ascending=False)

size                   1.000000
rent_price             0.970725
number_of_bathrooms    0.944911
number_of_rooms        0.500000
Name: size, dtype: float64

You can then select the name of the most highly correlated column with the following (index[0] is size itself, so index[1] is used):

ab_col_matrix['size'].sort_values(ascending=False).index[1]

'rent_price'
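Wrapped up as a function, the steps above could look like the sketch below. The function name most_correlated is hypothetical. Two choices here differ from the snippets above and are assumptions, not part of the original answer: non-numeric columns such as address are filtered out with select_dtypes before calling corr(), since recent pandas versions may refuse to correlate text columns, and the target's own entry is dropped by label and idxmax() is used, so the result does not rely on the target sorting into position 0.

```python
import pandas as pd


def most_correlated(df, target):
    """Return the name of the column with the highest absolute correlation to `target`."""
    # Keep numeric columns only; text columns such as `address` cannot be correlated.
    corr = df.select_dtypes(include="number").corr().abs()
    # Drop the target's own entry (always 1.0), then return the label of the largest value.
    return corr[target].drop(target).idxmax()


# Sample data from the question.
df = pd.DataFrame({
    "address": ["East", "North East", "South"],
    "size": [12, 99, 99],
    "rent_price": [3400, 4200, 4000],
    "number_of_bathrooms": [2, 4, 5],
    "number_of_rooms": [4, 4, 5],
})

print(most_correlated(df, "size"))  # prints rent_price
```

If the target column is not numeric, corr[target] raises a KeyError, which is probably the behaviour you want in that case.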

