I have been trying to select a subset of a correlation matrix using the Pandas Python library. For instance, if I had a matrix like
0 A B C
A 1 2 3
B 2 1 4
C 3 4 1
I might want to select a sub-matrix that contains only the correlations among some of the variables in the original matrix, like:
0 A C
A 1 3
C 3 1
To do this, I tried using the following code to slice the original correlation matrix using the names of the desired variables in a list, transpose the correlation matrix, reassign the original column names, and then slice again.
data = pd.read_csv("correlationmatrix.csv")
initial_vertical_axis = pd.DataFrame()
for x in var_list:
    a = data[x]
    initial_vertical_axis = initial_vertical_axis.append(a)
print initial_vertical_axis
initial_vertical_axis = pd.DataFrame(data=initial_vertical_axis, columns=var_list)
initial_matrix = pd.DataFrame()
for x in var_list:
    a = initial_vertical_axis[x]
    initial_matrix = initial_matrix.append(a)
print initial_matrix
However, this returns an empty correlation matrix: the row and column labels are right, but there is no data, like this:
0 A C
A
C
I cannot find the error in my code that would lead to this. If there is a simpler way to go about this, I am open to suggestions.
Suppose data contains your matrix:
In [122]: data
Out[122]:
A B C
0
A 1 2 3
B 2 1 4
C 3 4 1
In [123]: var_list = ['A','C']
In [124]: data.loc[var_list,var_list]
Out[124]:
A C
0
A 1 3
C 3 1
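For anyone who wants to run this end to end, here is a minimal sketch. It assumes the CSV is comma-separated and that its first column holds the row labels; the csv_text string is only a hypothetical stand-in for correlationmatrix.csv. The last part illustrates what is probably behind the empty result in the question: passing an existing DataFrame together with columns= selects columns by label rather than renaming them, so labels that do not exist come back as NaN.

```python
import io

import pandas as pd

# Hypothetical stand-in for correlationmatrix.csv (assumed comma-separated,
# with the row labels in the first column).
csv_text = """0,A,B,C
A,1,2,3
B,2,1,4
C,3,4,1
"""

# index_col=0 turns the first column into the row index, so rows and
# columns carry the same labels.
data = pd.read_csv(io.StringIO(csv_text), index_col=0)

var_list = ["A", "C"]

# Label-based selection of the wanted rows and columns in one step.
subset = data.loc[var_list, var_list]
print(subset)

# Illustration of the likely pitfall: this frame has rows A and C but
# columns 0, 1, 2, similar to what appending data[x] row by row produces
# when the CSV was read without index_col.
rows_only = pd.DataFrame({0: [1, 3], 1: [2, 4], 2: [3, 1]}, index=var_list)

# columns=var_list asks for columns named A and C, which do not exist here,
# so every value in the result is NaN.
relabelled = pd.DataFrame(data=rows_only, columns=var_list)
print(relabelled)
```

If the file is read without index_col=0, the row labels are 0, 1, 2 instead of A, B, C, and the label lookup in data.loc[var_list, var_list] will not find them.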