Pandas correlation matrix not incorporating all columns in csv file

Question

I have a csv file consisting of 12 data columns and want to create a correlation matrix with them. However, when doing this using pandas, only 4 (seemingly random) columns are incorporated. Any ideas why the remaining columns don't make it into the correlation matrix?

import pandas as pd

d = pd.read_csv('national_raw_convictions.csv')
cm = d.corr().abs()  # absolute pairwise correlations
cm.to_csv('national_raw_convictions_correlation.csv')

I have attached a screenshot of both the input (left) and output (right) csv files referenced.

Answer 1

There isn't enough information here to be sure.

My guess would be that the missing columns have an object data type. When reading data, pandas tries its best to infer each column's data type, but if a column contains a mix of numbers and strings, its data type will be 'object'. The correlation is only computed for numeric columns, so object columns are left out of the result. To check the dataframe's data types, you can run d.dtypes.
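As a rough sketch of that check, assuming the cause really is a stray string in an otherwise numeric column: the column names and values below are invented for illustration and are not taken from the asker's file. The example inspects the dtypes and then coerces every column with pd.to_numeric, which turns unparseable entries into NaN.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV; the "unknown" entry is what
# keeps the acquittals column from being read as numeric.
csv_text = """year,convictions,acquittals,rate
2015,120,30,0.80
2016,135,unknown,0.82
2017,150,41,0.79
2018,160,38,0.81
"""

d = pd.read_csv(io.StringIO(csv_text))
print(d.dtypes)  # the acquittals column is not a numeric dtype

# Coerce every column to numeric; values that cannot be parsed become NaN.
numeric = d.apply(pd.to_numeric, errors="coerce")

# corr() skips NaN values pairwise, so all four columns now appear.
cm = numeric.corr().abs()
print(cm)
```

Coercing silently discards the offending entries, so it is worth looking at which values became NaN before trusting the correlations.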

Hope this helps.

solution1
0 2020-04-05 17:04:32
