
Pandas correlation matrix not incorporating all columns in csv file

I have a csv file consisting of 12 data columns and want to create a correlation matrix from them. However, when I do this with pandas, only 4 (seemingly random) columns are included. Any ideas why the remaining columns don't make it into the correlation matrix?

import pandas as pd

d = pd.read_csv('national_raw_convictions.csv')
cm = d.corr().abs()
cm.to_csv('national_raw_convictions_correlation.csv')

I have attached a screenshot of both the input (left) and output (right) csv files referenced.

There is not enough information here to be sure.

My guess is that the missing columns have an object data type. When reading data, pandas tries its best to infer each column's data type, but if for some reason a column contains both numbers and strings, its dtype will be 'object'. Such columns are not numeric, so they would be left out of the correlation matrix. To check the DataFrame's data types, you can run d.dtypes.
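If that guess is right, something along these lines might help track it down. This is only a sketch: the inline data and the column names a, b, c are made up for illustration (they are not from your file), and coercing with pd.to_numeric(errors="coerce") is one possible fix that turns unparsable cells into NaN, which may or may not be what you want.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real csv: column 'b' has one stray string.
csv_text = "a,b,c\n1,2,3\n2,missing,5\n3,6,8\n4,8,9\n"
d = pd.read_csv(io.StringIO(csv_text))

# Inspect what pandas inferred; 'b' is not numeric (object or string dtype).
print(d.dtypes)

# List the columns that would not take part in a numeric correlation.
non_numeric = d.select_dtypes(exclude="number").columns.tolist()
print("non-numeric columns:", non_numeric)

# Assumption: non-parsable cells can be treated as missing values (NaN).
d_numeric = d.apply(pd.to_numeric, errors="coerce")

# All columns are numeric now, so all of them appear in the matrix.
cm = d_numeric.corr().abs()
print(cm)
```

Before coercing, it is worth looking at the offending values (for example d[non_numeric].head()) to see whether they are thousands separators, units, or placeholder text, since each calls for a different cleanup.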

Hope this helps.
