
How to manipulate a matrix of data in Python?

I'd like to write code that creates a histogram from a matrix of data containing information about movies. The data set (matrix) has several columns; I'm interested in the column with movie release years and the column that says whether each movie passes the Bechdel test (the data set uses "Pass" and "Fail" to indicate whether a movie passed or failed). Knowing the column indices of these two columns (release year and pass/fail), how can I create a histogram of the movies that fail the test, with the x axis containing bins of release years? The bin sizes are not too important; whatever pyplot defaults to would be fine.

What I can do (which is not a lot) is this:

plt.hist(year_by_Test_binary[:,0])

which creates a pretty but meaningless histogram of how many movies were released in each bin of years (the matrix has years in the 0th column).

If you couldn't already tell, I am Python-illiterate and struggling. Any help would be appreciated.

Assuming n is the index of the Bechdel test column and your data is a NumPy array, something like:

plt.hist([matrix[matrix[:,n] == 'Pass', 0], matrix[matrix[:,n] == 'Fail', 0]])

Here plt.hist receives two vectors of years, one for movies that pass and one for movies that fail. It then draws one histogram per category on the same axes, so you can compare the two visually.
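A self-contained sketch of the masking step, using a small made-up array in place of the real data set (the values, the column index n = 1, and the variable names are assumptions for illustration). Since the question asks only for the failing movies, this version plots that subset alone. An array that mixes years and text is stored as strings, so the year column is converted to integers before plotting.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the real data: column 0 = release year, column n = test result.
matrix = np.array([
    ["1995", "Pass"],
    ["1995", "Fail"],
    ["2001", "Fail"],
    ["2008", "Pass"],
    ["2010", "Fail"],
])
n = 1  # assumed index of the Pass/Fail column

# Boolean mask: True for rows whose result is "Fail".
failed = matrix[:, n] == "Fail"

# The mixed array holds strings, so convert the selected years to integers.
fail_years = matrix[failed, 0].astype(int)

# Histogram of failing movies by release year, using pyplot's default bins.
counts, edges, _ = plt.hist(fail_years)
plt.xlabel("Release year")
plt.ylabel("Movies failing the test")
plt.savefig("fail_years.png")
```

If the real matrix is already numeric in column 0 (for example, when the years and results live in separate arrays), the astype(int) call is unnecessary.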

To convert your data to a matrix (a NumPy array), use:

numpy.asarray(data)

To display it, you can use a line plot:

plt.plot(data)

or, for a histogram:

plt.hist(data, bins)

Here bins is the number of bins (or a sequence of bin edges) used to group your data.
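To make the bins argument concrete, here is a minimal sketch using numpy.histogram, which computes the same counts that plt.hist draws. The sample years are made up for illustration.

```python
import numpy as np

# Hypothetical release years, converted from a plain list to an array.
years = np.asarray([1990, 1994, 1999, 2003, 2004, 2010])

# An integer asks for that many equal-width bins across the data range.
counts, edges = np.histogram(years, bins=2)

# A sequence gives the bin edges explicitly (here, one bin per decade).
decade_counts, decade_edges = np.histogram(years, bins=[1990, 2000, 2010, 2020])

print(counts, edges)
print(decade_counts, decade_edges)
```

Passing the same bins value to plt.hist(years, bins=...) should produce bars with these heights.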
