How to create multiple histograms using pandas?

Question

I have a CSV file with three columns: Full name, Test_A_Score, Test_B_Score. Test_A_Score and Test_B_Score range from 0 to 10. For every unique value of Test_A_Score, I want to create a histogram of the corresponding Test_B_Score values.

test_scores.csv

Full name      Test_A_Score Test_B_Score
Jake Johnson        5            8
Helen Smith         9            6
   .
   .
   .
Jonathan Pierce     3            8

My code so far:

import pandas as pd

df = pd.read_csv('test_scores.csv', delimiter=',', na_values=['-'])

# Get rid of missing scores
df = df[(df['Test_A_Score'] >= 0) & (df['Test_B_Score'] >= 0)]

score_range = range(11)

data = []
for score in score_range:
    scores = df[(df['Test_A_Score'] == score)]['Test_B_Score']
    data.append(scores)

df_hist = pd.DataFrame(data, columns=score_range)

So I thought I would take the Test B scores for each value in score_range, create a new dataframe, insert the data, and plot the histograms of the columns with the following:

import matplotlib.pyplot as plt

plt.figure()
df_hist.hist(color='k', alpha=0.5, bins=20)

The problems are that the scores for each value in score_range don't have the same length, and the data need to be inserted as rows and not as columns like they are in the list named data.

Answer 1

First of all, you should probably use the .dropna() function to get rid of missing values (the scores that were read in as NaN). Next, I think the groupby() function is your friend if you are looking for 'uniqueness'.
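As a rough illustration of those two steps on the column names from the question, here is a minimal sketch. The sample rows are made up (only three of the names come from the question), and the data is read from an in-memory string instead of test_scores.csv so the snippet runs on its own. Note that the groups end up with different lengths, which is fine because they are never forced into one rectangular dataframe.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for test_scores.csv; '-' marks a missing score.
csv_text = """Full name,Test_A_Score,Test_B_Score
Jake Johnson,5,8
Helen Smith,9,6
Ann Lee,5,-
Bob Ray,-,7
Jonathan Pierce,3,8
Mia Wong,5,4
"""

# na_values=['-'] turns the '-' placeholders into NaN, as in the question's read_csv call.
df = pd.read_csv(io.StringIO(csv_text), na_values=['-'])

# Drop every row where either score is missing.
clean = df.dropna(subset=['Test_A_Score', 'Test_B_Score'])

# Collect the Test_B_Score values for each unique Test_A_Score.
b_by_a = {a: group['Test_B_Score'].tolist()
          for a, group in clean.groupby('Test_A_Score')}

for a, b_scores in b_by_a.items():
    print(a, b_scores)
```

Because the columns contain NaN before cleaning, pandas stores the scores as floats, so the group keys here are 3.0, 5.0 and 9.0 rather than integers.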

import pandas as pd
import matplotlib.pyplot as plt

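# Small example frame: column 'a' plays the role of Test_A_Score, 'b' of Test_B_Score.
frame = pd.DataFrame([['euler', 1, 3],
                      ['gauss', 1, 5],
                      ['fibo', 1, 6],
                      ['schwartz', 2, 3],
                      ['helmholtz', 2, 4],
                      ['mandelbrodt', 3, 4]], columns=['Name', 'a', 'b'])

# One subplot per unique value of 'a' (three here).
fig = plt.figure()
ax = [fig.add_subplot(1, 3, i) for i in range(1, 4)]

# groupby yields (group name, sub-frame) pairs; plot the 'b' values of each.
for index, (a, group) in enumerate(frame.groupby('a')):
    ax[index].hist(group.b.values)

plt.show()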

.groupby() returns an object you can iterate over, which yields each group's name together with the group itself. You can then simply plot a histogram of the b values for every group.
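The same idea can be adapted to the question's column names without hard-coding the number of subplots. The following is only a sketch under some assumptions: plot_b_by_a is a hypothetical helper name, the sample rows are invented, and the bin edges assume integer scores from 0 to 10 as stated in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_b_by_a(df, a_col='Test_A_Score', b_col='Test_B_Score'):
    """Draw one histogram of b_col for each unique value of a_col."""
    # Drop missing scores, then split the frame into (value, sub-frame) pairs.
    groups = list(df.dropna(subset=[a_col, b_col]).groupby(a_col))

    # One subplot per group; squeeze=False keeps axes 2-D even for a single group.
    fig, axes = plt.subplots(1, len(groups), sharey=True, squeeze=False,
                             figsize=(3 * len(groups), 3))

    for ax, (a_value, group) in zip(axes[0], groups):
        # Edges at -0.5, 0.5, ..., 10.5 give one bin centered on each integer score.
        ax.hist(group[b_col], bins=[x - 0.5 for x in range(12)])
        ax.set_title(f'{a_col} = {a_value}')
        ax.set_xlabel(b_col)
    return fig, axes[0]


# Hypothetical sample data; the groups deliberately have different sizes.
sample = pd.DataFrame({
    'Full name': ['Jake Johnson', 'Helen Smith', 'Jonathan Pierce', 'Ann Lee'],
    'Test_A_Score': [5, 9, 3, 5],
    'Test_B_Score': [8, 6, 8, 4],
})
fig, axes = plot_b_by_a(sample)
```

Calling plt.show() afterwards displays the figure. pandas also has a built-in shortcut that may be enough here: df.hist(column='Test_B_Score', by='Test_A_Score') draws one histogram per group, although it gives less control over bins and layout than the explicit loop.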
