
How to combine two histograms in Python

male[['Gender','Age']].plot(kind='hist', x='Gender', y='Age', bins=50)
female[['Gender','Age']].plot(kind='hist', x='Gender', y='Age', bins=50)

I used data from a file to create two histograms of age, one per gender. I split the data by gender up front so that I could plot each group on its own. Now I'm having a hard time putting the two histograms together.

As mentioned in the comment, you can use matplotlib for this task. I haven't figured out how to plot the two histograms with pandas alone, though (I would like to see how people have done that).

import matplotlib.pyplot as plt
import random

# example data
age = [random.randint(20, 40) for _ in range(100)]
sex = [random.choice(['M', 'F']) for _ in range(100)]

# just give a list of age of male/female and corresponding color here
plt.hist([[a for a, s in zip(age, sex) if s=='M'], 
          [a for a, s in zip(age, sex) if s=='F']], 
         color=['b','r'], alpha=0.5, bins=10)
plt.show()
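For the pandas side of this, one option is to draw both groups onto the same matplotlib Axes through the `ax=` argument of `Series.plot`. The following is only a minimal sketch under stated assumptions: the DataFrame `df`, its random contents, the fixed seed, and the bin edges are all made up for illustration and are not the asker's data.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

random.seed(0)

# made-up example data (assumption: columns named 'Age' and 'Gender')
df = pd.DataFrame({
    "Age": [random.randint(20, 40) for _ in range(100)],
    "Gender": [random.choice(["M", "F"]) for _ in range(100)],
})

male = df[df["Gender"] == "M"]
female = df[df["Gender"] == "F"]

# shared bin edges so both histograms use the same intervals
bins = list(range(20, 42, 2))

# draw both histograms on one Axes by passing the same ax to each call
fig, ax = plt.subplots()
male["Age"].plot(kind="hist", bins=bins, alpha=0.5, color="b", label="male", ax=ax)
female["Age"].plot(kind="hist", bins=bins, alpha=0.5, color="r", label="female", ax=ax)
ax.set_xlabel("Age")
ax.legend(loc="upper right")
```

Remove the `matplotlib.use("Agg")` line and call `plt.show()` to see the figure interactively. Unlike the list form of `plt.hist` above, which places the bars side by side, this overlays the two histograms, so the `alpha` value is what keeps both visible.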

Consider converting the dataframes to a two-column NumPy array, since matplotlib's hist works with this structure rather than with two pandas dataframes of different lengths that contain non-numeric columns. Pandas' join is used to bind the two columns, MaleAge and FemaleAge.

Here, the Gender indicator is dropped, and the two series are labeled manually according to the column order.

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

...
# RESET INDEX AND RENAME COLUMN AFTER SUBSETTING
male = df2[df2['Gender'] == "M"].reset_index(drop=True).rename(columns={'Age':'MaleAge'})
female = df2[df2['Gender'] == "F"].reset_index(drop=True).rename(columns={'Age':'FemaleAge'})

# OUTER JOIN TO ACHIEVE SAME LENGTH
gendermat = np.array(male[['MaleAge']].join(female[['FemaleAge']], how='outer'))

plt.hist(gendermat, bins=50, label=['male', 'female'])
plt.legend(loc='upper right')
plt.show()
plt.clf()
plt.close()
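To make the effect of the outer join concrete, here is a small sketch with a hypothetical `df2` (the answer does not show its contents, so the five rows below are invented). It suggests that when the groups differ in size, the shorter column is padded with NaN, and it shows one possible way to get per-group arrays without the padding.

```python
import numpy as np
import pandas as pd

# hypothetical stand-in for df2, which the answer does not show
df2 = pd.DataFrame({
    "Gender": ["M", "F", "M", "M", "F"],
    "Age": [23, 31, 35, 28, 40],
})

# same subsetting, index reset, and renaming as in the answer
male = df2[df2["Gender"] == "M"].reset_index(drop=True).rename(columns={"Age": "MaleAge"})
female = df2[df2["Gender"] == "F"].reset_index(drop=True).rename(columns={"Age": "FemaleAge"})

# outer join on the reset index: 3 male rows, 2 female rows -> 3 rows in total
joined = male[["MaleAge"]].join(female[["FemaleAge"]], how="outer")
gendermat = np.array(joined)
print(joined)  # the last FemaleAge entry is NaN

# alternative without padding: one array per group, NaN rows dropped
groups = [joined[col].dropna().to_numpy() for col in joined.columns]
```

The `groups` list can be passed to `plt.hist` in the same way as the list of lists in the first answer, which sidesteps the question of how NaN values in the matrix are binned.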
