Multiple Histograms, each for a label of x-axis, on the same graph matplotlib
How to Iterate over multiple DataFrames and plot histograms for each feature with each data set's feature in the same graph
I have two DataFrames: df1 and df2
df1
Age BsHgt_M BsWgt_Kg GOAT-MBOAT4_F_BM TCF7L2_M_BM UCP2_M_BM
23.0 1.84 113.0 -1.623634 0.321379 0.199183
23.0 1.68 113.9 -1.073523 -0.957523 0.549469
24.0 1.60 86.4 -0.270883 -0.004106 1.479865
20.0 1.59 99.2 -0.218071 0.568458 -0.398410
df2
Age BsHgt_M BsWgt_Kg GOAT-MBOAT4_F_BM TCF7L2_M_BM UCP2_M_BM
29.0 1.94 123.0 -1.623676 0.321379 0.199183
30.0 1.61 113.9 -1.073523 -0.957523 0.549469
44.0 1.30 56.4 -0.270883 -0.004106 1.479865
30.0 1.19 91.2 -0.218071 0.568458 -0.398410
Here I am trying to loop over every column and plot a histogram for each column of df1, which I can do with the following code:
import matplotlib.pyplot as plt

fig, axs = plt.subplots(len(df1.columns), figsize=(10, 50))
for n, col in enumerate(df1.columns):
    df1[col].hist(ax=axs[n], legend=True)
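However, I need to iterate over both DataFrames and plot the histograms so that, for each feature, the histogram from each DataFrame appears in the same plot. Side-by-side histograms on the same scale would also be fine.
Desired plot
Histogram subplots: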
df1['Age'] vs df2['Age']
df1['BsHgt_M'] vs df2['BsHgt_M']
.
.
.
Can anyone tell me how to do this?
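IIUC, you can assign a new column named ID to both DataFrames, which can then be used in the legend to tell the histograms apart. You can then concatenate the DataFrames row-wise with pd.concat. After that, you only need to define the figure and axes, loop over every column except the newly assigned one, and plot the histograms with seaborn, distinguishing them by the assigned variable. In seaborn this distinction is simple to implement: just use the hue parameter.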
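The tagging and concatenation step described above can be sketched in isolation. This is a minimal illustration on a two-column subset of the question's data, not the answer's exact code: it uses DataFrame.assign, which returns tagged copies, as one possible variation on adding the ID column in place.

```python
import pandas as pd

# Tiny illustrative frames (values taken from the Age and BsHgt_M columns above).
df1 = pd.DataFrame({"Age": [23, 23], "BsHgt_M": [1.84, 1.68]})
df2 = pd.DataFrame({"Age": [29, 30], "BsHgt_M": [1.94, 1.61]})

# assign() returns a tagged copy, so df1 and df2 themselves are left unchanged.
combined = pd.concat(
    [df1.assign(ID="df1"), df2.assign(ID="df2")],
    ignore_index=True,  # renumber the rows 0..n-1 after stacking
)

print(combined)
```

The resulting frame has one row per original observation plus an ID column, which is the long layout that the hue parameter expects.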
Possible code:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Note: next time when asking something on SO, please provide data as code like this,
# it makes it easier for the community to replicate your problem and to help you
df1 = pd.DataFrame({
    "Age": [23, 23, 24, 20],
    "BsHgt_M": [1.84, 1.68, 1.6, 1.59],
    "BsWgt_Kg": [113, 113.9, 86.4, 99.2],
    "GOAT-MBOAT4_F_BM": [-1.623634, -1.073523, -0.270883, -0.218071],
    "TCF7L2_M_BM": [0.321379, -0.957523, -0.004106, 0.568458],
    "UCP2_M_BM": [0.199183, 0.549469, 1.479865, -0.398410]
})
df2 = pd.DataFrame({
    "Age": [29, 30, 44, 30],
    "BsHgt_M": [1.94, 1.61, 1.3, 1.19],
    "BsWgt_Kg": [123, 113.9, 56.4, 91.2],
    "GOAT-MBOAT4_F_BM": [-1.623676, -1.073523, -0.270883, -0.218071],
    "TCF7L2_M_BM": [0.321379, -0.957523, -0.004106, 0.549469],
    "UCP2_M_BM": [0.199183, 0.5499, 1.479865, -0.398410]
})

# Tag each frame so the rows can be told apart after concatenation
df1["ID"] = "df1"
df2["ID"] = "df2"
df = pd.concat([df1, df2]).reset_index(drop=True)

# Every column except the trailing ID column; both frames must share them
cols = df1.columns[:-1]
assert (cols == df2.columns[:-1]).all()

fig, ax = plt.subplots(len(cols), figsize=(6, 14), sharex=False)
for i, col in enumerate(cols):
    sns.histplot(data=df, x=col, hue="ID", ax=ax[i])
    if i > 0:
        # Keep the legend only on the first subplot
        ax[i].legend([], frameon=False)
    ax[i].set_ylabel(col)

sns.move_legend(ax[0], "upper left", bbox_to_anchor=(1, 1))
ax[-1].set_xlabel("")
plt.show()
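This code plots histograms for all columns.
For two columns, it looks something like this:
The style and layout can easily be adjusted if needed. This is only one example of a possible solution to your problem and should be taken as a guide.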
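If seaborn is not available, a similar overlay can be sketched with matplotlib alone. This is not the approach given above, only a hedged alternative: the helper name plot_overlaid_histograms and the choice of bins=5 are assumptions made for illustration. Computing the bin edges from the combined data gives both histograms in a subplot identical bins, so they sit on the same scale.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_overlaid_histograms(frames, bins=5):
    """Overlay one histogram per DataFrame for every shared column.

    frames: dict mapping a legend label to a DataFrame; all frames are
    assumed to have the same columns. (Hypothetical helper, for illustration.)
    """
    cols = list(next(iter(frames.values())).columns)
    fig, axs = plt.subplots(len(cols), figsize=(6, 3 * len(cols)), squeeze=False)
    for ax, col in zip(axs[:, 0], cols):
        # Shared bin edges computed from all frames together
        combined = np.concatenate([f[col].to_numpy() for f in frames.values()])
        edges = np.histogram_bin_edges(combined, bins=bins)
        for label, frame in frames.items():
            ax.hist(frame[col], bins=edges, alpha=0.5, label=label)
        ax.set_title(col)
        ax.legend()
    fig.tight_layout()
    return fig, axs[:, 0]


# Two columns from the question's data, to keep the example short
df1 = pd.DataFrame({"Age": [23, 23, 24, 20], "BsHgt_M": [1.84, 1.68, 1.60, 1.59]})
df2 = pd.DataFrame({"Age": [29, 30, 44, 30], "BsHgt_M": [1.94, 1.61, 1.30, 1.19]})

fig, axes = plot_overlaid_histograms({"df1": df1, "df2": df2})

if __name__ == "__main__":
    plt.show()
```

The alpha=0.5 setting keeps both histograms visible where they overlap. Passing the same edges to every ax.hist call is what keeps the bars comparable across the two DataFrames.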