Overlapping density plots of multiple pandas data frame columns

import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(0, 1, (1000, ))
col3 = np.random.normal(0, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})
  • Plot each column as a continuous line
  • Plot all 3 columns on the same axis
  • Use different colored lines (no fill)

Thanks in advance!

I understood your question! Here's how I would do it in matplotlib.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(1, 1, (1000, ))
col3 = np.random.normal(-1, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})

df['col1_bins'] = pd.cut(df['col1'], bins=np.arange(-10, 11, 0.5))
df['col2_bins'] = pd.cut(df['col2'], bins=np.arange(-10, 11, 0.5))
df['col3_bins'] = pd.cut(df['col3'], bins=np.arange(-10, 11, 0.5))

# Count the points that fall into each bin; observed=False keeps empty bins,
# so all three curves share the same x positions.
col1_counts = df.groupby('col1_bins', observed=False)['col1'].count().reset_index()
col2_counts = df.groupby('col2_bins', observed=False)['col2'].count().reset_index()
col3_counts = df.groupby('col3_bins', observed=False)['col3'].count().reset_index()

# Plot the per-bin counts of each column as a colored line on the same axes.
plt.plot(col1_counts['col1_bins'].astype(str), col1_counts['col1'], 'r')
plt.plot(col2_counts['col2_bins'].astype(str), col2_counts['col2'], 'b')
plt.plot(col3_counts['col3_bins'].astype(str), col3_counts['col3'], 'g')
plt.show()

Basically, you have to bin your data points before you can plot them.
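
If a density rather than a raw count per bin is wanted, the same binning idea can be sketched with `np.histogram(..., density=True)`, which scales each curve so that its area is 1. This is only a sketch under assumptions: the helper names `binned_density` and `plot_density_lines`, the fixed random seed, and the reuse of the 0.5-wide bins from the code above are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd


def binned_density(df, bins):
    """Return a DataFrame of histogram densities, one column per input column.

    The index holds the bin centers, so every column shares the same x values.
    """
    centers = (bins[:-1] + bins[1:]) / 2
    densities = {}
    for name in df.columns:
        # density=True normalizes the counts so the area under the curve is 1.
        values, _ = np.histogram(df[name], bins=bins, density=True)
        densities[name] = values
    return pd.DataFrame(densities, index=centers)


def plot_density_lines(density):
    """Draw one unfilled line per column on a single set of axes."""
    import matplotlib.pyplot as plt  # imported here so the numeric part runs without it

    fig, ax = plt.subplots()
    for name in density.columns:
        ax.plot(density.index, density[name], label=name)  # default color cycle
    ax.set_xlabel('value')
    ax.set_ylabel('density')
    ax.legend()
    return ax


# Sample data shaped like the example above (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'col1': rng.normal(0, 1, 1000),
    'col2': rng.normal(1, 1, 1000),
    'col3': rng.normal(-1, 1, 1000),
})

bins = np.arange(-10, 11, 0.5)
density = binned_density(df, bins)
```

Calling `plot_density_lines(density)` and then `plt.show()` should draw the three curves on one axis. For a smoothed estimate instead of a binned one, pandas also provides `df.plot.kde()`, which requires SciPy to be installed.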
