Pandas - plot cumulative proportion of column
I have a dataframe column whose values are 0, 1, or 2. I want to plot the relative proportions over time in a stacked bar chart. E.g. if the values are:
0 1 2 2 0 0 1
Then the cumulative percentage of 0s would be (rounded to the nearest whole number):
100 50 33 25 40 50 43
And the cumulative percentage of 1s would be (again rounded to the nearest whole number):
0 50 33 25 20 17 29
I would want the 0, 1 and 2 proportions to all be stacked in a single bar per time step, showing how the relative proportions change over time.
Okay, first I need to make the obligatory complaint that you did not provide any attempts you have made so far, shame on you ;).
Nevertheless, let us help you. First, one should break this task into small steps. We need to:
1. Create an indicator column for each value
2. Take the cumulative sum of each of these columns
3. Divide it by the respective row number (+1, since indexing starts at 0)
4. Plot this beautiful thing
My attempt would be the following (not beautiful, just brute-force coding):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create example data: 10 random values drawn from 0, 1, 2
# (the upper bound of randint is exclusive)
df = pd.DataFrame(np.random.randint(0, 3, 10), columns=['A'])

# The function to do it all in one go
def create_rolling_stack(df, column):
    # Create the indicators, also called one-hot encoding or dummy encoding
    dum = pd.get_dummies(df[column])
    # Build the cumulative sum of each indicator column
    cums = dum.cumsum()
    # Reset the index so that it runs 0, 1, 2, ...
    cums = cums.reset_index(drop=True)
    # Create the divisor: the 1-based row number
    cums['div'] = cums.index.values + 1
    # Ugly, but divide each value column by the respective row number
    for col in dum.columns:
        cums[col] = cums[col] / cums['div']
    cums = cums.drop('div', axis=1)
    # Plot this awesome thing; note that stacked is set to True
    cums.plot(kind='bar', stacked=True)
    plt.show()

create_rolling_stack(df, 'A')
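If you would rather check the numbers before plotting, the same four steps can be sketched without the explicit loop. This is only one possible variant, not the one true way: the function name `cumulative_proportions` is made up for illustration, and it is run here on the example values from the question.

```python
import numpy as np
import pandas as pd

def cumulative_proportions(values):
    """Return the running share of each distinct value, one row per observation."""
    # Hypothetical helper name, used only for this illustration
    s = pd.Series(values).reset_index(drop=True)
    # Step 1: one indicator column per distinct value (as floats)
    indicators = pd.get_dummies(s, dtype=float)
    # Step 2: running count of each value
    counts = indicators.cumsum()
    # Step 3: divide every row by its 1-based row number
    row_number = pd.Series(np.arange(1, len(s) + 1), index=counts.index)
    return counts.div(row_number, axis=0)

# Example values taken from the question
props = cumulative_proportions([0, 1, 2, 2, 0, 0, 1])
print((props * 100).round(1))

# Step 4 (needs matplotlib): props.plot(kind='bar', stacked=True)
```

Each row of `props` sums to 1, so the stacked bars all have the same height and only the split between 0, 1 and 2 changes from bar to bar.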
Hope it helps.