Is the barplot in matplotlib using the mean?
I have a dataset df:
users number
user1 1
user2 34
user3 56
user4 45
user5 4
user1 3
user5 11
user1 3
When making a bar plot like this:
plt.bar(x['users'], x['number'].sort_values(ascending=False), color="blue")
Does it take the mean of every user in the number column during the plot? What if I want the sum of all the numbers in the number column to appear in the bar plot in descending order?
I tried this:
plt.bar(x['users'], x['number'].sum().sort_values(ascending=False), color="blue")
which gives:
AttributeError: 'numpy.float64' object has no attribute 'sort_values'
Code:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'number': [10, 34, 56, 45, 33],
                   'user': ['user1', 'user2', 'user3', 'user4', 'user1']})
#index=['user1','user2','user3','user4','user1'])
plt.bar(df['user'], df['number'], color="blue")
It always shows the biggest value for a user that has several values.
I am not sure if this is what you want, or whether you want to first groupby the values for each user and then plot the totals in descending order.
x = x.sort_values('number',ascending=False)
plt.bar(range(len(x['users'])), x['number'], color="blue")
plt.xticks(range(len(x['users'])), x['users'])
plt.ylabel('Numbers')
Output
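For the per-user totals the question asks about, a minimal sketch is to group, sum, and sort before plotting. It assumes the sample data from the question; the variable name totals is only illustrative. The attempt in the question fails because x['number'].sum() collapses the whole column into a single scalar, which has no sort_values method, whereas summing per group returns a Series that can be sorted.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question
x = pd.DataFrame({
    'users': ['user1', 'user2', 'user3', 'user4',
              'user5', 'user1', 'user5', 'user1'],
    'number': [1, 34, 56, 45, 4, 3, 11, 3],
})

# Sum the numbers per user, then sort the totals in descending order
totals = x.groupby('users')['number'].sum().sort_values(ascending=False)

# One bar per user, tallest first
plt.bar(totals.index, totals.values, color="blue")
plt.ylabel('Sum')
```

Because totals has exactly one entry per user, no bars overlap and the bar order follows the sorted totals.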
If you want to plot the mean of each user, use the following code:
x1 = x.groupby('users').mean().reset_index()
plt.bar(range(len(x1)), x1['number'], color="blue")
plt.xticks(range(len(x1)), x1['users'])
plt.ylabel('Mean')
Output
What if you don't sort or group: all bars are present, but you don't see the different bars for the same x-value because alpha=1 by default. I used alpha=0.2 to highlight my point. Now you see that at user1 there are two bars behind each other.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'number': [10, 34, 56, 45, 51],
                   'user': ['user1', 'user2', 'user3', 'user4', 'user1']})
plt.bar(df['user'], df['number'], color="blue", linewidth=2, edgecolor='black', alpha=0.2)
Output
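To check directly that plt.bar does no aggregation, one can inspect what it returns. This is a minimal sketch assuming the same sample frame as above; the names bars, n_bars, heights, and same_x are only illustrative. plt.bar returns a container with one rectangle per input row, so five rows give five bars even though there are only four distinct users.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'number': [10, 34, 56, 45, 51],
                   'user': ['user1', 'user2', 'user3', 'user4', 'user1']})

bars = plt.bar(df['user'], df['number'], color="blue", alpha=0.2)

# One rectangle per row: nothing is averaged or summed
n_bars = len(bars)
heights = [float(bar.get_height()) for bar in bars]

# The two user1 bars (rows 0 and 4) share the same x position, so they overlap
same_x = bars[0].get_x() == bars[4].get_x()
print(n_bars, heights, same_x)
```

The heights match the raw values in the number column, which is why any mean or sum has to be computed with groupby before calling plt.bar.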