Plotting number of occurrences of column value

Question

I hope the title is accurate enough; I wasn't quite sure how to phrase it.

Anyhow, my problem is that I have a Pandas df which looks like the following:

                              Customer       Source  CustomerSource
0                                Apple            A             141
1                                Apple            B              36
2                            Microsoft            A             143
3                               Oracle            C             225
4                                  Sun            C             151

This df is derived from a larger dataset. The value of CustomerSource is the accumulated count of all occurrences of a given Customer and Source pair. For example, in this case there are 141 occurrences of Apple with Source A, 225 of Oracle with Source C, and so on.

What I want is a stacked bar plot that shows all Customers on the x-axis and the values of CustomerSource stacked on top of each other on the y-axis, similar to the example below. Any hints as to how I would proceed with this?

Answer 1

You can use pivot or unstack to reshape the data and then DataFrame.plot.bar:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.set_index(['Customer','Source'])['CustomerSource'].unstack().plot.bar(stacked=True)
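As a self-contained illustration of the reshape step, here is a minimal sketch that rebuilds the question's sample data and pivots it into one row per Customer and one column per Source. The names `wide` and `plot_stacked` are made up for this example, and filling missing pairs with 0 is an assumption about how absent combinations should be treated.

```python
import pandas as pd

# Sample data copied from the question; CustomerSource is a pre-computed count.
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# Long -> wide: one row per Customer, one column per Source.
wide = df.pivot(index="Customer", columns="Source", values="CustomerSource")

# Customer/Source pairs that never occur come out as NaN.
# Assumption: treat them as a count of 0 so the table stays integer-valued.
wide = wide.fillna(0).astype(int)


def plot_stacked(table):
    """Hypothetical helper: draw one stacked bar per row of the wide table."""
    return table.plot.bar(stacked=True)
```

Calling `plot_stacked(wide)` (with matplotlib installed) should give one bar per Customer, with a coloured segment for each Source.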

Or, if there are duplicate Customer, Source pairs, use pivot_table or groupby with the aggregate function sum:

print (df)
    Customer Source  CustomerSource
0      Apple      A             141 <-same Apple, A
1      Apple      A             200 <-same Apple, A
2      Apple      B              36
3  Microsoft      A             143
4     Oracle      C             225
5        Sun      C             151

pivoted = df.pivot_table(index='Customer',columns='Source',values='CustomerSource', aggfunc='sum')
print (pivoted)
Source         A     B      C
Customer                     
Apple      341.0  36.0    NaN <-141 + 200 = 341
Microsoft  143.0   NaN    NaN
Oracle       NaN   NaN  225.0
Sun          NaN   NaN  151.0


(df.pivot_table(index='Customer',columns='Source',values='CustomerSource', aggfunc='sum')
   .plot.bar(stacked=True))

df.groupby(['Customer','Source'])['CustomerSource'].sum().unstack().plot.bar(stacked=True)
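To make the duplicate case concrete, the following sketch rebuilds the sample with the repeated Apple, A pair and checks that pivot_table and groupby produce the same sums. The variable names and the use of `fill_value=0` are choices made for this example, not part of the original answer.

```python
import pandas as pd

# Sample with a duplicated (Apple, A) pair, as in the printed frame above.
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "A", "B", "A", "C", "C"],
    "CustomerSource": [141, 200, 36, 143, 225, 151],
})

# Plain pivot() cannot reshape duplicated index/column pairs: it raises ValueError.
try:
    df.pivot(index="Customer", columns="Source", values="CustomerSource")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# Both approaches sum the duplicates first (141 + 200 = 341 for Apple, A).
by_pivot_table = df.pivot_table(
    index="Customer", columns="Source", values="CustomerSource",
    aggfunc="sum", fill_value=0,
)
by_groupby = (
    df.groupby(["Customer", "Source"])["CustomerSource"]
      .sum()
      .unstack(fill_value=0)
)

# Either table can then be passed to .plot.bar(stacked=True).
```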

It is also possible to swap the columns:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.pivot(index='Source', columns='Customer', values='CustomerSource').plot.bar(stacked=True)
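A small sketch of what the swap does to the table, using the question's sample data: putting Source in the index gives the transpose of the Customer-indexed table, so the bars are drawn per Source and stacked by Customer. The names `by_customer` and `by_source` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# One bar per Customer, segments per Source.
by_customer = df.pivot(index="Customer", columns="Source", values="CustomerSource")

# One bar per Source, segments per Customer.
by_source = df.pivot(index="Source", columns="Customer", values="CustomerSource")

# by_source holds the same numbers as by_customer.T, so transposing an
# existing wide table is an alternative to pivoting a second time.
transposed = by_customer.T
```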
