How to create a Pandas dataframe from another column in a dataframe by splitting it?

I have the following source dataframe:

Person  Country  Is Rich?
0       US       Yes
1       India    No
2       India    Yes
3       US       Yes
4       US       Yes
5       India    No
6       US       No
7       India    No

I need to convert it into another dataframe, so that the data is easy to access when plotting a bar graph like the one below:

[Figure: bar chart of economic status per country]

The dataframe to be created looks like this:

Country  Rich  Poor
US       3     1
India    1     3

I am new to Pandas and exploratory data science. Please help.

You can try pivot_table:

df['Is Rich?'] = df['Is Rich?'].replace({'Yes': 'Rich', 'No': 'Poor'})
out = df.pivot_table(index='Country', columns='Is Rich?', values='Person', aggfunc='count')
print(out)

Is Rich?  Poor  Rich
Country
India        3     1
US           1     3
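For reference, here is a self-contained sketch of the same pivot_table approach that can be pasted and run as is. It rebuilds the sample data from the question, leaves the original 'Is Rich?' column untouched by writing the labels to a new column (the name Status is my own choice, not from the answer), and reorders rows and columns to match the layout the question asks for. The fill_value=0 argument is an addition that guards against a country having no rows in one of the groups.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Person": range(8),
    "Country": ["US", "India", "India", "US", "US", "India", "US", "India"],
    "Is Rich?": ["Yes", "No", "Yes", "Yes", "Yes", "No", "No", "No"],
})

# Map Yes/No to the desired column labels in a new column ("Status" is an
# assumed name), so the original dataframe is not modified.
status = df["Is Rich?"].map({"Yes": "Rich", "No": "Poor"})

out = (
    df.assign(Status=status)
      .pivot_table(index="Country", columns="Status", values="Person",
                   aggfunc="count", fill_value=0)
      .reindex(columns=["Rich", "Poor"])  # column order used in the question
      .reindex(["US", "India"])           # row order used in the question
)
out.columns.name = None  # drop the "Status" label above the columns
print(out)
```

Calling out.reset_index() afterwards turns Country back into a regular column if that is more convenient for plotting.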

You could do:

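converted = (
    df.assign(Rich=df['Is Rich?'].eq('Yes'))  # True where the person is rich
      .eval('Poor = ~Rich')                   # Poor is the negation of Rich
      .groupby('Country')
      .agg({'Rich': 'sum', 'Poor': 'sum'})    # summing booleans counts the True values
)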

print(converted)
         Rich  Poor
Country            
India       1     3
US          3     1
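As a possible alternative that neither answer uses, pd.crosstab counts how often each pair of values occurs, which seems to fit this kind of two-column tally. The sketch below is only an illustration under that assumption; it rebuilds the sample data from the question so that it runs on its own.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Person": range(8),
    "Country": ["US", "India", "India", "US", "US", "India", "US", "India"],
    "Is Rich?": ["Yes", "No", "Yes", "Yes", "Yes", "No", "No", "No"],
})

# Count the rows for every (Country, Is Rich?) combination.
counts = pd.crosstab(df["Country"], df["Is Rich?"])

# Rename Yes/No to the requested labels and put Rich first.
counts = counts.rename(columns={"Yes": "Rich", "No": "Poor"})[["Rich", "Poor"]]
counts.columns.name = None  # drop the "Is Rich?" label above the columns
print(counts)
```

Combinations that never occur show up as 0 here, so no extra fill step should be needed.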

However, if you want to plot it as a bar plot, the following long format might work best with a plotting library like seaborn:

plot_df = converted.reset_index().melt(id_vars='Country', value_name='No. of people', var_name='Status')
print(plot_df)
  Country Status  No. of people
0   India   Rich              1
1      US   Rich              3
2   India   Poor              3
3      US   Poor              1

Then, with seaborn:

import seaborn as sns

sns.barplot(x='Country', hue='Status', y='No. of people', data=plot_df)

[Figure: resulting plot]
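If seaborn is not available, the wide table can probably be plotted directly with pandas' own plotting interface, which draws one group of bars per index entry and one bar per column. This is a sketch rather than part of the answer above: it assumes matplotlib is installed and hard-codes the counts from the converted dataframe so that it runs on its own.

```python
import pandas as pd

# The wide table produced above, written out by hand for a standalone example.
converted = pd.DataFrame(
    {"Rich": [1, 3], "Poor": [3, 1]},
    index=pd.Index(["India", "US"], name="Country"),
)

# One bar group per country, one bar per column; rot=0 keeps the labels horizontal.
ax = converted.plot.bar(rot=0)
ax.set_ylabel("No. of people")
```

In a script, follow this with matplotlib.pyplot.show() or ax.figure.savefig(...) to display or save the chart; in a notebook the figure normally appears automatically.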
