I have the following data frame in Pandas. The idea is to generate an additional data frame with one row per ID that holds the proportion of each value of the variable TYPE, with the TYPE values transposed into columns. Any help is appreciated!
import pandas as pd

d = {'ID': [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2], 'TYPE': ['A','A','A','B','B','B','B','C','C','C','A','A','B','B','B','B','B','B']}
df = pd.DataFrame(data=d)
df
Desired output:
ID A B C
1 0.30 0.40 0.3
2 0.25 0.75 0.0
Use SeriesGroupBy.value_counts with parameter normalize=True and reshape by Series.unstack:
df = df.groupby('ID')['TYPE'].value_counts(normalize=True).unstack(fill_value=0)
print(df)
TYPE A B C
ID
1 0.30 0.40 0.3
2 0.25 0.75 0.0
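To make the chained call above easier to follow, here is a minimal sketch that splits it into its two steps. The variable names `props` and `wide` are only illustrative and are not part of the original answer.

```python
import pandas as pd

d = {'ID': [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
     'TYPE': ['A','A','A','B','B','B','B','C','C','C','A','A','B','B','B','B','B','B']}
df = pd.DataFrame(data=d)

# Step 1: relative frequency of each TYPE within each ID.
# The result is a Series indexed by (ID, TYPE) pairs.
props = df.groupby('ID')['TYPE'].value_counts(normalize=True)

# Step 2: pivot the TYPE level of the index into columns.
# ID 2 has no 'C' rows, so fill_value=0 puts 0 there instead of NaN.
wide = props.unstack(fill_value=0)
print(wide)
```

Keeping the intermediate Series around can be handy for checking that the proportions within each ID sum to 1 before reshaping.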
Then, if necessary, convert the index to a column:
df = df.rename_axis(None, axis=1).reset_index()
print(df)
ID A B C
0 1 0.30 0.40 0.3
1 2 0.25 0.75 0.0
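As a possible alternative (not the approach used in the answer above), pandas.crosstab with normalize='index' should give the same row-wise proportions in a single call. This is a sketch under the assumption that the input is the question's original `df`; the name `out` is illustrative.

```python
import pandas as pd

d = {'ID': [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
     'TYPE': ['A','A','A','B','B','B','B','C','C','C','A','A','B','B','B','B','B','B']}
df = pd.DataFrame(data=d)

# Count ID/TYPE combinations, then divide each row by its row total.
# Combinations that never occur (ID 2, TYPE 'C') come out as 0.
out = pd.crosstab(df['ID'], df['TYPE'], normalize='index')

# Same cleanup as above: drop the columns name and turn ID into a column.
out = out.rename_axis(None, axis=1).reset_index()
print(out)
```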