I have a dataset that contains video game platforms and the year each game was released.
What I'm trying to do is end up with a dataframe that has the count of titles released per year for each platform.
My initial dataframe looks like this:
platform year
0 Wii 2006.0
1 NES 1985.0
2 Wii 2008.0
3 Wii 2009.0
4 GB 1996.0
5 GB 1989.0
6 DS 2006.0
7 Wii 2006.0
8 Wii 2009.0
9 NES 1984.0
10 DS 2005.0
11 DS 2005.0
12 GB 1999.0
13 Wii 2007.0
14 X360 2010.0
15 Wii 2009.0
16 PS3 2013.0
17 PS2 2004.0
18 SNES 1990.0
19 DS 2005.0
I'm using a groupby to get them together:
df = df.sort_values(['year']).groupby(['year'])['platform'].value_counts()
Which gets me close:
year    platform
1980.0  2600         9
1981.0  2600        46
1982.0  2600        36
1983.0  2600        11
        NES          6
1984.0  NES         13
        2600         1
1985.0  NES         11
        2600         1
        DS           1
But this is a Series, and with the year in the index I can't stick it into something like a heatmap.
Here is an example of the desired output:
year platform #_titles
1980 2600 9
1981 2600 46
1982 2600 36
1983 2600 11
1983 NES 6
1984 NES 13
1984 2600 1
1985 NES 11
1985 2600 1
1985 DS 1
1985 PC 1
1986 NES 19
1986 2600 2
1987 NES 10
1987 2600 6
1988 NES 11
1988 2600 2
1988 GB 1
1988 PC 1
1989 GB 10
I was thinking I might need to use pivot_table(), but this is something I am still quite new to and am struggling to implement.
I tried something like:
df = df.pivot_table(df,index='year',columns = 'platform',aggfunc = 'count')
But my output is then just the year.
Clearly I am doing something wrong, and I figure it is time to stop beating my virtual head on the Jupyter notebook and ask for some advice.
I am fine with either getting the original groupby method to work or using a pivot table. I would just appreciate some pointers on what I'm doing wrong so I can correct it.
Thanks for your time in advance,
Jared
Edit: here is the result from the first answer (which would be perfect if it had the aggfunc in it; I'm not sure why that isn't there):
|year|platform|
|----|--------|
|1980.0|2600|
|1981.0|2600|
|1982.0|2600|
|1983.0|2600|
||NES|
|1984.0|2600|
||NES|
Here is the solution with pivot table:
res = pd.pivot_table(df,index=['year', 'platform'],aggfunc = 'size')
>>> print(res)
year    platform
1984.0  NES         1
1985.0  NES         1
1989.0  GB          1
1990.0  SNES        1
1996.0  GB          1
1999.0  GB          1
2004.0  PS2         1
2005.0  DS          3
2006.0  DS          1
        Wii         2
2007.0  Wii         1
2008.0  Wii         1
2009.0  Wii         3
2010.0  X360        1
2013.0  PS3         1
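The result above is still a Series with year and platform in its index. Here is a minimal sketch of one way to flatten it into a dataframe shaped like the desired output, assuming the 20-row sample from the question. The column name `#_titles` comes from the question; the integer cast of `year` is my own addition and may not be wanted.

```python
import pandas as pd

# Sample data copied from the question (20 rows).
df = pd.DataFrame({
    "platform": ["Wii", "NES", "Wii", "Wii", "GB", "GB", "DS", "Wii", "Wii", "NES",
                 "DS", "DS", "GB", "Wii", "X360", "Wii", "PS3", "PS2", "SNES", "DS"],
    "year": [2006.0, 1985.0, 2008.0, 2009.0, 1996.0, 1989.0, 2006.0, 2006.0, 2009.0, 1984.0,
             2005.0, 2005.0, 1999.0, 2007.0, 2010.0, 2009.0, 2013.0, 2004.0, 1990.0, 2005.0],
})

# Series of group sizes, indexed by (year, platform).
res = pd.pivot_table(df, index=["year", "platform"], aggfunc="size")

# Move both index levels into ordinary columns and name the count column.
counts = res.reset_index(name="#_titles")

# Assumption: whole-number years are wanted, as in the desired output.
counts["year"] = counts["year"].astype(int)

print(counts)
```

Passing `name=` to `reset_index` avoids depending on whatever name the Series happens to carry.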
Maybe this is what you want? Hard to tell since your output doesn't match the input.
df.sort_values(['year']).groupby(['year','platform']).size().reset_index(name='#_titles')
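If the end goal is a heatmap, a wide table with one row per year and one column per platform may be easier to plot than the long format. Here is a sketch under the assumption that the data looks like the sample in the question; `wide` is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question (20 rows).
df = pd.DataFrame({
    "platform": ["Wii", "NES", "Wii", "Wii", "GB", "GB", "DS", "Wii", "Wii", "NES",
                 "DS", "DS", "GB", "Wii", "X360", "Wii", "PS3", "PS2", "SNES", "DS"],
    "year": [2006.0, 1985.0, 2008.0, 2009.0, 1996.0, 1989.0, 2006.0, 2006.0, 2009.0, 1984.0,
             2005.0, 2005.0, 1999.0, 2007.0, 2010.0, 2009.0, 2013.0, 2004.0, 1990.0, 2005.0],
})

# Count rows per (year, platform), then spread platforms across columns.
# Year/platform pairs with no titles are filled with 0 instead of NaN.
wide = df.groupby(["year", "platform"]).size().unstack(fill_value=0)

print(wide)
```

A table shaped like this can then be handed to a plotting function such as seaborn's `heatmap`.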