I have a CSV file similar to the one below:
Name - Year - Genre - Sales
1 - 2005 - Action - 1
2 - 2005 - Action - 2
3 - 2005 - Shooter - 3
4 - 2006 - RPG - 2
5 - 2006 - RPG - 2
6 - 2007 - Action - 1
7 - 2007 - Shooter - 3
8 - 2007 - RPG - 2
...
My end goal is to make a sand chart in R that shows the total sales of each genre on the y axis and the year on the x axis, with the genres as the labels.
I need to sum the sales of each genre per year. For example, the 2005 sales would be Action: 3, Shooter: 3, RPG: 0, and the same would be done for every other year.
This would eventually give me a data frame that looks like this:
     Action Shooter RPG
2005      3       3   0
2006      0       0   4
2007      1       3   2
In Python, I could do this using enumerate, but I'm having a hard time figuring this out in R.
Here's what I have so far:
vg <- read.csv("vgdata.csv")
genres <- unique(vg$Genre)
years <- sort(unique(vg$Year))
genredf <- data.frame(vg$Genre)
i <- 0
for (year in unique(vg$Year)) {
  yeardata <- rep(0, length(genres))
}
This would give me the data frame with 0s in it. Now I'm trying to add in the summation of the data so I can chart it.
Sorry for the poor formatting. I'm still new to Stack Overflow.
We could use xtabs, where df1 is the data frame holding the data:
xtabs(Sales ~ Year + Genre, df1)
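To see what this call returns on the rows shown in the question, here is a minimal sketch. It assumes df1 holds exactly those eight rows; the names tab and wide are only illustrative, and the conversion with as.data.frame.matrix is an extra step that is not part of the answer above.

```r
# Sample rows copied from the question; df1 is the name used in the answer above
df1 <- data.frame(
  Name  = 1:8,
  Year  = c(2005, 2005, 2005, 2006, 2006, 2007, 2007, 2007),
  Genre = c("Action", "Action", "Shooter", "RPG", "RPG", "Action", "Shooter", "RPG"),
  Sales = c(1, 2, 3, 2, 2, 1, 3, 2)
)

# Sum Sales for every Year/Genre pair; pairs with no rows come out as 0
tab <- xtabs(Sales ~ Year + Genre, data = df1)

# Optional: turn the table into a plain data frame with the years as row names
wide <- as.data.frame.matrix(tab)
print(wide)
```

Because xtabs fills missing Year/Genre combinations with 0, no separate zero-filled data frame has to be built first. The genre columns come out in alphabetical order (Action, RPG, Shooter).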
Here is a base R solution using reshape + aggregate (though it does not seem as simple as the xtabs approach by @akrun):
dfout <- reshape(aggregate(Sales ~ Year + Genre, df, sum),
                 direction = "wide",
                 idvar = "Year",
                 timevar = "Genre")
which gives
> dfout
  Year Sales.Action Sales.RPG Sales.Shooter
1 2005            3        NA             3
2 2007            1         2             3
3 2006           NA         4            NA
DATA
df <- structure(list(Name = 1:8, Year = c(2005L, 2005L, 2005L, 2006L,
2006L, 2007L, 2007L, 2007L), Genre = c("Action", "Action", "Shooter",
"RPG", "RPG", "Action", "Shooter", "RPG"), Sales = c(1L, 2L,
3L, 2L, 2L, 1L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-8L))
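The reshape + aggregate result leaves NA where a genre had no sales in a year, and the rows are not in year order, whereas the question asks for zeros and one row per year. A possible follow-up, sketched here as an assumption and not as part of the original answer, is to fill the NA cells with 0, sort by Year, and drop the "Sales." prefix:

```r
# Same values as the DATA section above, rebuilt so this block runs on its own
df <- data.frame(
  Name  = 1:8,
  Year  = c(2005L, 2005L, 2005L, 2006L, 2006L, 2007L, 2007L, 2007L),
  Genre = c("Action", "Action", "Shooter", "RPG", "RPG", "Action", "Shooter", "RPG"),
  Sales = c(1L, 2L, 3L, 2L, 2L, 1L, 3L, 2L)
)

# Sum per Year/Genre, then spread the genres into columns
dfout <- reshape(aggregate(Sales ~ Year + Genre, df, sum),
                 direction = "wide",
                 idvar = "Year",
                 timevar = "Genre")

dfout[is.na(dfout)] <- 0                          # missing combinations mean no sales
dfout <- dfout[order(dfout$Year), ]               # put the years in ascending order
names(dfout) <- sub("^Sales\\.", "", names(dfout)) # "Sales.Action" becomes "Action"
rownames(dfout) <- NULL                           # reset row names after sorting
print(dfout)
```

After these steps the table has one row per year with the columns Year, Action, RPG and Shooter, which is close to the layout requested in the question.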