I have a data frame like this:
Htno        Subname  marks  credits
15mq1a0501  abc      43     3
15mq1a0501  xyz      55     6
15mq1a0502  abc      56     3
15mq1a0502  xyz      60     6
15mq1a0503  abc      10     0
15mq1a0503  xyz      56     6
Now I need to convert it into a data frame like this:
Htno        abc  xyz  Totalmarks  Totalcredits
15mq1a0501  43   55   98          9
15mq1a0502  56   60   116         9
15mq1a0503  10   56   66          6
I tried the dplyr package, but I was not able to do it.
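For anyone who wants to run the answers, the sample data can be recreated roughly as follows. This is only a sketch: the question shows printed output rather than code, so the column types (character for Htno and Subname, numeric for marks and credits) are an assumption.

```r
# Example input rebuilt from the table in the question.
# Column types are assumed; the original may have used factors.
df <- data.frame(
  Htno    = c("15mq1a0501", "15mq1a0501", "15mq1a0502",
              "15mq1a0502", "15mq1a0503", "15mq1a0503"),
  Subname = c("abc", "xyz", "abc", "xyz", "abc", "xyz"),
  marks   = c(43, 55, 56, 60, 10, 56),
  credits = c(3, 6, 3, 6, 0, 6),
  stringsAsFactors = FALSE
)

str(df)
```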
You could use the following:
library(tidyverse)

df %>%
  spread(Subname, marks) %>%          # one column per subject
  group_by(Htno) %>%                  # column is Htno, as in the question
  summarise(abc = max(abc, na.rm = TRUE),
            xyz = max(xyz, na.rm = TRUE),
            Totalcredits = sum(credits)) %>%
  mutate(Totalmarks = abc + xyz)
The result would be:
        Htno   abc   xyz Totalcredits Totalmarks
      <fctr> <dbl> <dbl>        <dbl>      <dbl>
1 15mq1a0501    43    55            9         98
2 15mq1a0502    56    60            9        116
3 15mq1a0503    10    56            6         66
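It may help to look at the intermediate result to see why the max() calls with na.rm are needed. Because credits stays in the data while spreading, each Htno still has one row per credits value, with NA in the subject column that does not apply. The snippet below is a sketch on sample data rebuilt from the question (column types assumed).

```r
library(tidyr)

# Sample data rebuilt from the question; column types are assumed.
df <- data.frame(
  Htno    = c("15mq1a0501", "15mq1a0501", "15mq1a0502",
              "15mq1a0502", "15mq1a0503", "15mq1a0503"),
  Subname = c("abc", "xyz", "abc", "xyz", "abc", "xyz"),
  marks   = c(43, 55, 56, 60, 10, 56),
  credits = c(3, 6, 3, 6, 0, 6),
  stringsAsFactors = FALSE
)

# credits is still a column, so rows are identified by Htno AND credits:
# every student keeps two rows, each with an NA in one subject column.
wide <- spread(df, Subname, marks)
print(wide)
```

Taking max(..., na.rm = TRUE) per Htno then collapses the two half-filled rows into one.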
Here is an alternative that uses only dplyr functions. Note that it becomes tedious when Subname has many distinct values, because each subject needs its own line in summarize. Perhaps others can offer a more general solution.
library(magrittr)
library(dplyr)

df %>%
  group_by(Htno) %>%
  summarize(abc = marks[Subname == "abc"],
            xyz = marks[Subname == "xyz"],
            Totalmarks = sum(marks),
            Totalcredits = sum(credits))
Edit: The generalisation below works, but it is more complicated and needs tidyr::spread.
library(magrittr)
library(dplyr)
library(tidyr)

# Per-student totals, computed on the long data
df_1 <- df %>%
  group_by(Htno) %>%
  summarize(Totalmarks = sum(marks),
            Totalcredits = sum(credits))

# With credits dropped, spread() already returns one row per Htno
df_2 <- df %>%
  select(-credits) %>%
  spread(Subname, marks)

# left_join() takes no `all` argument (that belongs to base merge())
left_join(df_2, df_1, by = "Htno")
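On tidyr 1.0.0 or later, spread() is superseded by pivot_wider(), so the same general idea could be written as in the sketch below. It assumes the sample data from the question (column types assumed) and that each Htno/Subname pair occurs only once; the names totals, wide and result are just illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question; column types are assumed.
df <- data.frame(
  Htno    = c("15mq1a0501", "15mq1a0501", "15mq1a0502",
              "15mq1a0502", "15mq1a0503", "15mq1a0503"),
  Subname = c("abc", "xyz", "abc", "xyz", "abc", "xyz"),
  marks   = c(43, 55, 56, 60, 10, 56),
  credits = c(3, 6, 3, 6, 0, 6),
  stringsAsFactors = FALSE
)

# Per-student totals, computed while the data is still long.
totals <- df %>%
  group_by(Htno) %>%
  summarise(Totalmarks = sum(marks),
            Totalcredits = sum(credits))

# One column per subject, however many subjects there are.
wide <- df %>%
  select(-credits) %>%
  pivot_wider(names_from = Subname, values_from = marks)

# Attach the totals to the wide table.
result <- left_join(wide, totals, by = "Htno")
print(result)
```

Nothing here names abc or xyz explicitly, so new subjects would simply show up as extra columns.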