
Summarize and Transpose rows to columns in R

This is my input data:

Program = c("A","A","A","B","B","C")
Age = c(10,30,30,12,32,53)
Gender = c("F","F","M","M","M","F")
Language = c("Eng","Eng","Kor","Kor","Other","Other")
df = data.frame(Program,Age,Gender,Language)

I would like to output a table like this:

Program MEAN AGE ENG KOR FEMALE MALE
A
B
C

Here MEAN AGE is the average age per program, and ENG, KOR, FEMALE, and MALE are counts.

I have tried using dplyr and t(), but I am completely lost as to what the steps should be (this is my first post and I am new to this). Thank you in advance!

You can take the following approach:

library(dplyr)

df %>%
  group_by(Program) %>%
  summarise(
    `Mean Age` = mean(Age),
    ENG = sum(Language=="Eng"),
    KOR = sum(Language=="Kor"),
    Female = sum(Gender=="F"),
    Male = sum(Gender=="M"),
    .groups="drop"
  )

Output:

# A tibble: 3 x 6
  Program `Mean Age`   ENG   KOR Female  Male
  <chr>        <dbl> <int> <int>  <int> <int>
1 A             23.3     2     1      2     1
2 B             22       0     1      0     2
3 C             53       0     0      1     0

Note: .groups is an argument of summarise(), not a column. Setting .groups = "drop" returns an ungrouped result, which is equivalent to adding %>% ungroup() after the calculation. Any other named argument passed to summarise() is treated as the name of a summary column to create.
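To make the note about .groups concrete, here is a minimal sketch that groups by two columns so the difference is visible. The object names (by_default, with_drop, with_ungroup) are made up for illustration, and the data is the question's data without the Language column.

```r
library(dplyr)

df <- data.frame(
  Program = c("A", "A", "A", "B", "B", "C"),
  Age     = c(10, 30, 30, 12, 32, 53),
  Gender  = c("F", "F", "M", "M", "M", "F")
)

# Default behaviour: only the last grouping variable (Gender) is dropped,
# so the result is still grouped by Program.
by_default <- df %>%
  group_by(Program, Gender) %>%
  summarise(`Mean Age` = mean(Age))

# .groups = "drop" removes all grouping in the same call.
with_drop <- df %>%
  group_by(Program, Gender) %>%
  summarise(`Mean Age` = mean(Age), .groups = "drop")

# Calling ungroup() afterwards also leaves an ungrouped result.
with_ungroup <- df %>%
  group_by(Program, Gender) %>%
  summarise(`Mean Age` = mean(Age)) %>%
  ungroup()

group_vars(by_default)        # "Program"
is_grouped_df(with_drop)      # FALSE
is_grouped_df(with_ungroup)   # FALSE
```

With a single grouping variable, as in the answer above, the default already drops that one group, so .groups = "drop" mainly makes the intent explicit.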

In base R you could do:

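# Stack Gender and Language into a single column of values;
# cbind() recycles Program and Age to match the stacked rows
df1 <- cbind(df[1:2], stack(df[3:4])[-2])
# Mean age per program, side by side with per-program counts of every value
cbind(aggregate(Age~Program, df, mean),as.data.frame.matrix(table(df1[-2])))

Output:

  Program      Age Eng F Kor M Other
A       A 23.33333   2 2   1 1     0
B       B 22.00000   0 0   1 2     1
C       C 53.00000   0 1   0 0     1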
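If the stack() step feels opaque, a variant that tabulates each column separately may be easier to follow. This is only a sketch of the same idea, not the answerer's code: the intermediate names (mean_age, lang_counts, gender_counts, result) are made up here, and it relies on aggregate() and table() both ordering the programs alphabetically so the rows line up.

```r
df <- data.frame(
  Program  = c("A", "A", "A", "B", "B", "C"),
  Age      = c(10, 30, 30, 12, 32, 53),
  Gender   = c("F", "F", "M", "M", "M", "F"),
  Language = c("Eng", "Eng", "Kor", "Kor", "Other", "Other")
)

# Mean age per program (one row per program, sorted by Program)
mean_age <- aggregate(Age ~ Program, df, mean)

# One count table per categorical column: rows are programs, columns are values
lang_counts   <- as.data.frame.matrix(table(df$Program, df$Language))
gender_counts <- as.data.frame.matrix(table(df$Program, df$Gender))

# Bind the pieces column-wise and give the mean column a descriptive name
result <- cbind(mean_age, lang_counts, gender_counts)
names(result)[names(result) == "Age"] <- "Mean Age"
result
```

Keeping the language and gender counts in separate tables groups the columns by variable (Eng, Kor, Other, then F, M) instead of interleaving them alphabetically.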

