R factor levels as column names and count values

I want the factor levels that occur across several variables to become column names, with the number of occurrences per PatID as the values. This is what I have:

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

What I want is the following:

PatID  I250  M206  X560
    1     2     1     0
    2     2     1     1
    3     1     2     2

Can anyone help? I tried dcast and a few other approaches, but never got this result.

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

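library(tidyverse)
data_sample %>%
  # Stack status1..status3 into one long column of codes
  gather(status_num, value, -PatID) %>%
  # Drop the placeholder strings "NA" and "."
  filter(value != "NA", value != ".") %>%
  # Count each code per patient
  count(PatID, value) %>%  # Improvement by @antoniosk
  # One column per code; combinations that never occur become 0
  spread(value, n, fill = 0)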

# Result (exact print format depends on the dplyr/tidyr version):
  PatID I250 M206 X560
1     1    2    1    0
2     2    2    1    1
3     3    1    2    2
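
gather() and spread() still work, but they are superseded in tidyr 1.0.0 and later. Below is a minimal sketch of the same pipeline written with pivot_longer() and pivot_wider(). It assumes tidyr >= 1.1.0, where values_fill accepts a single value. The names counts and code are my own choices for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

data_sample <- data.frame(
  PatID   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

counts <- data_sample %>%
  # Stack the three status columns into a single column named `code`
  pivot_longer(-PatID, names_to = "status_num", values_to = "code") %>%
  # Drop the placeholder strings "NA" and "." (they are text, not real NA)
  filter(!code %in% c("NA", ".")) %>%
  # Count each code per patient; the count column is called n
  count(PatID, code) %>%
  # One column per code; patient/code pairs that never occur become 0
  pivot_wider(names_from = code, values_from = n, values_fill = 0L)

counts
```

Note that the "NA" entries in the sample data are character strings, which is why they are filtered by value rather than with is.na(). If the real data contains true missing values, adding values_drop_na = TRUE to pivot_longer() should remove them at the reshaping step.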

