
Transform a character column into multiple columns, one per category, and report occurrence counts instead of the character values [R]

I have the following data frame:

df1 <- structure(list(name = c("ene", "due", "rabe", "rabe", "kum", 
"kum", "kum", "rike", "smake"), type = c("a", "b", "d", "a", 
"c", "c", "b", "d", "a")), class = "data.frame", row.names = c(NA, 
-9L))

I would like to transform it into the following data frame:

df2 <- structure(list(name = c("ene", "due", "rabe", "kum", "rike", 
"smake"), type_a = c(1, 0, 1, 0, 0, 1), type_b = c(0, 1, 0, 1, 
0, 0), type_c = c(0, 0, 0, 2, 0, 0), type_d = c(0, 0, 1, 0, 1, 
0)), class = "data.frame", row.names = c(NA, -6L))

Basically, I want to split the "type" column into as many columns as there are categories in the original column. Also, instead of the character values, I would like to count the occurrences of each category per name.

How can I do this in R?

EDIT: I tried to do this with spread() from tidyr, but it throws an error because the combinations of keys are not unique.
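
The non-unique keys can be located with a small base R check. This is only a sketch that assumes the error comes from repeated name/type pairs in df1 (the exact spread() call is not shown above); the variable name dups is made up for illustration.

```r
# Same data as df1 above, rebuilt so the snippet runs on its own
df1 <- data.frame(
  name = c("ene", "due", "rabe", "rabe", "kum", "kum", "kum", "rike", "smake"),
  type = c("a", "b", "d", "a", "c", "c", "b", "d", "a")
)

# duplicated() flags every row that repeats an earlier name/type pair
dups <- df1[duplicated(df1), ]
print(dups)
```

Any pair listed in dups occurs more than once, so it has to be aggregated (for example counted) before the data can be reshaped to wide format.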

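You can build the contingency table directly with pivot_wider() by combining several of its arguments: values_fn = length counts the rows for each name/type pair, values_fill = 0 fills in the pairs that never occur, and names_prefix together with names_sort produces the sorted type_* column names.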

library(tidyr)

pivot_wider(df1, names_from = type, names_sort = TRUE, names_prefix = 'type_',
            values_from = type, values_fn = length, values_fill = 0)

# # A tibble: 6 × 5
#   name  type_a type_b type_c type_d
#   <chr>  <int>  <int>  <int>  <int>
# 1 ene        1      0      0      0
# 2 due        0      1      0      0
# 3 rabe       1      0      0      1
# 4 kum        0      1      2      0
# 5 rike       0      0      0      1
# 6 smake      1      0      0      0
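
For comparison, the same contingency table can be sketched in base R with table(). This is not part of the answer above, only an assumption-laden cross-check; the names tab, counts and res are made up for illustration, and the factor step is there only to keep the rows in order of first appearance.

```r
# Same data as df1 above, rebuilt so the snippet runs on its own
df1 <- data.frame(
  name = c("ene", "due", "rabe", "rabe", "kum", "kum", "kum", "rike", "smake"),
  type = c("a", "b", "d", "a", "c", "c", "b", "d", "a")
)

# table() orders levels alphabetically; a factor with explicit levels
# keeps the names in order of first appearance instead
name_f <- factor(df1$name, levels = unique(df1$name))

# Cross-tabulate name against type, then convert the table to a data frame
tab <- table(name_f, df1$type)
counts <- as.data.frame.matrix(tab)
names(counts) <- paste0("type_", names(counts))

# Move the row names into a regular "name" column
res <- cbind(name = rownames(counts), counts)
rownames(res) <- NULL
print(res)
```

The counts come back as integers, whereas df2 stores doubles, so a strict identical() comparison against df2 would need a type conversion first.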
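
Alternatively, count the name/type pairs with dplyr first and then pivot the counts to wide format:

library(dplyr)
library(tidyr)

df1 %>%
  # one row per name/type pair, with the number of occurrences in column n
  count(name, type) %>%
  # names_sort = TRUE orders the new columns a, b, c, d
  pivot_wider(names_from = type, values_from = n, values_fill = 0,
              names_sort = TRUE) %>%
  # prefix every column except name, which must keep its original label
  rename_with(~ paste0("type_", .x), -name)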
