How to create a new variable that shows different combinations of 4 dummy variables?
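I have 4 dummy variables that take the value 0 or 1, indicating whether a given technology was adopted. The data frame has more than 14,000 rows.
I want to go through these 4 columns and store each combination of columns that are == 1 in a new variable.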
structure(list(tech1 = structure(c(2L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), tech2 = structure(c(2L, 2L, 2L, 2L), .Label = c("0", "1"), class = "factor"), tech3 = structure(c(1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), tech4 = structure(c(1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor")), row.names = c(NA, 4L), class = "data.frame")
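Since different combinations are possible, the new variable should say, for each row, which of the 4 technologies were adopted.
Here is what the first four rows of the new variable should look like (where "12" means technologies 1 and 2 were adopted, and so on):
Variable "Tech":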
structure(list(Tech = structure(c(1L, 2L, 3L, 2L), .Label = c("12", "2", "234"), class = "factor")), row.names = c(NA, 4L), class = "data.frame")
I have seen some functions that might work (e.g. aggregate), but so far I have not found a solution.
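Without knowing exactly what end state you want: with the apply function you can generate, for each row, the list of columns that contain a 1, and, for each column, the list of rows that contain a 1.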
m <- matrix(sample(0:1, 100, replace = TRUE), ncol = 4)
rows <- apply(m, 1, function(x) which(x == 1))
cols <- apply(m, 2, function(x) which(x == 1))
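One thing to watch with the snippet above: apply returns a list only when the rows contain different numbers of 1s. If every row happens to contain the same number, the result is simplified to a vector or matrix. The following is a minimal sketch, not the answerer's code, that always yields a list and then collapses each element into a label. The matrix `m` here is small hypothetical data chosen for illustration.

```r
# Hypothetical 0/1 data: 3 rows, 4 technology columns
m <- matrix(c(1, 1, 0, 0,
              0, 1, 0, 0,
              0, 1, 1, 1), ncol = 4, byrow = TRUE)

# lapply over row indices always returns a list, one element per row,
# holding the column positions whose value is 1
rows <- lapply(seq_len(nrow(m)), function(i) which(m[i, ] == 1))

# Collapse each set of positions into a single string such as "12"
labels <- vapply(rows, function(idx) paste(idx, collapse = ""), character(1))
labels
```

A row with no 1s gives an empty position vector and therefore an empty string as its label.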
Following on from SteveM's answer:
data.frame(tech=apply(df, 1, function(x) paste(which(x==1), collapse="")))
tech
#1 12
#2 2
#3 234
#4 2
Or a tidyverse approach:
library(dplyr)
library(tidyr)
df %>%
  mutate(id = row_number()) %>%
  pivot_longer(tech1:tech4) %>%
  filter(value == 1) %>%
  group_by(id) %>%
  summarise(Tech = paste(gsub("tech", "", name), collapse = ""))
# A tibble: 4 x 2
# id Tech
# <int> <chr>
#1 1 12
#2 2 2
#3 3 234
#4 4 2
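Note that the pipeline above drops any row in which no technology was adopted, because filter(value == 1) removes all of that row's entries before the summary. As a hedged alternative, here is a base R sketch that works column by column instead of row by row and keeps such rows with an empty string. The example data frame, including its fifth all-zero row, and the name `tech_cols` are assumptions made for illustration.

```r
# Hypothetical data: the question's four rows plus one row with nothing adopted
df <- data.frame(
  tech1 = c(1, 0, 0, 0, 0),
  tech2 = c(1, 1, 1, 1, 0),
  tech3 = c(0, 0, 1, 0, 0),
  tech4 = c(0, 0, 1, 0, 0)
)
tech_cols <- paste0("tech", 1:4)

# For each column, emit its number where the flag is 1 and "" elsewhere.
# Comparing against "1" also works if the columns are factors, as in the question.
parts <- lapply(seq_along(tech_cols), function(j) {
  ifelse(df[[tech_cols[j]]] == "1", as.character(j), "")
})

# paste0 is vectorized, so this concatenates the four pieces row by row
df$Tech <- do.call(paste0, parts)
df$Tech
```

Because the work is done once per column rather than once per row, this should scale comfortably to the 14,000 rows mentioned in the question.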
library(tidyverse)
(df <- tribble(
  ~dum1, ~dum2, ~dum3, ~dum4, ~value,
  FALSE, TRUE,  FALSE, TRUE,  12,
  TRUE,  TRUE,  FALSE, FALSE, 20,
  FALSE, TRUE,  FALSE, TRUE,  32,
  TRUE,  FALSE, TRUE,  FALSE, 27
))
# Replace each TRUE flag with its technology number, then concatenate per row
df %>%
  mutate(dum1 = ifelse(dum1, "1", ""),
         dum2 = ifelse(dum2, "2", ""),
         dum3 = ifelse(dum3, "3", ""),
         dum4 = ifelse(dum4, "4", ""),
         which_tech = paste0(dum1, dum2, dum3, dum4))
# A tibble: 4 x 6
dum1 dum2 dum3 dum4 value which_tech
<chr> <chr> <chr> <chr> <dbl> <chr>
1 "" "2" "" "4" 12 24
2 "1" "2" "" "" 20 12
3 "" "2" "" "4" 32 24
4 "1" "" "3" "" 27 13