Stacking columns by pair in R
I have a dataset that looks like this:
id <- c(1, 2, 3, 4)
name <- c("a", "b", "c", "d")
variable_1a <- c(1, 2, 3, 4)
variable_1b <- c(5, 6, 7, 8)
variable_2a <- c(9, 10, 11, 12)
variable_2b <- c(13, 14, 15, 16)
variable_3a <- c(17, 18, 19, 20)
variable_3b <- c(21, 22, 23, 24)
test <- data.frame(id, name,
variable_1a, variable_1b,
variable_2a, variable_2b,
variable_3a, variable_3b)
id name variable_1a variable_1b variable_2a variable_2b variable_3a variable_3b
1 1 a 1 5 9 13 17 21
2 2 b 2 6 10 14 18 22
3 3 c 3 7 11 15 19 23
4 4 d 4 8 12 16 20 24
I am trying to stack each column pair (1a/1b, 2a/2b, 3a/3b) on top of the others, repeating the id and name columns for each pair. In short, I want a dataset like this:
id <- rep(c(1, 2, 3, 4), 3)
name <- rep(c("a", "b", "c", "d"), 3)
variable_a <- c(1:4, 9:12, 17:20)
variable_b <- c(5:8, 13:16, 21:24)
test1 <- data.frame(id, name, variable_a, variable_b)
id name variable_a variable_b
1 1 a 1 5
2 2 b 2 6
3 3 c 3 7
4 4 d 4 8
5 1 a 9 13
6 2 b 10 14
7 3 c 11 15
8 4 d 12 16
9 1 a 17 21
10 2 b 18 22
11 3 c 19 23
12 4 d 20 24
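I have tried various melt and rbind options, but I cannot get them to work in this column-pair fashion. Any idea which command would be useful here? Thanks!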
A base R option using reshape:
reshape(
setNames(test,gsub("(\\d)(.)","\\2.\\1",names(test))),
direction = "long",
idvar = c("id","name"),
varying = -(1:2)
)
which gives
id name time variable_a variable_b
1.a.1 1 a 1 1 5
2.b.1 2 b 1 2 6
3.c.1 3 c 1 3 7
4.d.1 4 d 1 4 8
1.a.2 1 a 2 9 13
2.b.2 2 b 2 10 14
3.c.2 3 c 2 11 15
4.d.2 4 d 2 12 16
1.a.3 1 a 3 17 21
2.b.3 2 b 3 18 22
3.c.3 3 c 3 19 23
4.d.3 4 d 3 20 24
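The reshape answer above leaves a time column and composite row names that the requested output does not have. A minimal sketch of how the steps could be split up and the result tidied, assuming the test data frame from the question; the names renamed and long are only illustrative:

```r
# Same sample data as in the question.
test <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("a", "b", "c", "d"),
  variable_1a = c(1, 2, 3, 4),     variable_1b = c(5, 6, 7, 8),
  variable_2a = c(9, 10, 11, 12),  variable_2b = c(13, 14, 15, 16),
  variable_3a = c(17, 18, 19, 20), variable_3b = c(21, 22, 23, 24)
)

# Step 1: move the digit behind a "." so reshape() can split each name
# into a variable part and a time part ("variable_1a" -> "variable_a.1").
renamed <- setNames(test, gsub("(\\d)(.)", "\\2.\\1", names(test)))

# Step 2: reshape to long format, exactly as in the answer above.
long <- reshape(renamed, direction = "long",
                idvar = c("id", "name"), varying = -(1:2))

# Step 3: drop the helper column and the composite row names.
long$time <- NULL
rownames(long) <- NULL
long
```

The values come back as doubles, whereas test1 in the question stores variable_a and variable_b as integers, so a strict identical() comparison against test1 is not expected to pass.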
An alternative solution using tidyverse:
library(tidyverse)
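test %>%
  # split "variable_1a" into var_a = "variable" and var_b = "1a"
  pivot_longer(variable_1a:variable_3b, names_to = c("var_a", "var_b"), names_sep = "_", values_to = "val") %>%
  # label values from the a-columns "var1" and from the b-columns "var2"
  mutate(c = if_else(var_b %in% c("1a", "2a", "3a"), "var1", "var2")) %>%
  # each id/name pair has three values per label, so collect them in list-columns
  pivot_wider(id_cols = c("id", "name"), names_from = c, values_from = val, values_fn = list) %>%
  unnest(cols = c("var1", "var2"))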
# A tibble: 12 x 4
id name var1 var2
<dbl> <chr> <dbl> <dbl>
1 1 a 1 5
2 1 a 9 13
3 1 a 17 21
4 2 b 2 6
5 2 b 10 14
6 2 b 18 22
7 3 c 3 7
8 3 c 11 15
9 3 c 19 23
10 4 d 4 8
11 4 d 12 16
12 4 d 20 24
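The column names you have (variable_1a) and the column names you want (variable_a) differ only by the number in between (1). We can remove that number from the column names and then use pivot_longer: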
names(test) <- sub('\\d+', '', names(test))
tidyr::pivot_longer(test,
cols = starts_with('variable'),
names_to = '.value')
# id name variable_a variable_b
# <dbl> <chr> <dbl> <dbl>
# 1 1 a 1 5
# 2 1 a 9 13
# 3 1 a 17 21
# 4 2 b 2 6
# 5 2 b 10 14
# 6 2 b 18 22
# 7 3 c 3 7
# 8 3 c 11 15
# 9 3 c 19 23
#10 4 d 4 8
#11 4 d 12 16
#12 4 d 20 24
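Stripping the digit leaves several columns sharing the same name. A variant that keeps the names unique is to move the digit to the end and let pivot_longer split it off with names_pattern. This is only a sketch, assuming dplyr (1.0 or later) and tidyr (1.1 or later) are installed; the helper column set and the result name stacked are hypothetical names chosen for illustration:

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question.
test <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("a", "b", "c", "d"),
  variable_1a = c(1, 2, 3, 4),     variable_1b = c(5, 6, 7, 8),
  variable_2a = c(9, 10, 11, 12),  variable_2b = c(13, 14, 15, 16),
  variable_3a = c(17, 18, 19, 20), variable_3b = c(21, 22, 23, 24)
)

stacked <- test %>%
  # "variable_1a" -> "variable_a_1": every column name stays unique
  rename_with(~ sub("_(\\d+)([ab])$", "_\\2_\\1", .x), starts_with("variable")) %>%
  pivot_longer(
    cols = starts_with("variable"),
    names_to = c(".value", "set"),           # "set" holds the pair number
    names_pattern = "(.+)_(\\d+)$",
    names_transform = list(set = as.integer)
  ) %>%
  # order by pair first, then id, to match the layout asked for
  arrange(set, id) %>%
  select(-set)

stacked
```

Sorting on set before dropping it gives the block-by-block row order shown in the question rather than the id-by-id order of the output above.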