[英]Dataframe long to wide format
给定以下 dataframe
df <- structure(list(Group = c("A", "A", "B", "B", "C", "C", "D", "D",
"E", "E"), TEST_A = c("neg", "neg", "neg", "neg", "neg", "neg",
"neg", "pos", "pos", "neg"), TEST_B = c("neg", "neg", "neg",
"neg", "neg", "neg", "neg", "neg", "neg", "neg"), RESULT_NOW = c(0L,
0L, 0L, NA, 0L, 0L, 0L, 0L, 0L, 0L), RESULTS_BEFORE = c(0L, 0L,
0L, NA, 0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 10L), class = "data.frame")
看起来如下:
> df
Group TEST_A TEST_B RESULT_NOW RESULTS_BEFORE
1 A neg neg 0 0
2 A neg neg 0 0
3 B neg neg 0 0
4 B neg neg NA NA
5 C neg neg 0 0
6 C neg neg 0 0
7 D neg neg 0 0
8 D pos neg 0 0
9 E pos neg 0 0
10 E neg neg 0 0
我想创建以下 dataframe:
Group TEST_A_1 TEST_A_2 TEST_B_1 TEST_B_1 RESULT_NOW_1 RESULT_NOW_2 RESULTS_BEFORE_1 RESULTS_BEFORE_2
A neg neg neg neg 0 0 0 0
B neg neg neg neg 0 NA 0 NA
C neg neg neg neg 0 0 0 0
...
E pos neg neg neg 0 0 0 0
所以实际上是使用 Group 列将 dataframe 移动到宽格式。 请注意,组列每组最多可包含 4 行。 在此示例中,每组只有两行 感谢您的帮助。
你可以试试这个:
library(tidyverse)
library(reshape2)
#Melt
df2 <- reshape2::melt(df,id.vars = "Group")
df2 %>% group_by(Group,variable) %>%mutate(index=1:length(variable),variable=paste0(variable,'.',index)) %>%
select(-index) -> df2
#Pivot
df2 %>%
group_by(variable) %>%
mutate(row = row_number()) %>%
pivot_wider(id_cols = c(row, Group),
names_from = variable,
values_from = c(value)) %>%
select(-row)
# A tibble: 5 x 9
Group TEST_A.1 TEST_A.2 TEST_B.1 TEST_B.2 RESULT_NOW.1 RESULT_NOW.2 RESULTS_BEFORE.1 RESULTS_BEFORE.2
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A neg neg neg neg 0 0 0 0
2 B neg neg neg neg 0 <NA> 0 <NA>
3 C neg neg neg neg 0 0 0 0
4 D neg pos neg neg 0 0 0 0
5 E pos neg neg neg 0 0 0 0
我希望这可以帮助你。
诀窍是首先添加一个 id 列。 在第二步中,我们可以简单地应用tidyr::pivot_wider
来实现所需的结果:
df <- structure(list(Group = c("A", "A", "B", "B", "C", "C", "D", "D",
"E", "E"), TEST_A = c("neg", "neg", "neg", "neg", "neg", "neg",
"neg", "pos", "pos", "neg"), TEST_B = c("neg", "neg", "neg",
"neg", "neg", "neg", "neg", "neg", "neg", "neg"), RESULT_NOW = c(0L,
0L, 0L, NA, 0L, 0L, 0L, 0L, 0L, 0L), RESULTS_BEFORE = c(0L, 0L,
0L, NA, 0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 10L), class = "data.frame")
library(dplyr)
library(tidyr)
df %>%
group_by(Group) %>%
mutate(id = row_number(Group)) %>%
ungroup() %>%
pivot_wider(names_from = id, values_from = c(TEST_A, TEST_B, RESULT_NOW, RESULTS_BEFORE))
#> # A tibble: 5 x 9
#> Group TEST_A_1 TEST_A_2 TEST_B_1 TEST_B_2 RESULT_NOW_1 RESULT_NOW_2
#> <chr> <chr> <chr> <chr> <chr> <int> <int>
#> 1 A neg neg neg neg 0 0
#> 2 B neg neg neg neg 0 NA
#> 3 C neg neg neg neg 0 0
#> 4 D neg pos neg neg 0 0
#> 5 E pos neg neg neg 0 0
#> # ... with 2 more variables: RESULTS_BEFORE_1 <int>, RESULTS_BEFORE_2 <int>
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.