Reshaping data from wide to long, but with some complexity
I made a minimal reproducible example, but my real data is large and complex. Here is the example:
fact_1_p_model1 <- c(1,3,4,2,5)
ra_2_p_model1<- c(5,6,4,2,3)
da_1_p_model2 <- c(3,5,3,1,5)
dd_2_p_model2 <- c( 4,2,5,2,1)
fact_1_p_nonlinearmodel1<-c( 4,2,5,2,2)
tt_2_p_nonlinearmodel1<-c( 3,6,5,3,1)
fact_1_p_nonlinearmodel2<-c( 1,2,6,2,4)
rara_2_p_nonlinearmodel2<-c( 9,5,5,2,1)
id<-1:5
data<-data.frame(fact_1_p_model1, ra_2_p_model1, da_1_p_model2, dd_2_p_model2,
fact_1_p_nonlinearmodel1, tt_2_p_nonlinearmodel1, fact_1_p_nonlinearmodel2,
rara_2_p_nonlinearmodel2,id)
So, currently, I have a dataset like this:
data
fact_1_p_model1 ra_2_p_model1 da_1_p_model2 dd_2_p_model2 fact_1_p_nonlinearmodel1
1 1 5 3 4 4
2 3 6 5 2 2
3 4 4 3 5 5
4 2 2 1 2 2
5 5 3 5 1 2
tt_2_p_nonlinearmodel1 fact_1_p_nonlinearmodel2 rara_2_p_nonlinearmodel2 id
1 3 1 9 1
2 6 2 5 2
3 5 6 5 3
4 3 2 2 4
5 1 4 1 5
I want to turn this data into long format with two guide columns ("model" and "coef"):
model <- c("model1","model1","model2","model2","nonlinearmodel1","nonlinearmodel1",
"nonlinearmodel2","nonlinearmodel2")
coef <- c("fact_1_p_","ra_2_p_","da_1_p_","dd_2_p_","fact_1_p_","tt_2_p_","fact_1_p_","rara_2_p_")
#value <- ?? don't know how to...
#id <- ??
data_long<-data.frame(model,coef
#,value, id
)
If I leave out value and id, it looks like this. I also want to include value and id. I filled them in by hand here, but I cannot do that by hand for my real data.
> data_long
model coef
1 model1 fact_1_p_
2 model1 ra_2_p_
3 model2 da_1_p_
4 model2 dd_2_p_
5 nonlinearmodel1 fact_1_p_
6 nonlinearmodel1 tt_2_p_
7 nonlinearmodel2 fact_1_p_
8 nonlinearmodel2 rara_2_p_
With this small dataset I can do it by hand, but for my real, massive data I cannot.
How can I do this? How do I reshape the wide data (as shown first) into long data?
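library(tidyr)  # provides pivot_longer() and re-exports %>%
data %>%
  # split each column name right after "_p_": the prefix becomes coef, the rest becomes model
  pivot_longer(-id, names_to = c('coef', 'model'), names_sep = '(?<=_p_)')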
# A tibble: 40 x 4
id coef model value
<int> <chr> <chr> <dbl>
1 1 fact_1_p_ model1 1
2 1 ra_2_p_ model1 5
3 1 da_1_p_ model2 3
4 1 dd_2_p_ model2 4
5 1 fact_1_p_ nonlinearmodel1 4
6 1 tt_2_p_ nonlinearmodel1 3
7 1 fact_1_p_ nonlinearmodel2 1
8 1 rara_2_p_ nonlinearmodel2 9
9 2 fact_1_p_ model1 3
10 2 ra_2_p_ model1 6
# ... with 30 more rows
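The split above relies on a lookbehind in names_sep. As a hedged alternative sketch (not part of the original answer), the same result can be expressed with the names_pattern argument of pivot_longer(), which takes one capture group per output column. The two-column data frame below is only a cut-down stand-in for the question's data, and the pattern assumes every column name contains "_p_" exactly once.

```r
library(tidyr)

# Cut-down stand-in for the question's data (two of the eight value columns).
data <- data.frame(
  fact_1_p_model1 = c(1, 3, 4, 2, 5),
  rara_2_p_nonlinearmodel2 = c(9, 5, 5, 2, 1),
  id = 1:5
)

# Group 1: everything up to and including "_p_" (the coef prefix).
# Group 2: whatever follows it (the model name).
data_long <- pivot_longer(
  data,
  cols = -id,
  names_to = c("coef", "model"),
  names_pattern = "^(.*_p_)(.*)$"
)
```

Because the first group is greedy, a column name containing "_p_" more than once would be split at the last occurrence, so the pattern would need adjusting for such names.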
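You can try:
library(dplyr)
library(tidyr)
data %>%
  reshape2::melt(id = 'id') %>%
  # the text before "p_" is the coef prefix; the text after it is the model name
  separate(variable, c("coef", "model"), "p_") %>%
  mutate(coef = paste0(coef, "p_"))  # restore the "p_" consumed by the split
id coef model value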
1 1 fact_1_p_ model1 1
2 2 fact_1_p_ model1 3
3 3 fact_1_p_ model1 4
4 4 fact_1_p_ model1 2
5 5 fact_1_p_ model1 5
6 1 ra_2_p_ model1 5
7 2 ra_2_p_ model1 6
8 3 ra_2_p_ model1 4
9 4 ra_2_p_ model1 2
10 5 ra_2_p_ model1 3
11 1 da_1_p_ model2 3
12 2 da_1_p_ model2 5
13 3 da_1_p_ model2 3
14 4 da_1_p_ model2 1
15 5 da_1_p_ model2 5
16 1 dd_2_p_ model2 4
17 2 dd_2_p_ model2 2
18 3 dd_2_p_ model2 5
19 4 dd_2_p_ model2 2
20 5 dd_2_p_ model2 1
21 1 fact_1_p_ nonlinearmodel1 4
22 2 fact_1_p_ nonlinearmodel1 2
23 3 fact_1_p_ nonlinearmodel1 5
24 4 fact_1_p_ nonlinearmodel1 2
25 5 fact_1_p_ nonlinearmodel1 2
26 1 tt_2_p_ nonlinearmodel1 3
27 2 tt_2_p_ nonlinearmodel1 6
28 3 tt_2_p_ nonlinearmodel1 5
29 4 tt_2_p_ nonlinearmodel1 3
30 5 tt_2_p_ nonlinearmodel1 1
31 1 fact_1_p_ nonlinearmodel2 1
32 2 fact_1_p_ nonlinearmodel2 2
33 3 fact_1_p_ nonlinearmodel2 6
34 4 fact_1_p_ nonlinearmodel2 2
35 5 fact_1_p_ nonlinearmodel2 4
36 1 rara_2_p_ nonlinearmodel2 9
37 2 rara_2_p_ nonlinearmodel2 5
38 3 rara_2_p_ nonlinearmodel2 5
39 4 rara_2_p_ nonlinearmodel2 2
40 5 rara_2_p_ nonlinearmodel2 1
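If installing extra packages is not an option, roughly the same long table can be built in base R. This is only a sketch under the same assumption as above (each column name contains "_p_" once); the names value_cols and data_long are illustrative, and the data frame is a cut-down stand-in for the question's data.

```r
# Cut-down stand-in for the question's data (two of the eight value columns).
data <- data.frame(
  fact_1_p_model1 = c(1, 3, 4, 2, 5),
  rara_2_p_nonlinearmodel2 = c(9, 5, 5, 2, 1),
  id = 1:5
)

# Every column except id holds values to be stacked.
value_cols <- setdiff(names(data), "id")

data_long <- data.frame(
  # id repeats once per stacked column
  id    = rep(data$id, times = length(value_cols)),
  # keep the name up to and including "_p_"
  coef  = rep(sub("(_p_).*$", "\\1", value_cols), each = nrow(data)),
  # keep the name after "_p_"
  model = rep(sub("^.*_p_", "", value_cols), each = nrow(data)),
  # stack the value columns one after another
  value = unlist(data[value_cols], use.names = FALSE)
)
```

The rows come out grouped by original column, the same ordering that melt() produces above.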