
Reshaping data from wide to long, but with some complexity

I made a minimal reproducible example, but my real data is large and complex. Here is the example:


fact_1_p_model1 <- c(1,3,4,2,5)
ra_2_p_model1<- c(5,6,4,2,3)
da_1_p_model2 <- c(3,5,3,1,5)
dd_2_p_model2 <- c( 4,2,5,2,1)
fact_1_p_nonlinearmodel1<-c( 4,2,5,2,2)
tt_2_p_nonlinearmodel1<-c( 3,6,5,3,1)
fact_1_p_nonlinearmodel2<-c( 1,2,6,2,4)
rara_2_p_nonlinearmodel2<-c( 9,5,5,2,1)
id<-1:5
data<-data.frame(fact_1_p_model1, ra_2_p_model1, da_1_p_model2, dd_2_p_model2,
                 fact_1_p_nonlinearmodel1, tt_2_p_nonlinearmodel1, fact_1_p_nonlinearmodel2,
                 rara_2_p_nonlinearmodel2,id)

So, currently, I have a dataset like this:

 data
  fact_1_p_model1 ra_2_p_model1 da_1_p_model2 dd_2_p_model2 fact_1_p_nonlinearmodel1
1               1             5             3             4                        4
2               3             6             5             2                        2
3               4             4             3             5                        5
4               2             2             1             2                        2
5               5             3             5             1                        2
  tt_2_p_nonlinearmodel1 fact_1_p_nonlinearmodel2 rara_2_p_nonlinearmodel2 id
1                      3                        1                        9  1
2                      6                        2                        5  2
3                      5                        6                        5  3
4                      3                        2                        2  4
5                      1                        4                        1  5



I would like to reshape this data into long format, with two key columns ("model" and "coef"):


model <- c("model1","model1","model2","model2","nonlinearmodel1","nonlinearmodel1",
           "nonlinearmodel2","nonlinearmodel2")
coef <- c("fact_1_p_","ra_2_p_","da_1_p_","dd_2_p_","fact_1_p_","tt_2_p_","fact_1_p_","rara_2_p_")

#value <- ?? don't know how to... 
#id <- ??

data_long<-data.frame(model,coef
#,value, id
)

This is the result if I leave out value and id. I would like to include value and id as well. I built this frame by hand, which is not feasible for my real data.

> data_long
            model      coef
1          model1 fact_1_p_
2          model1   ra_2_p_
3          model2   da_1_p_
4          model2   dd_2_p_
5 nonlinearmodel1 fact_1_p_
6 nonlinearmodel1   tt_2_p_
7 nonlinearmodel2 fact_1_p_
8 nonlinearmodel2 rara_2_p_

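With this small dataset I can do it by hand, but for my real, massive data I cannot.

How can I do this? How do I reshape wide data (like the first one shown) into long format?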

library(dplyr)
library(tidyr)
data %>%
   pivot_longer(-id, names_to = c('coef', 'model'), names_sep = '(?<=_p_)')

# A tibble: 40 x 4
      id  coef     model           value
   <int> <chr>     <chr>           <dbl>
 1     1 fact_1_p_ model1              1
 2     1 ra_2_p_   model1              5
 3     1 da_1_p_   model2              3
 4     1 dd_2_p_   model2              4
 5     1 fact_1_p_ nonlinearmodel1     4
 6     1 tt_2_p_   nonlinearmodel1     3
 7     1 fact_1_p_ nonlinearmodel2     1
 8     1 rara_2_p_ nonlinearmodel2     9
 9     2 fact_1_p_ model1              3
10     2 ra_2_p_   model1              6
# ... with 30 more rows
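The lookbehind `(?<=_p_)` splits each column name at the position right after `_p_`, so the prefix stays in `coef` and the rest goes to `model`. If the lookbehind feels opaque, the same split can probably be written with `names_pattern` and two capture groups. The sketch below rebuilds the toy data from the question so it runs on its own. The pattern assumes every value column contains `_p_` and that the text after the last `_p_` is the model name, which holds for the example but may not hold for the real data.

```r
library(tidyr)

# Toy data from the question, rebuilt so the snippet is self-contained.
data <- data.frame(
  fact_1_p_model1          = c(1, 3, 4, 2, 5),
  ra_2_p_model1            = c(5, 6, 4, 2, 3),
  da_1_p_model2            = c(3, 5, 3, 1, 5),
  dd_2_p_model2            = c(4, 2, 5, 2, 1),
  fact_1_p_nonlinearmodel1 = c(4, 2, 5, 2, 2),
  tt_2_p_nonlinearmodel1   = c(3, 6, 5, 3, 1),
  fact_1_p_nonlinearmodel2 = c(1, 2, 6, 2, 4),
  rara_2_p_nonlinearmodel2 = c(9, 5, 5, 2, 1),
  id                       = 1:5
)

# names_pattern needs one capture group per new column:
#   group 1 = everything up to and including the last "_p_" -> coef
#   group 2 = whatever follows                              -> model
data_long <- pivot_longer(
  data,
  cols          = -id,
  names_to      = c("coef", "model"),
  names_pattern = "^(.*_p_)(.*)$"
)

head(data_long, 8)
```

Because every column other than `id` is selected with `-id`, this should scale to a wide table with many more coefficient columns, as long as the names follow the same `<coef>_p_<model>` convention.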

You can try:

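library(dplyr)
library(tidyr)
data %>%
  reshape2::melt(id = 'id') %>%
  # the part before "p_" is the coefficient, the part after it is the model
  separate(variable, c("coef", "model"), "p_") %>%
  # separate() drops the "p_" delimiter, so put it back on the coefficient
  mutate(coef = paste0(coef, "p_"))

   id      coef           model value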
1   1 fact_1_p_          model1     1
2   2 fact_1_p_          model1     3
3   3 fact_1_p_          model1     4
4   4 fact_1_p_          model1     2
5   5 fact_1_p_          model1     5
6   1   ra_2_p_          model1     5
7   2   ra_2_p_          model1     6
8   3   ra_2_p_          model1     4
9   4   ra_2_p_          model1     2
10  5   ra_2_p_          model1     3
11  1   da_1_p_          model2     3
12  2   da_1_p_          model2     5
13  3   da_1_p_          model2     3
14  4   da_1_p_          model2     1
15  5   da_1_p_          model2     5
16  1   dd_2_p_          model2     4
17  2   dd_2_p_          model2     2
18  3   dd_2_p_          model2     5
19  4   dd_2_p_          model2     2
20  5   dd_2_p_          model2     1
21  1 fact_1_p_ nonlinearmodel1     4
22  2 fact_1_p_ nonlinearmodel1     2
23  3 fact_1_p_ nonlinearmodel1     5
24  4 fact_1_p_ nonlinearmodel1     2
25  5 fact_1_p_ nonlinearmodel1     2
26  1   tt_2_p_ nonlinearmodel1     3
27  2   tt_2_p_ nonlinearmodel1     6
28  3   tt_2_p_ nonlinearmodel1     5
29  4   tt_2_p_ nonlinearmodel1     3
30  5   tt_2_p_ nonlinearmodel1     1
31  1 fact_1_p_ nonlinearmodel2     1
32  2 fact_1_p_ nonlinearmodel2     2
33  3 fact_1_p_ nonlinearmodel2     6
34  4 fact_1_p_ nonlinearmodel2     2
35  5 fact_1_p_ nonlinearmodel2     4
36  1 rara_2_p_ nonlinearmodel2     9
37  2 rara_2_p_ nonlinearmodel2     5
38  3 rara_2_p_ nonlinearmodel2     5
39  4 rara_2_p_ nonlinearmodel2     2
40  5 rara_2_p_ nonlinearmodel2     1
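If installing tidyr or reshape2 is not an option, roughly the same long table can be built in base R. This is only a sketch under the same naming assumption (every value column looks like `<coef>_p_<model>`). The helper names `value_cols` and `long` are made up for illustration and appear in neither answer above.

```r
# Toy data from the question, rebuilt so the snippet is self-contained.
data <- data.frame(
  fact_1_p_model1          = c(1, 3, 4, 2, 5),
  ra_2_p_model1            = c(5, 6, 4, 2, 3),
  da_1_p_model2            = c(3, 5, 3, 1, 5),
  dd_2_p_model2            = c(4, 2, 5, 2, 1),
  fact_1_p_nonlinearmodel1 = c(4, 2, 5, 2, 2),
  tt_2_p_nonlinearmodel1   = c(3, 6, 5, 3, 1),
  fact_1_p_nonlinearmodel2 = c(1, 2, 6, 2, 4),
  rara_2_p_nonlinearmodel2 = c(9, 5, 5, 2, 1),
  id                       = 1:5
)

# Every column except the identifier holds values to be stacked.
value_cols <- setdiff(names(data), "id")

# Stack the value columns one after another; id and the source
# column name are repeated so each value keeps its origin.
long <- data.frame(
  id    = rep(data$id, times = length(value_cols)),
  name  = rep(value_cols, each = nrow(data)),
  value = unlist(data[value_cols], use.names = FALSE)
)

# Split the original column name at the last "_p_".
long$coef  <- sub("^(.*_p_).*$", "\\1", long$name)
long$model <- sub("^.*_p_", "", long$name)

long <- long[, c("id", "coef", "model", "value")]
head(long)
```

The rows come out grouped by original column, the same order as the `melt()` output above.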
