
Reshaping wide to long with multiple varying variables and unbalanced time

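I have two different sets of variables: annual percentage share and year. The annual percentage share columns start in 1999 and end in 2012, whereas the year columns run from 1999 to 2013.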

countrylabel annualpercentageshare.1999 year1990 year1991 year1992
1      Austria                         NA       NA       NA       NA
2      Belgium                         NA       NA       NA       NA
3     Bulgaria                   48.20000       NA       NA       NA
4      Estonia                         NA       NA       NA       NA
5       France                   47.52853       NA       NA       NA
6      Germany                         NA       NA       NA       NA

Something like this.

I have tried this code:

merge_data2 <- reshape(merge_data2, varying = list(2:ncol(merge_data2)), 
                       v.names = c("percentageshare", "Year"),
                       idvar = "countrylabel", direction = "long", times = 1990:2013)

But I get this error message:

Error in reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying, : 'lengths(varying)' must all match 'length(times)'

Edit: I would like a data frame like this:

countrylabel    time      annualpercentageshare        year
Austria          1990            NA                      NA
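Austria          1991            NA                      NA

library(tidyr); library(dplyr)
df %>%
  # stack every column except the id into name/value pairs
  gather(variable, value, -countrylabel) %>%
  # split the last four characters (the year) off each column name
  separate("variable", into = c("stat", "time"), sep = -4) %>%
  # spread the stems back out, one column per variable
  spread(stat, value)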

Output:

   countrylabel time annualpercentageshare. year
1       Austria 1990                     NA   NA
2       Austria 1991                     NA   NA
3       Austria 1992                     NA   NA
4       Austria 1999                     NA   NA
5       Belgium 1990                     NA   NA
6       Belgium 1991                     NA   NA
7       Belgium 1992                     NA   NA
8       Belgium 1999                     NA   NA
9      Bulgaria 1990                     NA   NA
10     Bulgaria 1991                     NA   NA
11     Bulgaria 1992                     NA   NA
12     Bulgaria 1999               48.20000   NA
13      Estonia 1990                     NA   NA
14      Estonia 1991                     NA   NA
15      Estonia 1992                     NA   NA
16      Estonia 1999                     NA   NA
17       France 1990                     NA   NA
18       France 1991                     NA   NA
19       France 1992                     NA   NA
20       France 1999               47.52853   NA
21      Germany 1990                     NA   NA
22      Germany 1991                     NA   NA
23      Germany 1992                     NA   NA
24      Germany 1999                     NA   NA
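
On recent versions of tidyr, where gather() and spread() are superseded, roughly the same result can be sketched with pivot_longer(). This is an illustration on a made-up two-row excerpt, not part of the original answer. The regular expression passed to names_pattern is an assumption about how the column names are built: a lowercase stem, an optional dot, then a four-digit year.

```r
library(tidyr)

# Made-up excerpt that mirrors the column names in the question (an assumption).
df <- data.frame(
  countrylabel = c("Austria", "Bulgaria"),
  annualpercentageshare.1999 = c(NA, 48.2),
  year1990 = c(NA_real_, NA_real_),
  year1991 = c(NA_real_, NA_real_)
)

long <- pivot_longer(
  df,
  cols = -countrylabel,
  # ".value" turns the first capture group into output column names;
  # the second capture group becomes the "time" column.
  names_to = c(".value", "time"),
  names_pattern = "^([a-z]+)\\.?(\\d{4})$"
)

# The captured year is a character string, so convert it explicitly.
long$time <- as.integer(long$time)
long
```

Because the optional dot sits outside the first capture group, the resulting column is named annualpercentageshare rather than annualpercentageshare. with a trailing dot. Stem and year combinations that do not exist in the wide data are filled with NA.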

reshape likes "." as a separator, so we first insert one into the year* variable names.

names(d) <- gsub("year", "year.", names(d))

Now, before we reshape, we add the missing columns and put the columns in order.

d$annualpercentage.2002 <- NA
d$year.1999 <- NA
d <- d[c(1, order(names(d)[-1]) + 1)]
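
With many years, adding the missing columns by hand gets tedious. The padding step could be sketched more generally as below, on a made-up toy data frame. The helper names stems, times, wanted and missing_cols are hypothetical, and the sketch assumes every varying column is named stem.year.

```r
# Toy wide data after the rename step; the values are made up.
d <- data.frame(
  countrylabel = c("Austria", "Belgium"),
  annualpercentage.1999 = c(1, 2),
  annualpercentage.2000 = c(3, 4),
  year.2000 = c(5, 6),
  year.2001 = c(7, 8)
)

stems <- c("annualpercentage", "year")  # hypothetical helper names
times <- 1999:2001

# Every stem.time combination, grouped by stem and ordered by time.
wanted <- paste(rep(stems, each = length(times)), times, sep = ".")

# Add whichever columns are absent, filled with NA.
missing_cols <- setdiff(wanted, names(d))
d[missing_cols] <- NA_real_

# Id column first, then the padded columns in a predictable order.
d <- d[c("countrylabel", wanted)]
names(d)
```

After this step each stem has exactly one column per entry in times, which is what reshape needs when the columns are passed as a list in varying.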

Your idea then works if the two sets of columns are passed as separate elements of the list in varying:

res <- reshape(d, varying=list(2:5, 6:9), direction="long", idvar="countrylabel", 
               times=1999:2002, v.names=c("annualpercentage", "year"))
res
#                  countrylabel time annualpercentage        year
# Austria.1999          Austria 1999               NA          NA
# Belgium.1999          Belgium 1999               NA          NA
# Bulgaria.1999        Bulgaria 1999       -0.6806495          NA
# Estonia.1999          Estonia 1999               NA          NA
# France.1999            France 1999               NA          NA
# Germany.1999          Germany 1999               NA          NA
# Switzerland.1999  Switzerland 1999       -1.8497570          NA
# Austria.2000          Austria 2000       -0.6033900  0.14970015
# Belgium.2000          Belgium 2000               NA -0.49201756
# Bulgaria.2000        Bulgaria 2000        0.8263925 -0.36320990
# Estonia.2000          Estonia 2000               NA -2.51032544
# France.2000            France 2000               NA  0.57800624
# Germany.2000          Germany 2000               NA -0.52295712
# Switzerland.2000  Switzerland 2000        0.2783076  0.25616728
# Austria.2001          Austria 2001       -2.6962484 -0.15375642
# Belgium.2001          Belgium 2001        1.3088577  0.72528621
# Bulgaria.2001        Bulgaria 2001               NA          NA
# Estonia.2001          Estonia 2001               NA -0.05563662
# France.2001            France 2001        0.2224629  0.74205086
# Germany.2001          Germany 2001               NA -0.01185349
# Switzerland.2001  Switzerland 2001        0.8354322 -1.40826638
# Austria.2002          Austria 2002               NA          NA
# Belgium.2002          Belgium 2002               NA  1.60874778
# Bulgaria.2002        Bulgaria 2002               NA          NA
# Estonia.2002          Estonia 2002               NA  0.55866704
# France.2002            France 2002               NA -1.59866472
# Germany.2002          Germany 2002               NA -0.11217415
# Switzerland.2002  Switzerland 2002               NA          NA
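
As a side note on why the dot matters: once every varying column is named stem.time and each stem covers the same times, reshape can infer both v.names and times from the column names. The sketch below uses a made-up balanced data frame to show this; it is an illustration under those assumptions, not the answer's own code.

```r
# Made-up, already balanced wide data with dotted names.
d <- data.frame(
  countrylabel = c("Austria", "Belgium"),
  annualpercentage.1999 = c(NA, 1.5),
  annualpercentage.2000 = c(0.3, NA),
  year.1999 = c(NA_real_, NA_real_),
  year.2000 = c(2.1, 0.7)
)

# varying is a plain vector of column positions; reshape splits each name
# at sep to work out the variable names and the time values.
res <- reshape(d, varying = 2:5, direction = "long",
               idvar = "countrylabel", sep = ".")
res
```

By contrast, the call in the question wrapped all columns in a single list element, so reshape treated them as one variable whose number of columns had to equal length(times), which is what the error message reports.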

Data

d <- structure(list(countrylabel = c("Austria", "Belgium", "Bulgaria", 
"Estonia", "France", "Germany", "Switzerland"), annualpercentage.1999 = c(NA, 
-2.58060150400384, -0.0623757258909573, 0.267776001395166, NA, 
NA, 0.048219924249952), annualpercentage.2000 = c(NA, -0.249416955035044, 
1.3525450891501, 1.04446768824697, NA, -0.0582347596434839, -0.891400228849837
), annualpercentage.2001 = c(1.82469277697851, NA, NA, 1.04231605324821, 
NA, -0.900145118946308, -1.19320727433597), year2000 = c(0.633712375393134, 
NA, 1.24760861316098, -0.092964787061478, -0.59403260962332, 
NA, -0.650348234181285), year2001 = c(0.587318286831079, NA, 
NA, 0.348890470222513, NA, NA, NA), year2002 = c(0.0645316087966406, 
-0.279456557428068, NA, NA, -0.0627400036074545, 1.30419117694731, 
-0.484654596062051)), row.names = c(NA, -7L), class = "data.frame")
