How to combine different columns of characters from the same data frame in R
I have a data frame in R that looks like this:
D = data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
year = c(1980, 1991, 2013, 1980, 1991, 2013),
hello = c("A", "B", "C", "D", "E", "F"),
world = c("Z", "Y", "X", "NA", "Q", "NA"),
foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))
I would like to combine the hello, world and foo columns into a single column indexed by countrycode and year, like this:
output<-data.frame(countrycode=c(2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
year=c(1980,1980,1980,1991,1991,1991,2013,2013,2013,1980,1980,1980,1991,1991,1991,2013,2013,2013),
Combined=c("A","Z","Yes","B","Y","No","C","X","NA","D","NA","NA","E","Q","Yes","F","NA","NA"))
I have tried cbind from base R and gather from the tidyr package, but neither seems to work.
I think you are looking for the reshape2 package. Try the following code:
library(reshape2)
output<-melt(D,id.vars=c("countrycode","year"))
output<-output[order(output$countrycode,output$year),]
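It reproduces your example, except that the stacked column is named value and the variable column is kept. Two functions are very useful here: melt and its inverse, dcast.
A one-liner with reshape2 and dplyr: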
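To illustrate the remark that dcast is the inverse of melt, here is a minimal sketch that melts the question's data frame and then casts it back to wide form. The names long and wide are illustrative, and the sketch assumes reshape2 is installed and that each (countrycode, year) pair occurs only once, as it does in the example data.

```r
library(reshape2)

# Data frame from the question
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))

# Wide -> long: one row per (countrycode, year, variable)
long <- melt(D, id.vars = c("countrycode", "year"))

# Long -> wide: spread the variable column back into separate columns
wide <- dcast(long, countrycode + year ~ variable, value.var = "value")

print(wide)
```

If a (countrycode, year) pair were repeated, dcast would need an aggregation function, so this round trip only works cleanly when the id columns identify each row.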
library(reshape2)
library(dplyr)
converted = melt(D,
measure.vars=c("hello","world","foo"),
value.name="Combined") %>%
arrange(countrycode, year) %>% select(-variable)
> converted
countrycode year Combined
1 2 1980 A
2 2 1980 Z
3 2 1980 Yes
4 2 1991 B
5 2 1991 Y
6 2 1991 No
And so on. The resulting columns and column names are also identical to the example output.
With tidyr and dplyr, this looks like
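The claim that the result matches the example output can be checked directly. The following is a sketch under the assumption that reshape2 and dplyr are installed and that strings are not converted to factors (the default since R 4.0); it rebuilds both data frames from the question and compares them column by column.

```r
library(reshape2)
library(dplyr)

# Input and expected output, copied from the question
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))
output <- data.frame(countrycode = rep(c(2, 3), each = 9),
                     year = rep(rep(c(1980, 1991, 2013), each = 3), times = 2),
                     Combined = c("A", "Z", "Yes", "B", "Y", "No", "C", "X", "NA",
                                  "D", "NA", "NA", "E", "Q", "Yes", "F", "NA", "NA"))

# Same pipeline as in the answer above
converted <- melt(D,
                  measure.vars = c("hello", "world", "foo"),
                  value.name = "Combined") %>%
  arrange(countrycode, year) %>%
  select(-variable)

# Compare column names and the contents of each column
same_names <- identical(names(converted), names(output))
same_values <- all(mapply(function(a, b) all(a == b), converted, output))
print(c(same_names = same_names, same_values = same_values))
```

The comparison relies on arrange keeping tied rows in their original order, so that within each (countrycode, year) group the values stay in the order hello, world, foo.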
library(dplyr)
library(tidyr)
D %>% gather(var, Combined, hello:foo) %>% arrange(countrycode, year)
# countrycode year var Combined
# 1 2 1980 hello A
# 2 2 1980 world Z
# 3 2 1980 foo Yes
# 4 2 1991 hello B
# 5 2 1991 world Y
# 6 2 1991 foo No
# . . ... ... ...
I left the key column in, since you lose information without it, but if you really don't want it, add %>% select(-var) to the end of the pipeline.
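In current tidyr, gather is superseded by pivot_longer, so the same reshaping might be written as in the sketch below. It assumes tidyr 1.0.0 or later and dplyr are installed; the names long and no_key are illustrative and do not come from the answer.

```r
library(dplyr)
library(tidyr)

# Data frame from the question
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))

# Stack hello, world and foo into one column, keeping the key column "var"
long <- D %>%
  pivot_longer(cols = hello:foo, names_to = "var", values_to = "Combined") %>%
  arrange(countrycode, year)

# Drop the key column if only the combined values are needed
no_key <- long %>% select(-var)

print(no_key)
```

Unlike gather, pivot_longer emits the rows for each input row together, so the result is already grouped by countrycode and year here; the arrange call is kept only to make the intended ordering explicit.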