
How to combine different columns of characters from the same data frame in R

I have a data frame in R as follows:

D = data.frame(countrycode = c(2, 2, 2, 3, 3, 3), 
           year = c(1980, 1991, 2013, 1980, 1991, 2013), 
           hello = c("A", "B", "C", "D", "E", "F"), 
           world = c("Z", "Y", "X", "NA", "Q", "NA"), 
           foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))

I would like the columns hello, world and foo to be combined into a single column, indexed by countrycode and year, as follows:

output<-data.frame(countrycode=c(2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
    year=c(1980,1980,1980,1991,1991,1991,2013,2013,2013,1980,1980,1980,1991,1991,1991,2013,2013,2013),
    Combined=c("A","Z","Yes","B","Y","No","C","X","NA","D","NA","NA","E","Q","Yes","F","NA","NA"))

I have tried both cbind from base R and gather from the tidyr package, and neither seems to work.

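I think you are looking for the reshape2 package. Try the following code:

library(reshape2)

# Stack every non-id column (hello, world, foo) into variable/value pairs
output <- melt(D, id.vars = c("countrycode", "year"))
# Sort so that the three values for each country-year sit together
output <- output[order(output$countrycode, output$year), ]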

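This reproduces the values in your example; the result also carries a variable column naming the source column, and the combined column is called value. Two reshape2 functions are extremely useful here: melt, and its inverse, dcast.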
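To illustrate how dcast reverses melt, here is a minimal round-trip sketch. It assumes reshape2 is installed and rebuilds D with stringsAsFactors = FALSE so the values stay plain character strings; the names long and wide are only illustrative.

```r
library(reshape2)

D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

# Wide -> long: one row per (countrycode, year, variable)
long <- melt(D, id.vars = c("countrycode", "year"))

# Long -> wide: spread the levels of 'variable' back into columns
wide <- dcast(long, countrycode + year ~ variable, value.var = "value")
print(wide)
```

The formula countrycode + year ~ variable reads as "one row per country-year, one column per variable", which is why wide has the same shape as D.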

reshape2 and dplyr one-liner:

library(reshape2)
library(dplyr)
converted = melt(D,
  measure.vars=c("hello","world","foo"),
  value.name="Combined") %>%
    arrange(countrycode, year) %>% select(-variable)

> converted
   countrycode year Combined
1            2 1980        A
2            2 1980        Z
3            2 1980      Yes
4            2 1991        B
5            2 1991        Y
6            2 1991       No

and so on for the remaining rows. The result has the same columns and column names as your sample output.

With tidyr and dplyr, this would look like

library(dplyr)
library(tidyr)

D %>% gather(var, Combined, hello:foo) %>% arrange(countrycode, year)
#    countrycode year   var Combined
# 1            2 1980 hello        A
# 2            2 1980 world        Z
# 3            2 1980   foo      Yes
# 4            2 1991 hello        B
# 5            2 1991 world        Y
# 6            2 1991   foo       No
# .            .  ...   ...      ...

I kept the key column (var) because without it you can no longer tell which source column each value came from. If you really don't want it, tack on %>% select(-var).
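In current tidyr releases (1.0.0 and later) gather is marked as superseded in favour of pivot_longer. A roughly equivalent sketch follows, assuming tidyr >= 1.0.0 and dplyr are installed; D is rebuilt with stringsAsFactors = FALSE and the name long is only illustrative.

```r
library(dplyr)
library(tidyr)

D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

long <- D %>%
  # Stack hello, world and foo; 'var' records the source column
  pivot_longer(cols = hello:foo, names_to = "var", values_to = "Combined") %>%
  # Keep the three values for each country-year together
  arrange(countrycode, year)

print(long)
```

Note that the "NA" entries in D are literal strings rather than missing values, so they are carried through into Combined like any other text.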
