
Is there a faster alternative to reshape for converting a data frame from long to wide format?

I have this small fragment of code to convert a data frame from long to wide.

library(reshape2)  # not needed here: reshape() below is base R (stats::reshape)
mydata <- structure(list(issn = c("1980-4814", "1945-3116", "1681-4835", "1367-0751", "1516-6104", "1359-7566", "2319-0795", "1390-6615", "1808-8023", "1746-4269", "1852-2181", "0022-4596", "1808-2386", "0254-6051", "1981-3686", "1077-2618", "1809-3957", "2179-5746", "0147-6513", "1070-5503"), periodico = c("ABCustos (", "Journal of", "The Electr", "Logic Jour", "DIREITO, E", "REGIONAL &", "REVISTA FÓ", "UMBRAL: RE", "Segurança ", "Journal of", "Augm Domus", "Journal of", "BBR. Brazi", "Jinshu Rèc", "Revista Br", "IEEE Indus", "Revista SO", "Biota Amaz", "Ecotoxicol", "Internatio"), qualis = c("B4", "B3", "B2", "B2", "A1", "B5", "B5", "C ", "B5", "B3", "B3", "A1", "B4", "B3", "B5", "A2", "C ", "B3", "A2", "B1"), area = c(1L, 1L, 1L, 2L, 3L, 3L, 3L, 3L, 4L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 9L, 9L, 9L)), .Names = c("issn", "periodico", "qualis", "area"), row.names = c(1L, 501L, 1001L, 1501L, 2001L, 2501L, 3001L, 3501L, 4001L, 4501L, 5001L, 5501L, 6001L, 6501L, 7001L, 7501L, 8001L, 8501L, 9001L, 9501L), class = "data.frame")

reshape(mydata, direction = "wide", 
        idvar = c("issn", "periodico"), 
        timevar = "area")

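The result has one row per issn/periodico pair and one column per area (qualis.1, qualis.2, and so on), with NA where a journal has no entry for an area.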

This is exactly what I want, but once the data frame grows beyond 2,000 records it gets very slow.

I have only 10 areas to map to columns, but more than 10,000 ISSNs.

I'm looking for faster ways to achieve the same result.

Thanks

For reshaping problems, dcast from data.table is highly optimized and should be considerably faster than base reshape, and likely faster than most other packages currently available:

library(data.table)
dcast(setDT(mydata), issn+periodico~area, value.var = "qualis")
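Note that dcast names the new columns after the area values ("1", "2", ...) rather than qualis.1, qualis.2, .... The following is a rough sketch, not a rigorous benchmark, that renames the columns to match the reshape output, checks that both approaches agree, and times them. The data is synthetic (made-up ISSNs and journal names, 1,000 journals with 3 areas each), and names such as n_issn, long, wide_base and wide_dt are placeholders chosen for this example.

```r
library(data.table)

# Synthetic long-format data; every value below is made up for illustration.
set.seed(1)
n_issn <- 1000L
ids    <- sprintf("%04d-%04d", seq_len(n_issn), seq_len(n_issn))
long <- data.frame(
  issn      = rep(ids, each = 3L),
  periodico = rep(paste("Journal", seq_len(n_issn)), each = 3L),
  qualis    = sample(c("A1", "A2", "B1", "B2"), 3L * n_issn, replace = TRUE),
  # three distinct areas (out of 10) per journal
  area      = as.vector(replicate(n_issn, sort(sample(10L, 3L)))),
  stringsAsFactors = FALSE
)

# Base R reshape, as in the question
t_reshape <- system.time(
  wide_base <- reshape(long, direction = "wide",
                       idvar = c("issn", "periodico"), timevar = "area")
)

# data.table dcast; as.data.table() copies, so `long` stays a plain data.frame
t_dcast <- system.time(
  wide_dt <- dcast(as.data.table(long), issn + periodico ~ area,
                   value.var = "qualis")
)

# Prefix the area columns so the names match the reshape() output
area_cols <- setdiff(names(wide_dt), c("issn", "periodico"))
setnames(wide_dt, area_cols, paste0("qualis.", area_cols))

# dcast sorts rows by the id columns; align the reshape() result before comparing
cols     <- names(wide_dt)
base_cmp <- wide_base[order(wide_base$issn), cols]
same <- all(vapply(cols,
                   function(cn) identical(wide_dt[[cn]], base_cmp[[cn]]),
                   logical(1)))

cat("results agree:", same, "\n")
cat("reshape elapsed (s):", t_reshape[["elapsed"]], "\n")
cat("dcast elapsed (s):  ", t_dcast[["elapsed"]], "\n")
```

Timings depend on the machine and on package versions, so it is worth raising n_issn toward the real data size before drawing conclusions.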

You can use dplyr and tidyr for this:

library(dplyr)
library(tidyr)
mydata %>% 
  mutate(area = paste('qualis',area,sep=".")) %>% 
  spread(area, qualis)
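Since tidyr 1.0.0, spread is superseded by pivot_wider, which can add the qualis. prefix itself through names_prefix, so the mutate step is not needed. A minimal sketch, assuming tidyr >= 1.0.0 and using a small made-up data frame (long) in place of mydata:

```r
library(tidyr)

# Small synthetic long-format data (hypothetical values, not the asker's data)
long <- data.frame(
  issn      = c("0001-0001", "0001-0001", "0002-0002", "0003-0003"),
  periodico = c("Journal A", "Journal A", "Journal B", "Journal C"),
  qualis    = c("A1", "B2", "B3", "C "),
  area      = c(1L, 2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# issn and periodico are used as id columns by default, because they are
# neither the names_from nor the values_from column
wide <- pivot_wider(long,
                    names_from   = area,
                    values_from  = qualis,
                    names_prefix = "qualis.")

print(wide)
```

Journals with no entry for an area get NA in that column, as with reshape.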

