
R: How to fill one-column matrices of different dimensions in a loop?

I already asked a similar question, but this time the input data has different dimensions and I cannot get the larger array filled with the smaller matrices. Here is some basic example data showing my structure:

dfList <- list(data.frame(CNTRY = c("B", "C", "D"), Value=c(3,1,4)),
               data.frame(CNTRY = c("A", "B", "E"),Value=c(3,5,15)))
names(dfList) <- c("111.2000", "112.2000")

The input data is a list of more than 1000 data frames, which I turned into a list of matrices with the first column as row names:

dfMATRIX <- lapply(dfList, function(x) {
  m <- as.matrix(x[,-1])
  rownames(m) <- x[,1]
  colnames(m) <- "Value"
  m
})

I then tried to fill this list of matrices into an array, as shown in my former question:

library(abind)  # provides afill()
CNTRY <- c("A", "B", "C", "D", "E")
full_dflist <- array(dim=c(length(CNTRY),1,length(dfMATRIX)))
dimnames(full_dflist) <- list(CNTRY, "Value", names(dfMATRIX))

for(i in seq_along(dfMATRIX)){
  afill(full_dflist[, , i], local = TRUE) <- dfMATRIX[[i]]
}

This gives the following error message:

Error in `afill<-.default`(`*tmp*`, local = TRUE, value = c(3, 1, 4)) : 
  does not make sense to have more dims in value than x

Any ideas? As in my former question, I also tried acast, and array() instead of the dfMATRIX <- lapply(...) step. I assume the second dimension of my full_dflist array (sorry for the naming) is wrong, but I don't know how to specify the input. I would appreciate any ideas.

Edit 2: Sorry, I posted the wrong output. Here is my new expected output:

$`111.2000`
  Value
A    NA
B     3
C     1
D     4
E    NA

$`112.2000`
  Value
A     3
B     5
C    NA
D    NA
E    15

This could be one solution using data.table:

library(data.table)
#create a big data.table with all the elements
biglist <- rbindlist(dfList)
#use lapply to operate on individual dfs
lapply(dfList, function(x) {
  #use the big data table to merge to each one of the element dfs
  temp <- merge(biglist[, list(CNTRY)], x, by='CNTRY', all.x=TRUE)
  #remove the duplicate values
  temp <- temp[!duplicated(temp), ] 
  #convert CNTRY to character and set the order on it
  temp[, CNTRY := as.character(CNTRY)]
  setorder(temp, 'CNTRY')
  temp
  })

Output:

$`111.2000`
   CNTRY Value
1:     A    NA
2:     B     3
3:     C     1
4:     D     4
5:     E    NA

$`112.2000`
   CNTRY Value
1:     A     3
2:     B     5
3:     C    NA
4:     D    NA
5:     E    15
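
The merge-then-deduplicate step can also be sketched without data.table by merging each data frame against a table of unique country codes, so no duplicate rows appear in the first place. This is only a base R variant of the same idea, not the answer's original code; the names all_cntry and filled are made up for illustration.

```r
# Example data from the question
dfList <- list(
  data.frame(CNTRY = c("B", "C", "D"), Value = c(3, 1, 4),
             stringsAsFactors = FALSE),
  data.frame(CNTRY = c("A", "B", "E"), Value = c(3, 5, 15),
             stringsAsFactors = FALSE)
)
names(dfList) <- c("111.2000", "112.2000")

# One row per country that occurs anywhere in the list (hypothetical helper name)
all_cntry <- data.frame(
  CNTRY = sort(unique(unlist(lapply(dfList, function(x) x$CNTRY)))),
  stringsAsFactors = FALSE
)

# Left join: keep every country, NA where a data frame has no value for it.
# merge() sorts the result by the key column by default.
filled <- lapply(dfList, function(x) {
  merge(all_cntry, x, by = "CNTRY", all.x = TRUE)
})

filled
```

Because the lookup table is already unique, the duplicated() filter is not needed in this variant.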

EDIT

For your updated output you could do:

lapply(dfList, function(x) {
  temp <- merge(biglist[, list(CNTRY)], x, by='CNTRY', all.x=TRUE)
  temp <- temp[!duplicated(temp), ] 
  temp[, CNTRY := as.character(CNTRY)]
  setorder(temp, 'CNTRY')
  data.frame(Value=temp$Value, row.names=temp$CNTRY)
  })

$`111.2000`
  Value
A    NA
B     3
C     1
D     4
E    NA

$`112.2000`
  Value
A     3
B     5
C    NA
D    NA
E    15

However, I would suggest keeping the list elements as data.tables rather than converting them to data.frames just to get row names.
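
If the three-dimensional array from the question is still wanted, one possible sketch is to skip afill() and assign into the array by dimension names in plain base R. This is a different technique from the one the question attempted, and the name full_arr is chosen here only for illustration. It assumes each country appears at most once per data frame.

```r
# Example data from the question
dfList <- list(
  data.frame(CNTRY = c("B", "C", "D"), Value = c(3, 1, 4),
             stringsAsFactors = FALSE),
  data.frame(CNTRY = c("A", "B", "E"), Value = c(3, 5, 15),
             stringsAsFactors = FALSE)
)
names(dfList) <- c("111.2000", "112.2000")

# All countries occurring anywhere in the list, sorted
CNTRY <- sort(unique(unlist(lapply(dfList, function(x) x$CNTRY))))

# Array pre-filled with NA: countries x "Value" x list element
full_arr <- array(
  NA_real_,
  dim = c(length(CNTRY), 1, length(dfList)),
  dimnames = list(CNTRY, "Value", names(dfList))
)

# Character indexing places each value in the row named after its country;
# rows for countries missing from a data frame stay NA.
for (i in seq_along(dfList)) {
  x <- dfList[[i]]
  full_arr[x$CNTRY, "Value", i] <- x$Value
}

full_arr[, , "111.2000", drop = FALSE]
```

The drop = FALSE in the last line keeps the slice as a 5 x 1 x 1 array. Without it, full_dflist[, , i] collapses to a plain vector, which may be why afill() complained about the value having more dimensions than the target.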
