
R: given a list of data frames, how to add a new column to all rows of each data frame

I have the following situation: there is a list of data frames.

class(cc.purc$items) => "list"

length(cc.purc$items) => 970

class(cc.purc$items[[1]]) => "data.frame"

head(cc.purc$items, 2)

[[1]]

    barcode quantity price amount grams litres   
1       abc        1  1.00   1.00    NA     NA           
2       xyz        1  1.29   1.29    NA     NA          

[[2]]

   barcode quantity price amount grams litres 
1     abc2        1   5.5    5.5    NA     NA              
2     xyz2       -1  19.5  -19.5    NA     NA         

cc.purc also has a field called "transaction_id", with one entry for each data frame in the "items" list.

head(cc.purc$transaction_id, 2) => "62740" "62741"

I want to combine all rows of all the data frames in the list and add the corresponding transaction_id as an additional column on every row.

For example, I want the following:

  barcode quantity price amount grams litres    tran_id
1     abc        1  1.00   1.00    NA     NA      62740          
2     xyz        1  1.29   1.29    NA     NA      62740       
3    abc2        1   5.5    5.5    NA     NA      62741             
4    xyz2       -1  19.5  -19.5    NA     NA      62741   

How can I achieve this?

To get all rows from all the data frames in the "items" list, I can do the following:

do.call("rbind", cc.purc$items)

What I cannot figure out is how to add the corresponding transaction_id column to the relevant rows.

You can use Map() to loop over the data and the transaction IDs simultaneously.

Example data:

test <- list(data = list(data.frame(var1 = rnorm(4),
                                    var2 = runif(4)),
                         data.frame(var1 = rnorm(4),
                                    var2 = runif(4)),
                         data.frame(var1 = rnorm(4),
                                    var2 = runif(4))),
             tran_id = 1:3)

# add new column to every dataframe

test$data <- Map(function(x, y){
  x$tran_id <- y
  return(x)
}, test$data, test$tran_id)

Result:

> test
$data
$data[[1]]
         var1       var2 tran_id
1  0.99943735 0.57436983       1
2 -0.04483769 0.29832753       1
3  1.89678549 0.81138668       1
4 -0.58839397 0.07071112       1

$data[[2]]
          var1       var2 tran_id
1 -0.018843434 0.84813495       2
2 -0.258920304 0.09818365       2
3 -0.009920782 0.07873543       2
4  0.833070609 0.47808518       2

$data[[3]]
         var1      var2 tran_id
1  1.21224941 0.3587937       3
2 -0.65107256 0.9727788       3
3  1.54107062 0.8444594       3
4 -0.09976177 0.6034762       3


$tran_id
[1] 1 2 3

Bind the data together:

> do.call("rbind", test$data)
           var1       var2 tran_id
1   0.999437345 0.57436983       1
2  -0.044837689 0.29832753       1
3   1.896785487 0.81138668       1
4  -0.588393971 0.07071112       1
5  -0.018843434 0.84813495       2
6  -0.258920304 0.09818365       2
7  -0.009920782 0.07873543       2
8   0.833070609 0.47808518       2
9   1.212249412 0.35879366       3
10 -0.651072562 0.97277883       3
11  1.541070621 0.84445938       3
12 -0.099761769 0.60347619       3
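
Applied to the structure described in the question, the same idea might look like the sketch below. The `cc.purc` object here is a hypothetical two-element stand-in built from the sample rows shown in the question (the real object has 970 entries), and `tagged` and `result` are illustrative names only.

```r
# Hypothetical stand-in for the asker's object; values copied from the question.
cc.purc <- list(
  items = list(
    data.frame(barcode = c("abc", "xyz"), quantity = c(1, 1),
               price = c(1.00, 1.29), amount = c(1.00, 1.29),
               grams = NA, litres = NA),
    data.frame(barcode = c("abc2", "xyz2"), quantity = c(1, -1),
               price = c(5.5, 19.5), amount = c(5.5, -19.5),
               grams = NA, litres = NA)
  ),
  transaction_id = c("62740", "62741")
)

# Pair each data frame with its transaction ID and add the ID as a new column.
tagged <- Map(function(df, id) {
  df$tran_id <- id
  df
}, cc.purc$items, cc.purc$transaction_id)

# Stack the tagged data frames into a single data frame.
result <- do.call("rbind", tagged)

# Reset row names so they run 1..n regardless of how the list was named.
rownames(result) <- NULL

print(result)
```

Because the ID is attached before binding, each row keeps its own transaction ID even when the data frames have different numbers of rows.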
