Creating a new data table for each row of an existing data table in R while avoiding a memory vector issue

Suppose I have two data tables:

library(data.table)
A=data.table(w=1:3,d=5:7)
B=data.table(K=2:4,m=9:11)


> A
   w d
1: 1 5
2: 2 6
3: 3 7
> B
   K  m
1: 2  9
2: 3 10
3: 4 11

I want to do the following expansion, where I have a new B for each row of A:

C=A[,B[],by=names(A)]

   w d K  m
1: 1 5 2  9
2: 1 5 3 10
3: 1 5 4 11
4: 2 6 2  9
5: 2 6 3 10
6: 2 6 4 11
7: 3 7 2  9
8: 3 7 3 10
9: 3 7 4 11

However, when I do it with my real data, I get this error:

Error in `[.data.table`(A, , B[], by = names(A)) : 
  negative length vectors are not allowed

It turns out this is a memory error. However, I think there should be a way to do this without loops. Memory is not an issue on my server, which has up to 50 GB of RAM, and the resulting data table would certainly be smaller than that.

Does anyone know an efficient way to do this?
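Before trying alternatives, it may help to check how large the expanded table would be. The sketch below is only a diagnostic under assumptions: it uses the toy tables from the question, the names n_rows, exceeds_limit and approx_gb are made up for illustration, and the size estimate assumes every column is a 4-byte integer. In R, "negative length vectors are not allowed" often appears when a requested length overflows the 32-bit integer range, so comparing the row count against .Machine$integer.max is a reasonable first test, though it may not be the cause in every case.

```r
library(data.table)

A <- data.table(w = 1:3, d = 5:7)
B <- data.table(K = 2:4, m = 9:11)

# Compute the row count as a double so the product itself cannot overflow
n_rows <- as.numeric(nrow(A)) * nrow(B)

# TRUE would mean the result has more rows than a 32-bit integer can index
exceeds_limit <- n_rows > .Machine$integer.max

# Rough size estimate, assuming all columns are 4-byte integers
approx_gb <- n_rows * (ncol(A) + ncol(B)) * 4 / 1024^3

print(n_rows)
print(exceeds_limit)
print(approx_gb)
```

If exceeds_limit is TRUE for the real data, the full expansion probably cannot be held in a single data.table regardless of available RAM, and it would need to be processed in chunks.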

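A hacky way to handle this is to add an identical helper column to each table and then allow a cartesian join on it:

library(data.table)
A = data.table(w = 1:3, d = 5:7)
B = data.table(K = 2:4, m = 9:11)

# Constant helper column so that every row of A matches every row of B
A[, j := 1]
B[, j := 1]

# Cartesian join on the helper column, then drop the helper from the result
C = A[B, on = 'j', allow.cartesian = TRUE]
C[, j := NULL]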
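Another option, offered as a sketch rather than a guaranteed fix, is to build the expansion by repeating row indices instead of joining. This avoids modifying A and B and gives the rows in the same order as the output shown in the question. The index names idx_A and idx_B are chosen here for illustration. It is still subject to the same overall row limit, so it will not help if the result has more rows than a data.table can hold.

```r
library(data.table)

A <- data.table(w = 1:3, d = 5:7)
B <- data.table(K = 2:4, m = 9:11)

# Repeat each row of A once per row of B: 1,1,1,2,2,2,3,3,3
idx_A <- rep(seq_len(nrow(A)), each = nrow(B))

# Cycle through the rows of B once per row of A: 1,2,3,1,2,3,1,2,3
idx_B <- rep(seq_len(nrow(B)), times = nrow(A))

# Bind the two expanded tables side by side
C <- cbind(A[idx_A], B[idx_B])
print(C)
```

Because the rows are selected by plain integer indexing, no grouping or join machinery is involved, which tends to keep the overhead low.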

