rmongodb - combine two data frames into one document in a collection

Question

Using R and rmongodb, how do I create a MongoDB document from two data frames, the second of which will be an array element of the first?

Data

My first data.frame always has exactly one row, e.g.

df_1 <- data.frame(myVar1 = 1,
                   myVar2 = 2,
                   myVar3 = 3)

My second data.frame always has one or more rows, e.g.

df_2 <- data.frame(arrVar1 = c(1,2),
                   arrVar2 = c(1,2))

Required Solution

My goal is to have a document in a collection that is structured like:

# {
# "_id" : ObjectId("565a939aa30fff2d67bfd492"),
# "vars" : {
#   "myVar1" : 1.0000000000000000,
#   "myVar2" : 2.0000000000000000,
#   "myVar3" : 3.0000000000000000,
#   "myArr" : [
#        {
#            "arrVar1" : 1,
#            "arrVar2" : 1
#         },
#         {
#            "arrVar1" : 2,
#            "arrVar2" : 2
#         }
#     ]
#   }  
# }

How can I achieve this?


Edit

(removed all my attempts)

Thanks to Dmitriy for the answer and for showing me what structure I needed to build.

As such, I've benchmarked a few different ways of building that structure.

library(microbenchmark)

fun_1 <- function(df){
  list(myArr = unname(split(df, seq(nrow(df)))))  
}

fun_2 <- function(df){
  list('myArr' = Map(function(i, d) d[i, ], 
                     seq_len(nrow(df)), 
                     MoreArgs = list('d' = df)
  ))
}

fun_3 <- function(df){
  list(myArr = (lapply(as.list(1:dim(df)[1]), function(x) df[x[1],])))
}

microbenchmark(fun_1(df_2), fun_2(df_2), fun_3(df_2),  times = 1000)


Unit: microseconds
       expr     min       lq     mean   median       uq      max neval
fun_1(df_2) 162.135 176.7315 197.8129 187.7065 201.0385 1555.802  1000
fun_2(df_2)  84.770  92.2840 102.3595  96.3135 108.8165 1441.410  1000
fun_3(df_2)  85.052  93.8675 103.7496  97.9310 109.4090 1422.860  1000
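Timings aside, it may be worth confirming that the three candidates produce the same shape, namely an unnamed list of one-row pieces under myArr. The following is a minimal base-R sketch of such a check. The helper as_plain is hypothetical (it is not part of the original post), and fun_3 is written here with seq_len() rather than the original 1:dim(df)[1], which should be equivalent for data frames with at least one row.

```r
df_2 <- data.frame(arrVar1 = c(1, 2),
                   arrVar2 = c(1, 2))

fun_1 <- function(df) list(myArr = unname(split(df, seq(nrow(df)))))
fun_2 <- function(df) {
  list(myArr = Map(function(i, d) d[i, ],
                   seq_len(nrow(df)),
                   MoreArgs = list(d = df)))
}
fun_3 <- function(df) list(myArr = lapply(seq_len(nrow(df)), function(i) df[i, ]))

# Hypothetical helper: reduce each one-row data.frame to a plain named list,
# so the comparison ignores data.frame attributes such as row names
as_plain <- function(res) lapply(res$myArr, as.list)

results <- list(fun_1(df_2), fun_2(df_2), fun_3(df_2))

# Compare the second and third candidates against the first
same_as_first <- vapply(results[-1],
                        function(r) identical(as_plain(r), as_plain(results[[1]])),
                        logical(1))
print(same_as_first)
```

If the comparison holds, the choice between the three comes down to speed and readability.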

Answer

There is nothing rmongodb-specific here. As I have written elsewhere: rmongodb converts unnamed lists into arrays and named lists into objects. So you just need to convert your second data.frame into the correct list:

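# Turn each row of df_2 into one element of an unnamed list (becomes an array)
df2_transformed <- list('myArr' = Map(function(i, df) df[i, ],
                                      seq_len(nrow(df_2)),
                                      MoreArgs = list('df' = df_2)))
# Append the array to the fields of df_1; c() returns a plain named list
df1_df2_comb <- c(df_1, df2_transformed)
str(df1_df2_comb)
# mongo, db and coll are assumed to be an open connection and the target names
mongo.insert(mongo, paste0(db, ".", coll), df1_df2_comb)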

You can use Map, lapply, or mapply, depending on your preference.
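Note that the structure requested in the question nests every field under a "vars" key, whereas the combined list above places the fields at the top level. The sketch below shows one way the nested list could be built with lapply, using base R only. It assumes that the rule described above (named list becomes an object, unnamed list becomes an array) applies at every nesting level; the names my_arr and doc are illustrative and do not come from the original post.

```r
df_1 <- data.frame(myVar1 = 1,
                   myVar2 = 2,
                   myVar3 = 3)
df_2 <- data.frame(arrVar1 = c(1, 2),
                   arrVar2 = c(1, 2))

# Unnamed list with one plain named list per row of df_2
# (intended to become an array of objects)
my_arr <- lapply(seq_len(nrow(df_2)), function(i) as.list(df_2[i, ]))

# Named list wrapping the fields of df_1 plus the array under "vars"
# (intended to become a nested object)
doc <- list(vars = c(as.list(df_1), list(myArr = my_arr)))

str(doc)

# With an open connection, doc would then be passed to mongo.insert()
# in place of df1_df2_comb (not run here, since it needs a live server)
```

Converting each row with as.list() drops the data.frame class and row names, so the list holds only the column names and values.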
