简体   繁体   English

rmongodb:如何通过在投影查询中指定来强制执行字段排序?

[英]rmongodb: how to enforce field ordering by specifying in the projection query?

I have some data I am trying to retrieve from mongodb into R using rmongodb package. 我有一些数据正在尝试使用rmongodb包从mongodb检索到R中。 At some point in time, the ordering of the fields in the stored documents changed. 在某个时间点,存储文档中字段的顺序发生了变化。

I am trying to force my projection query to keep the order of the fields projected fixed by explicitly specifying approach attempted in: SO question as follows: 我试图通过明确指定尝试的方法来强制投影查询保持投影字段的顺序固定: SO问题如下:

data <- mongo.find.all(mongo_conn, table,
                          fields = list('id1' = 1, 'id2' = 2,
                                        'time' = 3, 'latitude' = 4,
                                        'longitude' = 5, '_id' = 0))

I can't seem to find a good answer to this. 我似乎找不到一个好的答案。 It returns the fields in the order they are in the DB, which changed, as a list of course. 它以字段在数据库中的更改顺序返回它们的顺序,作为列表。

That means, it wreaks obvious havoc in what kind of loopy code I have to write to organize the returned results into a data frame like structure. 这意味着,在我必须编写什么样的循环代码以将返回的结果组织成数据帧之类的结构时,它造成了严重的破坏。

Any idea how I can get the fields in specified order and not what is in DB? 知道如何以指定的顺序获取字段,而不是数据库中的内容吗?

In the answer you link to it says 在答案中,您链接到它说

Simple answer is that you can't do this. 简单的答案是您不能这样做。

Also see the related mongodb ticket 另请参阅相关的mongodb票

However, to get your results in a data.frame -like structure, use mongolite , it's much easier to work with 但是, data.frame结果成为data.frame的结构,请使用mongolite ,使用mongolite要容易得多

Consider this example using the mtcars data 考虑使用mtcars数据的此示例

data("mtcars")

library(mongolite)  

mongo <- mongo(db = "test",
               collection = "mtcars",
               url = "mongodb://localhost")

## insert into database
mongo$insert(mtcars)
# Complete! Processed total of 32 rows.
# [1] TRUE

mongolite::find will automatically simplify the query results into a data.frame structure mongolite::find会自动将查询结果简化为data.frame结构

df_results <- mongo$find()
# Imported 32 records. Simplifying into dataframe...

head(df_results)
#                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Or, using the aggregation framework 或者,使用聚合框架

mongo$aggregate(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

# Imported 5 records. Simplifying into dataframe...
#     mpg    wt
# 1 21.0 2.620
# 2 21.0 2.875
# 3 22.8 2.320
# 4 21.4 3.215
# 5 18.7 3.440

And now for a bit of shameless self-promotion. 现在,进行一些无耻的自我推广。 I've been working on an extension to mongolite that returns a data.table object. 我一直在研究Mongolite的扩展,该扩展返回一个data.table对象。 The idea here is to increase the speed of returned objects, but only if the returned result set can be coerced using rbindlist . 这里的想法是提高返回对象的速度,但rbindlist是可以使用rbindlist强制返回的结果集。

The package is mongolitedt , and still in development. 该软件包为mongolitedt ,并且仍在开发中。

# devtools::install_github("SymbolixAU/mongolitedt")
library(mongolitedt)

bind_mongolitedt(mongo)

mongo$aggregatedt(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

## now have a data.table object returned
#  Imported 5 records.
#     mpg    wt
# 1: 21.0 2.620
# 2: 21.0 2.875
# 3: 22.8 2.320
# 4: 21.4 3.215
# 5: 18.7 3.440

## clean up
rm(mongo); gc()

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM