简体   繁体   中英

rmongodb: how to enforce field ordering by specifying in the projection query?

I have some data I am trying to retrieve from mongodb into R using rmongodb package. At some point in time, the ordering of the fields in the stored documents changed.

I am trying to force my projection query to keep the order of the fields projected fixed by explicitly specifying approach attempted in: SO question as follows:

data <- mongo.find.all(mongo_conn, table,
                          fields = list('id1' = 1, 'id2' = 2,
                                        'time' = 3, 'latitude' = 4,
                                        'longitude' = 5, '_id' = 0))

I can't seem to find a good answer to this. It returns the fields in the order they are in the DB, which changed, as a list of course.

That means, it wreaks obvious havoc in what kind of loopy code I have to write to organize the returned results into a data frame like structure.

Any idea how I can get the fields in specified order and not what is in DB?

In the answer you link to it says

Simple answer is that you can't do this.

Also see the related mongodb ticket

However, to get your results in a data.frame -like structure, use mongolite , it's much easier to work with

Consider this example using the mtcars data

data("mtcars")

library(mongolite)  

mongo <- mongo(db = "test",
               collection = "mtcars",
               url = "mongodb://localhost")

## insert into database
mongo$insert(mtcars)
# Complete! Processed total of 32 rows.
# [1] TRUE

mongolite::find will automatically simplify the query results into a data.frame structure

df_results <- mongo$find()
# Imported 32 records. Simplifying into dataframe...

head(df_results)
#                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Or, using the aggregation framework

mongo$aggregate(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

# Imported 5 records. Simplifying into dataframe...
#     mpg    wt
# 1 21.0 2.620
# 2 21.0 2.875
# 3 22.8 2.320
# 4 21.4 3.215
# 5 18.7 3.440

And now for a bit of shameless self-promotion. I've been working on an extension to mongolite that returns a data.table object. The idea here is to increase the speed of returned objects, but only if the returned result set can be coerced using rbindlist .

The package is mongolitedt , and still in development.

# devtools::install_github("SymbolixAU/mongolitedt")
library(mongolitedt)

bind_mongolitedt(mongo)

mongo$aggregatedt(pipeline = '[{ "$project" : { "mpg" : 1, "wt" : 1, "_id" : 0}  },
                             { "$limit" : 5 }]')

## now have a data.table object returned
#  Imported 5 records.
#     mpg    wt
# 1: 21.0 2.620
# 2: 21.0 2.875
# 3: 22.8 2.320
# 4: 21.4 3.215
# 5: 18.7 3.440

## clean up
rm(mongo); gc()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM