
Iterating over a mongo document in Python

Please take a look at the following code:

collection = db_name.get_db().collection_name
print collection
# prints: Collection(Database(Connection('localhost', 27017), u'db_name'), u'collection_name')


for key in some_dict.keys():
        query = {"p_id":key}
        document = collection.find(query)
        print document
        # gives <pymongo.cursor.Cursor object at 0x7f13f3049b10>

Now I want to retrieve this document and fetch the data, but if I do:

        for d in document:
            print d

I get the following error:

  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 703, in next
    if len(self.__data) or self._refresh():
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 666, in _refresh
    self.__uuid_subtype))
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 628, in __send_message
    self.__tz_aware)
  File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 101, in _unpack_response
    error_object["$err"])

What am I doing wrong? Thanks

You want to retrieve "this document", so use the official method for fetching one document matching your criteria, find_one():

http://api.mongodb.org/python/2.2/api/pymongo/collection.html
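As a rough sketch of what that could look like for the loop in the question: find_one() returns a single document (a dict) or None rather than a Cursor, so there is nothing left to iterate. The helper names fetch_by_p_id and fetch_all are made up for illustration, and `collection` is assumed to be a pymongo Collection like the one in the question.

```python
def fetch_by_p_id(collection, p_id):
    """Return the first document whose p_id matches, or None if there is none."""
    # find_one() hands back a dict (or None), not a Cursor.
    return collection.find_one({"p_id": p_id})


def fetch_all(collection, keys):
    """Map each key to its matching document, skipping keys with no match."""
    found = {}
    for key in keys:
        doc = fetch_by_p_id(collection, key)
        if doc is not None:
            found[key] = doc
    return found
```

Something like fetch_all(collection, some_dict.keys()) would then give one document per key, at the cost of one round trip to the server per key.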

Reading basic API documentation is your friend.

Also, note that pymongo has several constructor options available to a cursor:

https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/cursor.py

Some of these options (await_data and exhaust, for example) will make a big difference in terms of speeding up cursor iteration.

Depending on the length and complexity of your query, you can also process data in a separate thread, so that you run through the cursor as fast as possible, firing off asynchronous tasks along the way.

I've found that running a single thread for exhausting a mongo cursor, with separate threads for processing data from it, can increase cursor throughput drastically, regardless of the amount of data the cursor brings down with it.
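One way this pattern might be sketched with the standard library: the calling thread does nothing but walk the cursor and hand each document to a thread pool. The names drain_cursor and handle are hypothetical, and `cursor` can be any iterable of documents, such as the result of collection.find(). This is an illustration of the idea under those assumptions, not a tuned implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def drain_cursor(cursor, handle, workers=4):
    """Iterate `cursor` in the calling thread, running `handle(doc)` on worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submitting is cheap, so the loop reads the cursor about as fast as it can deliver.
        futures = [pool.submit(handle, doc) for doc in cursor]
    # Leaving the `with` block waits for all tasks; results come back in cursor order.
    return [f.result() for f in futures]
```

Note that pending tasks pile up in memory if handle is much slower than the cursor, so a very large result set would probably need some form of back-pressure. Any exception raised inside handle is re-raised when its result is collected.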

I don't understand why you are trying to get a batch of documents using this approach, when the API and mongo provide queries to suit this.

If some_dict.keys() returns a list of IDs, and you want to retrieve the documents matching those IDs, why not use a proper "$in" query?

docs = collection.find({'p_id': {'$in':some_dict.keys()}})
print docs
# <pymongo.cursor.Cursor object at 0x10112dfd0>
print [d for d in docs]
# [{DOC1}, {DOC2}, {...}]
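If you then need the documents grouped back by key, as the loop in the question suggests, a small helper along these lines could do it in a single query. The name fetch_by_keys is made up for illustration, and it assumes every document has a p_id field; if several documents share a p_id, the last one returned wins.

```python
def fetch_by_keys(collection, keys):
    """Return {p_id: document} for every document whose p_id is in `keys`."""
    # list(...) so that a dict view (Python 3) is sent as a plain list of values.
    cursor = collection.find({"p_id": {"$in": list(keys)}})
    return {doc["p_id"]: doc for doc in cursor}
```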

As @ich recommended, the pymongo API docs explain everything, as does reading up on the MongoDB query language.

If this is not what you were looking for, then your intent was not clear in your question. It simply looks like you want to get a batch of documents matching a list of IDs.
