Iterating over mongo documents in Python

Question

Please take a look at the following code:

collection = db_name.get_db().collection_name
print collection
# prints .. Collection(Database(Connection('localhost', 27017), u'db_name'), u'colelction_name')


for key in some_dict.keys():
        query = {"p_id":key}
        document = collection.find(query)
        print document
        # gives <pymongo.cursor.Cursor object at 0x7f13f3049b10>

Now I want to retrieve this document and fetch the data, but if I do:

       for d in document:
            print d

I get the following error:

  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 703, in next
    if len(self.__data) or self._refresh():
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 666, in _refresh
    self.__uuid_subtype))
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 628, in __send_message
    self.__tz_aware)
  File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 101, in _unpack_response
    error_object["$err"])

What am I doing wrong? Thanks.

Answer 1

You want to retrieve "this document", so use the method intended for fetching a single document that matches your criteria: find_one().

http://api.mongodb.org/python/2.2/api/pymongo/collection.html

The basic API documentation is your friend.
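A minimal sketch of how the loop from the question could look with find_one(), assuming `collection` is a pymongo collection like the one in the question. The helper name `fetch_documents_by_p_id` is made up for this illustration.

```python
def fetch_documents_by_p_id(collection, keys):
    """Return {key: document or None}, using one find_one() call per key.

    `collection` is assumed to be a pymongo Collection; this helper is
    hypothetical and only illustrates the call pattern.
    """
    results = {}
    for key in keys:
        # find_one() returns a single document as a dict, or None when
        # nothing matches. It does not return a Cursor, so there is
        # nothing to iterate over.
        results[key] = collection.find_one({"p_id": key})
    return results
```

With the names from the question, this would be called as `fetch_documents_by_p_id(collection, some_dict.keys())`. Note that it still issues one query per key.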

Answer 2

Also, note that pymongo has several constructor options available to a cursor:

https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/cursor.py

Some of these options (await_data and exhaust, for example) can make a big difference to the speed of cursor iteration.

Depending on the length and complexity of your query, you can also process the data in a separate thread, so that you run through the cursor as fast as possible, firing off asynchronous tasks along the way.

I've found that using a single thread to exhaust a mongo cursor, with separate threads processing the data that comes out of it, can increase cursor throughput drastically, regardless of how much data the cursor brings down with it.
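One way to sketch that pattern with only the standard library is shown below. It is an illustration under assumptions, not the answerer's actual code: `drain_and_process`, `handle`, `num_workers` and `max_pending` are hypothetical names, `cursor` can be any iterable of documents, and `handle` is assumed not to raise.

```python
import queue
import threading

_DONE = object()  # sentinel that tells a worker thread to stop


def drain_and_process(cursor, handle, num_workers=4, max_pending=1000):
    """Drain `cursor` in the calling thread while workers run handle(doc).

    Results are collected in completion order, which is not necessarily
    the order in which the cursor produced the documents.
    """
    pending = queue.Queue(maxsize=max_pending)  # bounded, to cap memory use
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            doc = pending.get()
            if doc is _DONE:
                break
            out = handle(doc)  # assumption: handle() does not raise
            with lock:
                results.append(out)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # The only job of this loop is to pull documents off the cursor and
    # hand them over, so iteration is not held up by processing.
    for doc in cursor:
        pending.put(doc)

    # One sentinel per worker, then wait for all of them to finish.
    for _ in workers:
        pending.put(_DONE)
    for w in workers:
        w.join()
    return results
```

Because of the global interpreter lock in CPython, threads like these mainly help when `handle` spends its time waiting on I/O; CPU-heavy processing would likely need processes instead.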

Answer 3

I don't understand why you are trying to get a batch of documents this way when the API and mongo provide queries for exactly this.

If some_dict.keys() returns a list of ids, and you want to retrieve the documents matching those ids, why not use a proper "in" query?

docs = collection.find({'p_id': {'$in':some_dict.keys()}})
print docs
# <pymongo.cursor.Cursor object at 0x10112dfd0>
print [d for d in docs]
# [{DOC1}, {DOC2}, {...}]

As @ich recommended, the pymongo API docs explain all of this, as does the documentation for the mongodb query language.

If this is not what you were looking for, then your intent was not clear from your question. It simply looks like you want to get a batch of documents matching a list of ids.
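A small sketch that wraps the "$in" query in a helper and groups the results by p_id, assuming `collection` is a pymongo collection and `some_dict` is keyed by p_id as in the question. The helper name is hypothetical. The keys are copied into a list because on Python 3 dict.keys() returns a view object rather than a list.

```python
def fetch_documents_for_keys(collection, some_dict):
    """Fetch every document whose p_id is a key of some_dict in one query.

    Returns {p_id: [documents]}; keys with no matching document are
    simply absent from the result.
    """
    # list(some_dict) yields the keys as a plain list on Python 2 and 3.
    query = {"p_id": {"$in": list(some_dict)}}
    found = {}
    for doc in collection.find(query):
        found.setdefault(doc["p_id"], []).append(doc)
    return found
```

Compared with looping over the keys and querying once per key, this sends a single query and iterates a single cursor.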

3 answers

Answer 1: 1, accepted, 2012-05-02 04:10:05

Answer 2: 1, 2012-09-14 21:55:04

Answer 3: 0, 2012-05-02 04:23:34
