
How can I query data from MongoDB in real-time?

I created a MongoDB database, and I'm sending data to it. At the same time, I'm running a Python script to fetch data from that database. I would like my script to print each new entry to my console as soon as it's pushed to the DB, but I don't know how to accomplish this.

This is my current attempt, but I don't like it because each iteration prints all the data in the DB, even though I only want the latest entry or entries as soon as they are added:

from pymongo import MongoClient
import time
import random
from pprint import pprint

client = MongoClient(port=27017)

arr = []

db = client.one

mycol = client["coll"]



while True:
    cursor = db.mycol.find()
    for document in cursor:
        print(document['num'])
    time.sleep(2)    

How can I resolve this?

MongoDB has supported a feature called "Change Streams" since version 3.6. In the documentation you will find this simple Python example, among others:

cursor = db.inventory.watch()
document = next(cursor)

If next() is supported on the cursor, you should also be able to use it in loops, generators and even asyncio.
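As a rough sketch of the loop form, the helper below iterates over a change stream and passes each newly inserted document to a callback. The function name `follow_inserts` and its `handle` and `limit` parameters are made up for illustration; `watch()`, the `$match` stage on `operationType`, and the `fullDocument` field come from the change stream API. Note that change streams are only available on a replica set or sharded cluster, not on a standalone server.

```python
def follow_inserts(collection, handle=print, limit=None):
    """Call handle(doc) for every newly inserted document.

    `collection` is expected to be a pymongo Collection (hypothetical helper,
    not part of pymongo). Stops after `limit` documents if a limit is given,
    otherwise blocks and keeps waiting for new inserts.
    """
    # Ask the server to report insert events only.
    pipeline = [{"$match": {"operationType": "insert"}}]
    seen = 0
    # The change stream is a context manager and an iterator of change events.
    with collection.watch(pipeline) as stream:
        for change in stream:
            # For insert events, "fullDocument" holds the inserted document.
            handle(change["fullDocument"])
            seen += 1
            if limit is not None and seen >= limit:
                break
    return seen


# Possible usage against the database from the question (not executed here):
# from pymongo import MongoClient
# coll = MongoClient(port=27017).one.mycol
# follow_inserts(coll, handle=lambda doc: print(doc["num"]))
```

Compared with polling, nothing is re-read here: the script blocks until the server reports a new insert.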

There are a few ways to handle this, but the easiest might be to store an auto-incrementing "primaryKey" (or an insert timestamp, or similar) and only print the results that occur after that key. Here is a quick example to demonstrate:

# we start at one...
highest_previous_primary_key = 1

while True:
    # sort ascending so the comparison below sees the keys in order
    cursor = db.mycol.find().sort('primaryKey', 1)
    for document in cursor:

        # get the current primary key, and if it's greater than the previous one
        # we print the results and increment the variable to that value
        current_primary_key = document['primaryKey']
        if current_primary_key > highest_previous_primary_key:
            print(document['num'])
            highest_previous_primary_key = current_primary_key

    time.sleep(2)

This is perhaps the laziest approach. Beyond this, you may try:

  1. Adjusting the query itself so that it only fetches items with a key greater than the stored primaryKey (imagine having a billion results and fetching all of them on every pass).
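One way this adjustment might look is sketched below: the filter `{"primaryKey": {"$gt": last_key}}` lets the server return only documents newer than the last one seen. The helper names `fetch_new` and `poll`, and the `rounds` parameter used to bound the loop, are hypothetical; the field names `primaryKey` and `num` are taken from the example above.

```python
import time


def fetch_new(collection, last_key, key_field="primaryKey"):
    """Return (new_docs, updated_last_key) for documents with key_field > last_key."""
    # The server does the filtering, so documents already seen are never transferred.
    query = {key_field: {"$gt": last_key}}
    docs = list(collection.find(query).sort(key_field, 1))  # 1 = ascending
    if docs:
        last_key = docs[-1][key_field]
    return docs, last_key


def poll(collection, last_key=0, interval=2.0, rounds=None):
    """Print the 'num' field of new documents; loop forever unless rounds is set."""
    done = 0
    while rounds is None or done < rounds:
        docs, last_key = fetch_new(collection, last_key)
        for doc in docs:
            print(doc["num"])
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
    return last_key


# Possible usage with the collection from the question (not executed here):
# from pymongo import MongoClient
# poll(MongoClient(port=27017).one.mycol)
```

With a large collection, an index on the key field (for example via `collection.create_index("primaryKey")`) would likely be needed to keep each poll cheap.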
