
Get Mongo field names from collection using pymongo

I am trying to get field names from MongoDB using pymongo. Is there a way to do that?

Mongo Collection Format:

    "_id" : ObjectId("5e7a773721ee63712e9d25a3"),
    "effective_date" : "2020-03-24",
    "data" : [
        {
            "Year" : 2020,
            "month" : 1,
            "Day" : 28,
            "views" : 4994,
            "clicks" : 3982
        },
        {
            "Year" : 2020,
            "month" : 1,
            "Day" : 17,
            "views" : 1987,
            "clicks" : 3561
        },
        .
        .
        .
       ]

Is there a way I can get the field names? I want to get: _id, effective_date, data.Year, data.month, data.Day, data.views, data.clicks

This is what I have:

from datetime import datetime, timedelta, date
import pymongo
from pymongo import MongoClient
from pymongo.read_preferences import ReadPreference
from pprint import pprint
from bson.son import SON
from bson import json_util
from bson.json_util import dumps, loads
import re


client = pymongo.MongoClient(host='mongodb://00.00.00.0:00000')
db = client.collection
pprint(db)

def get_results(filters):

    col=db.results
    res = col.find()

    res = list(res)

    return dumps(res, indent=4)

Is there a way for me to get just the field names using pymongo?

We are not really filtering or aggregating in the example; we are doing a big find() and then we want all the field names. There is no projection either. So assuming that we are dragging over all the data anyway, let the client side do the work. Here's something that will capture unique field names, including those inside arrays, and give you a count of each unique field name as well:

r = [
    {"_id":0, "A":"A", "data":[
            {"Y":2020,"day":3,"clicks":12},
            {"Y":2020,"day":4,"clicks":192}
            ]} ,
    {"_id":1, "B":{"foo":"bar"}, "data":[
            {"Y":2020,"day":3,"clicks":888,"corn":"dog"},
            {"Y":2020,"day":4,"clicks":999,"zing":"zap"}
            ]} ,
    {"_id":2, "B":{"foo":"bit"} },
    {"_id":3, "B":{"fin":"bar"} }
]
# Assumes coll is a pymongo collection, e.g. coll = db.results
coll.insert_many(r)

fieldNames = {}

def addFldName(s):
    # Count every occurrence of a dotted field path
    fieldNames[s] = fieldNames.get(s, 0) + 1

def process(path, v):
    addFldName(path)
    if isinstance(v, dict):
        walkMap(path, v)
    elif isinstance(v, list):
        walkList(path, v)

def walkMap(path, doc):
    dot = "" if path == "" else "."
    for k, v in doc.items():
        process(path + dot + k, v)

def walkList(path, array):
    # Array elements are addressed by position: data.0, data.1, ...
    dot = "" if path == "" else "."
    for n, item in enumerate(array):
        process(path + dot + str(n), item)

for doc in coll.find():
    walkMap("", doc)

print(fieldNames)

{'_id': 4, 'A': 1, 'data': 2, 'data.0': 2, 'data.0.Y': 2, 'data.0.day': 2, 'data.0.clicks': 2, 'data.1': 2, 'data.1.Y': 2, 'data.1.day': 2, 'data.1.clicks': 2, 'B': 3, 'B.foo': 2, 'data.0.corn': 1, 'data.1.zing': 1, 'B.fin': 1}

It's a little weird, but yes, data.0.clicks is unique and shows up in 2 docs.
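If the positional names are not wanted and you would rather see data.clicks, as the question asks, one option is to leave the array index out of the path. The following is only a sketch of that variation, not part of the original answer: the function names field_paths and count_fields are made up for illustration, and it counts each path once per document rather than once per occurrence.

```python
from collections import Counter

def field_paths(value, prefix=""):
    """Yield dotted field paths, treating array elements as part of the parent path."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from field_paths(child, path)
    elif isinstance(value, list):
        for item in value:
            # No index is appended, so data[0].clicks and data[1].clicks
            # both become "data.clicks".
            yield from field_paths(item, prefix)

def count_fields(docs):
    """Count how many documents contain each field path at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(field_paths(doc)))
    return counts

# Same sample documents as above, kept in memory so no server is needed.
docs = [
    {"_id": 0, "A": "A", "data": [
        {"Y": 2020, "day": 3, "clicks": 12},
        {"Y": 2020, "day": 4, "clicks": 192}]},
    {"_id": 1, "B": {"foo": "bar"}, "data": [
        {"Y": 2020, "day": 3, "clicks": 888, "corn": "dog"},
        {"Y": 2020, "day": 4, "clicks": 999, "zing": "zap"}]},
    {"_id": 2, "B": {"foo": "bit"}},
    {"_id": 3, "B": {"fin": "bar"}},
]

counts = count_fields(docs)
print(sorted(counts))
```

Against a real collection, the same function should accept the cursor directly, e.g. count_fields(coll.find()), since it only iterates over documents.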
