MongoDB: tracking DDL changes

I am new to MongoDB. I come from an RDBMS/MPP/ETL background, and most of the data stores I have used keep metadata about their objects (tables, views, etc.). My question is specific to MongoDB: does it have a data dictionary like Oracle's user_tables, or any other metadata about collections, such as the time of the last DDL change? Because MongoDB is schemaless, an application can change the shape of the data it inserts without any schema change, so detecting structural changes before running ETL jobs is important whenever MongoDB is involved. I searched for dictionaries or an API that tracks DDL changes and found nothing. Can anyone point me to links or information on this? If no such option exists, are there best practices to follow for handling this kind of schema evolution?

Thanks, Anoop R

One of the advantages of MongoDB is its schemaless way of storing documents. Unlike an RDBMS with its table dictionaries, in MongoDB the schema lives in the application layer. That gives the application the flexibility to design or change the schema at any time, without waiting on ALTER statements or their dependencies.

That said, MongoDB 3.2 introduced document validation, and later releases enriched it (3.6 added JSON Schema support through the $jsonSchema operator). You can learn more in the MongoDB documentation on document validation. Validation rules are specified per collection using the validator option, which takes a document that specifies the validation rules or expressions.
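As a rough illustration of the validator option, the sketch below builds a validator document with the $jsonSchema operator (which needs MongoDB 3.6 or later) and passes it to PyMongo's create_collection. The "orders" collection and its fields are hypothetical, and the helper names are made up for this example. The function that touches the database is only defined here, not called, so the block runs without a server.

```python
def build_validator(required_fields, field_types):
    """Build a $jsonSchema validator document (MongoDB 3.6+)."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": list(required_fields),
            "properties": {
                name: {"bsonType": bson_type}
                for name, bson_type in field_types.items()
            },
        }
    }


def create_validated_collection(db, name, validator):
    """Create a collection with the validator attached.

    `db` is assumed to be a pymongo Database object.
    """
    # validationAction="warn" would log violations instead of rejecting writes.
    return db.create_collection(
        name, validator=validator, validationAction="error"
    )


# Hypothetical "orders" collection, used purely for illustration.
orders_validator = build_validator(
    required_fields=["order_id", "amount"],
    field_types={"order_id": "string", "amount": "double", "tags": "array"},
)
```

With a live connection, the call would look like create_validated_collection(client["shop"], "orders", orders_validator), where "shop" is again a made-up database name.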

Note that the purpose of schema validation is not to track DDL changes but to establish an agreed-upon definition of the documents, so to speak.
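Since validation does not record changes, one possible way to notice drift before an ETL run is to snapshot the field names and value types seen in a sample of documents and compare that snapshot with the one from the previous run. The sketch below is only an illustration of that idea, not a MongoDB feature: infer_schema and diff_schemas are hypothetical helpers, and plain dicts stand in for documents returned by collection.find().

```python
def infer_schema(docs):
    """Map each top-level field name to the sorted list of type names seen."""
    seen = {}
    for doc in docs:
        for key, value in doc.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


def diff_schemas(old, new):
    """Report fields added, removed, or changed in type between snapshots."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "retyped": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }


# Plain dicts stand in for documents returned by collection.find().
previous = infer_schema([{"name": "a", "qty": 1}, {"name": "b", "qty": 2}])
current = infer_schema([{"name": "a", "qty": "3", "sku": "x-1"}])
changes = diff_schemas(previous, current)
```

Because each snapshot is a dict of lists, it can be stored as JSON between runs, and the ETL job can stop or warn when any of the three lists in the diff is non-empty. Only top-level fields are inspected, and only the sampled documents are covered.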

I found a solution that is not exactly what I was looking for, but I think we can manage with it.

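Default checklist for data types

# Per-key counter template: one slot per Python type we expect to see.
key_type_default_count = {
    int: 0,
    float: 0,
    str: 0,
    bool: 0,
    dict: 0,
    list: 0,
    set: 0,
    tuple: 0,
    type(None): 0,  # type(value) is NoneType for None, so key on that, not on None
    object: 0,
    unicode: 0,  # Python 2 only; drop this entry on Python 3
    "other": 0,
}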

Custom code to get the Mongo connection

# create_mongo_con is a custom helper (not shown) that returns the Mongo connection.
client = create_mongo_con(v_env, v_con_name)
print(client)

db = client[v_db_name]
collection = db[v_collection]

Main code

from collections import defaultdict

# Every new key starts from its own copy of the default counters.
key_type_count = defaultdict(lambda: dict(key_type_default_count))
total_docs = 0

# Sample the first 30 documents, leaving out _id.
mongo_collection_docs = collection.find({}, {"_id": 0}).limit(30)
print(type(mongo_collection_docs))

for doc in mongo_collection_docs:
    for key, value in doc.items():
        # Debug output; repr avoids encoding errors on non-ASCII text.
        print('key: {!r}, value: {!r}, type: {}'.format(key, value, type(value)))
        if type(value) in key_type_count[key]:
            key_type_count[key][type(value)] += 1
        else:
            # Types outside the checklist (for example datetime) land here.
            key_type_count[key]["other"] += 1
    total_docs += 1

You can read more at https://github.com/nimeshkverma/mongo_schema, which is where I got the idea, although that code did not work for me. I edited parts of it and can now generate a nicely formatted report.

However, I now face one issue: all string fields are detected as unicode. I need to figure this out and will post an update if I find a solution. If anybody has faced the same str vs. unicode issue in Python, please comment.
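For what it is worth, PyMongo on Python 2 decodes every BSON string to unicode, so seeing unicode for all string fields is expected behavior and not a fault in the counting. One possible workaround, sketched here with a hypothetical type_label helper, is to report every text type under a single "str" label. The try/except only exists so the same snippet runs on both Python 2 and Python 3.

```python
try:
    text_types = (basestring,)  # Python 2: parent of both str and unicode
except NameError:
    text_types = (str,)  # Python 3: str is already Unicode text


def type_label(value):
    """Return a type name, reporting every text type as "str"."""
    if isinstance(value, text_types):
        return "str"
    return type(value).__name__


labels = [type_label(v) for v in (u"abc", "abc", 1, 2.5, None, [1], {"a": 1})]
```

Counting on these labels instead of on type(value) would merge the str and unicode columns of the report into one.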
