
REGEX Searching in pymongo

I am attempting to create a search in pymongo using a regex. After the match, I want the data to be appended to a list in the module. I thought I had everything set up, but no matter what I set for the regex, it returns 0 results. The code is below:

REGEX = '.*\.com'

def myModule(self, data):
    # After importing everything and setting up the collection in the DB, I call the following:
    cursor = collection.find({'multiple.layers.of.data' : REGEX})
    matches = []
    for x in cursor:
        matches.append(x)
    return matches

This is just one of three modules I am using to filter a huge number of JSON files that have been stored in a MongoDB. However, no matter how I change the formatting, such as declaring /.*.com/ in the operation or using $regex in mongo, it never finds my data and appends it to the list.

EDIT: Adding the full code along with what I am trying to identify:

RegEx = '.*\.com' #Or RegEx = re.compile('.*\.com')

def filterData(self, data):
    db = self.client[self.dbName]
    collection = db[self.collectionName]
    cursor = collection.find({'data.item11.sub.level3': {'$regex': RegEx}})
    data = []
    for x in cursor:
        data.append(x)
    return data

I am attempting to parse through JSON data in a MongoDB. The data is structured like so:

"data": {
    "0": {
        "item1": "something",
        "item2": 0,
        "item3": 000,
        "item4": 000000000,
        "item5": 000000000,
        "item6": "0000",
        "item7": 00,
        "item8": "0000",
        "item9": 00,
        "item10": "useful",
        "item11": {
            "0000": {
                "sub": {
                    "level": "letter",
                    "level1": 0000,
                    "level2": 0000000000,
                    "level3": "domain.com"
                },
                "more_data": "words"
            }
        }
    }
}

UPDATE: After further testing, it appears that I need to include all of the layers in the search. Thus, it should look like

collection.find({'data.0.item11.0000.sub.level3': {'$regex': RegEx}})

However, the "0" can be 1 - 50 and the "0000" is randomly generated. Is there a way to treat these keys as variables so that the query steps into them no matter what the value is? They will always be numeric values.

Well, you need to tell MongoDB that the string should be treated as a regular expression, using the $regex operator:

cursor = collection.find({'multiple.layers.of.data' : {'$regex': REGEX}})

I think simply replacing REGEX = '.*\\.com' with import re; REGEX = re.compile('.*\\.com') might also work, but I'm not sure (it would rely on specific handling in the pymongo driver).
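To make the difference concrete, here is a minimal sketch that builds both filter shapes and shows why the plain-string version finds nothing: without $regex, the value is compared for equality. The helper names `regex_filter` and `compiled_filter` are made up for illustration, and the `collection.find` call is left as a comment because it needs a live database. As far as I know, PyMongo encodes compiled Python patterns as BSON regular expressions, but treat that as something to verify against your driver version.

```python
import re

# Raw string, so the backslash reaches the regex engine unchanged.
PATTERN = r".*\.com"


def regex_filter(field, pattern):
    """Build a filter that asks MongoDB to treat `pattern` as a regex."""
    return {field: {"$regex": pattern}}


def compiled_filter(field, pattern):
    """Alternative form: pass a compiled pattern object as the value."""
    return {field: re.compile(pattern)}


# Hypothetical usage against a live collection (not executed here):
#   cursor = collection.find(regex_filter("multiple.layers.of.data", PATTERN))

# A bare string in a filter is an equality test, which is why the
# original query returned nothing: no stored value equals ".*\.com".
value = "domain.com"
equality_match = value == PATTERN                      # False
regex_match = re.search(PATTERN, value) is not None    # True
```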


EDIT:

Regarding the wildcard part of the question: The answer is no.

In a nutshell, values that are unknown should never be used as keys, because that makes querying very inefficient. There are no 'wild card' queries.
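While the data keeps its current shape, one workaround is to walk the unknown keys on the client instead of in the query. The sketch below assumes every document follows the sample structure from the question (dicts all the way down to sub.level3); `iter_level3` and `matching_docs` are hypothetical helper names. Note that passing a cursor such as collection.find() as `docs` pulls every document over the network, so this only makes sense for modest collections.

```python
import re


def iter_level3(doc):
    """Yield every data.<n>.item11.<id>.sub.level3 string in one document."""
    for item in doc.get("data", {}).values():
        for inner in item.get("item11", {}).values():
            value = inner.get("sub", {}).get("level3")
            if isinstance(value, str):
                yield value


def matching_docs(docs, pattern):
    """Keep the documents in which any level3 value matches the pattern."""
    compiled = re.compile(pattern)
    return [d for d in docs if any(compiled.search(v) for v in iter_level3(d))]


# Two in-memory documents standing in for what a cursor would return.
docs = [
    {"data": {"0": {"item11": {"0000": {"sub": {"level3": "domain.com"}}}}}},
    {"data": {"7": {"item11": {"4821": {"sub": {"level3": "example.org"}}}}}},
]
hits = matching_docs(docs, r".*\.com")
```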

It is better to restructure the database so that values that are unknown are not keys.
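As an illustration of such a restructuring, the sketch below moves the unknown keys into ordinary fields inside a list. The function name `flatten_entries` and the field names `entries`, `index`, and `id` are assumptions chosen for this example, and for brevity it only carries over the item11 entries, not item1 through item10.

```python
def flatten_entries(doc):
    """Turn dict-keyed nesting into a list so unknown values become fields."""
    entries = []
    for index, item in doc.get("data", {}).items():
        for random_id, inner in item.get("item11", {}).items():
            entry = {"index": index, "id": random_id}
            entry.update(inner)  # carries over "sub", "more_data", ...
            entries.append(entry)
    return {"entries": entries}


original = {
    "data": {
        "0": {
            "item10": "useful",
            "item11": {
                "0000": {
                    "sub": {"level": "letter", "level3": "domain.com"},
                    "more_data": "words",
                }
            },
        }
    }
}

restructured = flatten_entries(original)
```

With documents stored in this shape, dot notation reaches into the array elements, so a filter such as {'entries.sub.level3': {'$regex': REGEX}} no longer needs to know the numeric index or the random id.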

See:

MongoDB wildcard in the key of a query

http://groups.google.com/group/mongodb-user/browse_thread/thread/32b00d38d50bd858

https://groups.google.com/forum/#!topic/mongodb-user/TnAQMe-5ZGs
