Match regex in Python and return key

I have a nested dictionary and I am having trouble matching a regular expression against its values. I need to iterate through the values in the dictionary and return the key whose value the regex matched.

I have a nested dictionary like this:

    user_info = {
        'user1': {'name': 'Aby',
                  'surname': 'Clark',
                  'description': 'Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark'},
        'user2': {'name': 'Marta',
                  'surname': 'Bishop',
                  'description': 'Nice to meet you text me'},
        'user3': {'name': 'Janice',
                  'surname': 'Valinise',
                  'description': 'You can contact me by phone +1 457 555667'},
        'user4': {'name': 'Helen',
                  'surname': 'Bush',
                  'description': 'You can contact me by phone +1 778 65422'},
        'user5': {'name': 'Janice',
                  'surname': 'Valinise',
                  'description': 'You can contact me by phone +1 457 5342327 or email janval@yahoo.com'}}

So I need to iterate over the values of the dictionary with a regex, find a match, and return the key where the match happened.

The first problem I faced was extracting the values from the nested dictionary, but I solved it with:

    for key in user_info.keys():
        for values in user_info[key].values():
            print(values)

This gets the values back from the nested dictionary. So is there a way to iterate through these values with a regex so that it finds a match and returns the key where the match happened?

I tried the following:

 for key in user_info.keys():
     for values in user_info.[key].values():

         #this regex match the email
         email = re.compile(r'(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)'.format(pattern), re.IGNORECASE|re.MULTILINE) 
         match = re.match(email)

         if match is not None:
             print ("No values.")

      if found: 
         return match

Am I doing something wrong? I have been wrestling with this question for a week... Could you please tell me what's wrong and give me a tip on how to solve this #!4fd... please. Thank you!

PS: And yep, I didn't find a similar issue on Stack Overflow or Google. I've tried.

You can try using re.search instead of re.match, as follows:

import re

for key in user_info.keys():
    for values in user_info[key].values():
        # re.search scans the whole string; re.match only checks its start
        email = re.search(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+', values)
        if email is not None:
            print(key)

This code prints every key whose inner values contain a match.

Notice that the code you tried never uses values at all: the regex is compiled but never applied to the string.
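To make the difference between the two functions concrete, here is a minimal sketch. The EMAIL pattern is a simplified one chosen for illustration (it is an assumption, not a complete email validator), and the text is the user5 description from the question.

```python
import re

# Simplified email pattern, for illustration only (not a full validator).
EMAIL = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+')

text = 'You can contact me by phone +1 457 5342327 or email janval@yahoo.com'

# match() only tries to match at the very beginning of the string,
# so it fails here because the text does not start with an email.
at_start = EMAIL.match(text)

# search() scans the string and returns the first match found anywhere.
anywhere = EMAIL.search(text)

print(at_start)           # None
print(anywhere.group(0))  # janval@yahoo.com
```

The ^ and $ anchors in the original attempt have a similar effect: they require the whole value (or a whole line, with re.MULTILINE) to be an email, which a free-text description never is.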

Looks like you want to extract the emails from the JSON values while also returning the matched key. Here are 2 solutions. The first one is similar to yours, and the second one generalizes to any JSON with arbitrary nesting levels.

  1. Two for loops
import re

user_info = {
  "user1": {
    "name": "Aby",
    "surname": "Clark",
    "description": "Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark"
  },
  "user2": {
    "name": "Marta",
    "surname": "Bishop",
    "description": "Nice to meet you text me"
  },
  "user3": {
    "name": "Janice",
    "surname": "Valinise",
    "description": "You can contact me by phone +1 457 555667"
  },
  "user4": {
    "name": "Helen",
    "surname": "Bush",
    "description": "You can contact me by phone +1 778 65422"
  },
  "user5": {
    "name": "Janice",
    "surname": "Valinise",
    "description": "You can contact me by phone +1 457 5342327 or email janval@yahoo.com",
  }
}

matches = []
for user, info in user_info.items():
    for key, value in info.items():
        # raw string so that \. reaches the regex engine unchanged
        emails = re.findall(r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", value)
        if emails:
            matches.append((f'{user}.{key}', emails))

print(matches)
# -> [('user5.description', ['janval@yahoo.com'])]

  2. The recursive approach for arbitrary JSON
import re

user_info = {
  "user1": {
    "name": "Aby",
    "surname": "Clark",
    "description": "Hi contact me by phone +1 548 5455 55 or janval@yahoo.com",
    "friends": [
      {
        "name": "Aby",
        "surname": "Clark",
        "description": "Hi contact me by phone +1 548 5455 55 or janval@yahoo.com",
      }
    ]
  }
}

def traverse(obj, keys=()):
  # keys is the path of dict keys / list indices leading to obj;
  # a tuple default avoids the mutable-default-argument pitfall
  if isinstance(obj, str):
    emails = re.findall(r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", obj)
    return [('.'.join(keys), emails)] if emails else []
  if isinstance(obj, dict):
    return [match for key, value in obj.items() for match in traverse(value, [*keys, key])]
  if isinstance(obj, list):
    return [match for i, value in enumerate(obj) for match in traverse(value, [*keys, str(i)])]
  return []

print(traverse(user_info))
# -> [('user1.description', ['janval@yahoo.com']), ('user1.friends.0.description', ['janval@yahoo.com'])]
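
If only the top-level keys are needed, as the question asks, the two-loop idea can be wrapped in a small function. This is a sketch under assumptions: keys_matching and sample are hypothetical names introduced here, sample is a trimmed copy of the question's data, and EMAIL is a simplified pattern rather than a full validator.

```python
import re

# Simplified email pattern, compiled once and reused (illustrative only).
EMAIL = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+')

def keys_matching(data, pattern):
    """Return the outer keys whose inner dict has a string value matching pattern."""
    return [
        outer_key
        for outer_key, inner in data.items()
        if any(isinstance(v, str) and pattern.search(v) for v in inner.values())
    ]

# Trimmed copy of the question's dictionary (hypothetical sample).
sample = {
    'user2': {'name': 'Marta',
              'surname': 'Bishop',
              'description': 'Nice to meet you text me'},
    'user5': {'name': 'Janice',
              'surname': 'Valinise',
              'description': 'You can contact me by phone +1 457 5342327 or email janval@yahoo.com'},
}

result = keys_matching(sample, EMAIL)
print(result)  # ['user5']
```

Passing the compiled pattern as an argument means the same function could be reused with a phone-number regex, for example.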
