Match regex in Python and return key
I have a nested dictionary and I am having trouble matching a regular expression against its values. I need to iterate through the values in the dictionary and return the key whose value the regex matched.
My nested dictionary looks like this:
user_info = {
    'user1': {'name': 'Aby',
              'surname': 'Clark',
              'description': 'Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark'},
    'user2': {'name': 'Marta',
              'surname': 'Bishop',
              'description': 'Nice to meet you text me'},
    'user3': {'name': 'Janice',
              'surname': 'Valinise',
              'description': 'You can contact me by phone +1 457 555667'},
    'user4': {'name': 'Helen',
              'surname': 'Bush',
              'description': 'You can contact me by phone +1 778 65422'},
    'user5': {'name': 'Janice',
              'surname': 'Valinise',
              'description': 'You can contact me by phone +1 457 5342327 or email janval@yahoo.com'},
}
So I need to iterate through the values of the dictionary with a regex, find a match, and return the key where the match happened.
The first problem I faced was extracting the values from the nested dictionary, but I solved that with:
for key in user_info.keys():
    for values in user_info[key].values():
        print(values)
This gives me the values from the nested dictionary. Is there a way to run a regex over these values so that it finds a match and returns the key where the match happened?
I tried the following:
for key in user_info.keys():
    for values in user_info.[key].values():
        #this regex match the email
        email = re.compile(r'(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)'.format(pattern), re.IGNORECASE|re.MULTILINE)
        match = re.match(email)
        if match is not None:
            print ("No values.")
        if found:
            return match
Am I doing something wrong? I have been wrestling with this question for a week. Could you please tell me what is wrong and give me some tips on how to solve it? Thank you!
PS: And yes, I did not find a similar issue on Stack Overflow or Google. I've tried.
You can try using re.search instead of re.match, like this:
import re

for key in user_info.keys():
    for values in user_info[key].values():
        # re.search scans the whole string, so an email anywhere in the value is found
        email = re.search(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+', values)
        if email is not None:
            print(key)
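This code prints every key whose inner values contain a match.
Notice that the code you tried never uses values at all: re.match(email) is called with the compiled pattern but without a string to match against. In addition, re.match only matches at the beginning of the string, and the ^ and $ anchors in your pattern require a whole line to be an email address, so an address embedded in a longer description would not be found. The check is also inverted: "No values." is printed when match is not None, that is, when a match was found.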
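To make the difference between the two functions concrete, here is a minimal sketch. The name keys_with_email and the sample dictionary are hypothetical and only for illustration, and the pattern is the simplified one from the answer above, not a full email validator. It returns the matching keys as a list instead of printing them, which seems closer to what the question asks for:

```python
import re

# Simplified email pattern taken from the answer above; not a full validator.
EMAIL = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+')

text = 'You can contact me by phone +1 457 5342327 or email janval@yahoo.com'

# match() only tries at position 0, so it misses an email later in the string.
print(EMAIL.match(text))           # None
# search() scans the whole string for the first match.
print(EMAIL.search(text).group())  # janval@yahoo.com


def keys_with_email(users):
    """Return the top-level keys whose inner string values contain an email.

    Hypothetical helper, not part of the original answer.
    """
    return [
        key
        for key, info in users.items()
        if any(isinstance(v, str) and EMAIL.search(v) for v in info.values())
    ]


# Small sample in the same shape as user_info from the question.
sample = {
    'user2': {'name': 'Marta', 'description': 'Nice to meet you text me'},
    'user5': {'name': 'Janice', 'description': 'email janval@yahoo.com'},
}
print(keys_with_email(sample))  # ['user5']
```

Compiling the pattern once outside the loop avoids rebuilding it for every value, which the code in the question did on each iteration.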
It looks like you want to extract the emails from the JSON values while also returning the key that matched. Here are two solutions. The first one is similar to yours, and the second one is generalized to any JSON with arbitrary levels of nesting.
# Solution 1: a dictionary with exactly two levels
import re

user_info = {
    "user1": {
        "name": "Aby",
        "surname": "Clark",
        "description": "Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark"
    },
    "user2": {
        "name": "Marta",
        "surname": "Bishop",
        "description": "Nice to meet you text me"
    },
    "user3": {
        "name": "Janice",
        "surname": "Valinise",
        "description": "You can contact me by phone +1 457 555667"
    },
    "user4": {
        "name": "Helen",
        "surname": "Bush",
        "description": "You can contact me by phone +1 778 65422"
    },
    "user5": {
        "name": "Janice",
        "surname": "Valinise",
        "description": "You can contact me by phone +1 457 5342327 or email janval@yahoo.com",
    }
}

matches = []
for user, info in user_info.items():
    for key, value in info.items():
        # Raw string so that \. reaches the regex engine unchanged
        emails = re.findall(r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", value)
        if emails:
            # Record the path to the value as "user.key" together with the emails found
            matches.append((f'{user}.{key}', emails))

print(matches)
# -> [('user5.description', ['janval@yahoo.com'])]
# Solution 2: arbitrary nesting of dictionaries and lists
import re

user_info = {
    "user1": {
        "name": "Aby",
        "surname": "Clark",
        "description": "Hi contact me by phone +1 548 5455 55 or janval@yahoo.com",
        "friends": [
            {
                "name": "Aby",
                "surname": "Clark",
                "description": "Hi contact me by phone +1 548 5455 55 or janval@yahoo.com",
            }
        ]
    }
}

def traverse(obj, keys=()):
    # Strings are the leaves: search them and report the dotted path on a match
    if isinstance(obj, str):
        emails = re.findall(r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", obj)
        return [('.'.join(keys), emails)] if emails else []
    # Dictionaries: recurse into each value, extending the path with its key
    if isinstance(obj, dict):
        return [match for key, value in obj.items() for match in traverse(value, [*keys, key])]
    # Lists: recurse into each item, extending the path with its index
    if isinstance(obj, list):
        return [match for i, value in enumerate(obj) for match in traverse(value, [*keys, str(i)])]
    # Any other type (numbers, None, ...) cannot contain an email
    return []

print(traverse(user_info, []))
# -> [('user1.description', ['janval@yahoo.com']), ('user1.friends.0.description', ['janval@yahoo.com'])]
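One thing to keep in mind with the dotted paths above is that they become ambiguous if a key itself contains a dot. As a rough sketch of one way around that (iter_matches, EMAIL, and the sample data are hypothetical names and values, not part of the answer), the path can be kept as a tuple and the results yielded lazily from a generator:

```python
import re

# Same simplified email pattern as above, compiled once; not a full validator.
EMAIL = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+')


def iter_matches(obj, pattern=EMAIL, path=()):
    """Yield (path, matches) for every string in a nested dict/list structure.

    The path is a tuple of dict keys and list indexes, so keys that contain
    dots stay unambiguous. Hypothetical helper for illustration only.
    """
    if isinstance(obj, str):
        found = pattern.findall(obj)
        if found:
            yield path, found
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_matches(value, pattern, path + (key,))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from iter_matches(value, pattern, path + (index,))
    # Other types (numbers, None, ...) are skipped silently.


# Made-up sample whose top-level key contains a dot.
data = {'user.1': {'contacts': ['none', 'mail a.b@example.org now']}}
for path, emails in iter_matches(data):
    print(path, emails)
# ('user.1', 'contacts', 1) ['a.b@example.org']
```

The first element of each path is the top-level key, so path[0] gives the user key the question asks for.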