
Get particular information from a string

I want to get the value of Name from fstr using RegEx in Python. I tried the code below, but it does not give the intended result.

Any help will be highly appreciated.

import re

fstr = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"  # the ",Extra=whatever" part is optional
myobj = re.search(r'(.*?),Name(.*?),*(.*)', fstr, re.M|re.I)
print(myobj.group(2))

You may not believe it, but the actual problem is the ,* in your regular expression. It makes the comma optional. The second capturing group in your regex matches nothing (.*? means match zero or more characters lazily, so it takes as few as possible). The regex then checks the next item, ,*, which means match a comma zero or more times. It matches zero times, and the last capturing group matches the rest of the string.

If you want to fix your RegEx, you can simply remove the * after the comma, like this:
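To see this concretely, here is a minimal sketch that prints every capture group of the original pattern. The names `pattern` and `groups` are only for illustration.

```python
import re

fstr = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"

# The original pattern: the lazy group 2 is followed by an optional ",*"
pattern = r'(.*?),Name(.*?),*(.*)'
groups = re.search(pattern, fstr, re.I).groups()

print(groups)
# ('MCode=1,FCode=1', '', '=XYZ,Extra=whatever')
```

Group 2 is the empty string, and group 3 has swallowed the value you were after.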

myobj = re.search( r'(.*?),Name(.*?),(.*)', fstr, re.I)
print(myobj.group(2))
# =XYZ
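
One caveat, since the question says the trailing ",Extra=whatever" part is optional: the corrected pattern requires a comma after the value, so it finds nothing when that part is missing. The sketch below illustrates this with two hypothetical inputs (`with_extra`, `without_extra`) and shows one possible alternative pattern, not the only way to handle it.

```python
import re

# Hypothetical inputs: the same record with and without the optional tail
with_extra = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"
without_extra = "MCode=1,FCode=1,Name=XYZ"

# The corrected pattern from above: it needs a literal comma after the value
fixed = re.compile(r'(.*?),Name(.*?),(.*)', re.I)
print(fixed.search(with_extra).group(2))  # =XYZ
print(fixed.search(without_extra))        # None

# One possible alternative: stop at the next comma or at the end of the string
tolerant = re.compile(r'Name=([^,]*)', re.I)
print(tolerant.search(with_extra).group(1))     # XYZ
print(tolerant.search(without_extra).group(1))  # XYZ
```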

But as the other answer shows, you don't have to create additional capture groups.

BTW, I like to use RegEx only when it is really needed. In this case, I would have solved it without RegEx, like this:

fstr = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"
d = dict(item.split("=") for item in fstr.split(","))
# {'MCode': '1', 'FCode': '1', 'Name': 'XYZ', 'Extra': 'whatever'}

Now that I have all the information, I can access it like this:

print(d["Name"])
# XYZ

Simple, huh? :-)
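
If the values themselves may contain "=", the one-liner above raises a ValueError, because split("=") then yields more than two items. A slightly more defensive sketch, assuming that situation can occur (the helper name `parse_fields` and the sample record are hypothetical):

```python
def parse_fields(record):
    """Split a 'key=value,key=value' record into a dict."""
    # Split on the first "=" only, and skip items that contain no "=" at all
    pairs = (item.split("=", 1) for item in record.split(",") if "=" in item)
    return {key.strip(): value for key, value in pairs}

fields = parse_fields("MCode=1,FCode=1,Name=XYZ,Extra=a=b")
print(fields["Name"])         # XYZ
print(fields["Extra"])        # a=b
print(fields.get("Missing"))  # None
```

Using get() instead of indexing avoids a KeyError when a key such as Extra is absent.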

Edit: If you want to use the same regex for one million records, you can slightly improve performance by precompiling the RegEx, like this:

import re

# Compile once and reuse the pattern object for every record
pattern = re.compile(r"Name=([^,]+)", re.I)
match = pattern.search(fstr)
if match:
    print(match.group(1))
    # XYZ
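
As a rough sketch of how the precompiled pattern might be reused across many records (the helper `extract_names` and the sample `records` list are made up for illustration):

```python
import re

# Compiled once at import time, reused for every record
NAME_RE = re.compile(r"Name=([^,]+)", re.I)

def extract_names(records):
    """Yield the Name value of each record, or None when it is absent."""
    for record in records:
        match = NAME_RE.search(record)
        yield match.group(1) if match else None

records = [
    "MCode=1,FCode=1,Name=XYZ,Extra=whatever",
    "MCode=2,FCode=2,Name=ABC",
    "MCode=3,FCode=3",
]
names = list(extract_names(records))
print(names)  # ['XYZ', 'ABC', None]
```

Because extract_names is a generator, it also works on a file object or any other iterable of lines without loading everything into memory.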

You can do it as follows:

import re

fstr = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"

myobj = re.search( r'Name=([^,]+)', fstr, re.M|re.I)

print(myobj.group(1))
# XYZ

Try this:

import re
# \w* stops at the comma on its own, so the trailing "," is dropped
rule = re.compile(r"Name=(?P<Name>\w*)")
res = rule.search(fstr)
print(res.group("Name"))  # XYZ
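
The named-group idea can also be stretched to collect every key/value pair in one pass. This is only a sketch under the assumption that keys are word characters and values contain no commas; the names `pair` and `fields` are illustrative.

```python
import re

fstr = "MCode=1,FCode=1,Name=XYZ,Extra=whatever"

# One named group for the key, one for the value (everything up to the next comma)
pair = re.compile(r"(?P<key>\w+)=(?P<value>[^,]*)")
fields = {m.group("key"): m.group("value") for m in pair.finditer(fstr)}

print(fields["Name"])  # XYZ
```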
