How does re.match().groups() work?

I am trying to extract a selection of information from a string using re.match().groups():

s = "javascript:Add2ShopCart(document.OrderItemAddForm,%20'85575',%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');"

The result I want is:

("Mortein%20Mouse%20Trap%201%20pack", "4.87")

So I have been trying:

re.match(r"(SEPARATOR)(SEPARATOR)", s).groups() #i.e.:
re.match(r"(\',%20\')(\$)", s).groups()

I have tried looking at the re documentation, but as my regex skills are so sub-par it's not helping me much.

More sample input:

javascript:Add2ShopCart(document.OrderItemAddForm,%20'85575',%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');

javascript:Add2ShopCart(document.OrderItemAddForm_0,%20'85575',%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');

javascript:Add2ShopCart(document.OrderItemAddForm,%20'8234551',%20'Mortein%20Naturgard%20Fly%20Spray%20Eucalyptus%20320g',%20'',%20'$7.58');

javascript:Add2ShopCart(document.OrderItemAddForm,%20'4204369',%20'Mortein%20Naturgard%20Insect%20Killer%20Automatic%20Outdoor%20Refill%20152g',%20'',%20'$15.18');

javascript:Add2ShopCart(document.OrderItemAddForm_0,%20'4204369',%20'Mortein%20Naturgard%20Insect%20Killer%20Automatic%20Outdoor%20Refill%20152g',%20'',%20'$15.18');

javascript:Add2ShopCart(document.OrderItemAddForm,%20'4220523',%20'Mortein%20Naturgard%20Outdoor%20Automatic%20Prime%201%20pack',%20'',%20'$32.54');

import re

re.findall(r"""
   '          #apostrophe before the string Mortein
   (          #start capture
   Mortein.*? #the string Mortein plus everything until...
   )          #end capture
   '          #...another apostrophe
   .*         #zero or more characters
   \$         #the literal dollar sign
   (          #start capture
   .*?        #zero or more characters until...
   )          #end capture
   '          #an apostrophe""", s, re.X)

This returns a list containing one tuple: the Mortein product string and the dollar amount. You can also use:

re.search(r"'(Mortein.*?)'.*\$(.*?)'", s)

This returns a match object. .group(1) is the Mortein string, .group(2) is the dollar amount, and .group(0) is the entire substring that was matched.
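
As a rough sketch of why the attempt in the question fails while this one works: re.match only tries the pattern at the very start of the string, whereas re.search scans forward for the first place it fits. The variable names `anchored`, `found`, and `result` are made up for this illustration; `s` is the sample string from the question.

```python
import re

s = ("javascript:Add2ShopCart(document.OrderItemAddForm,%20'85575',"
     "%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');")

pattern = r"'(Mortein.*?)'.*\$(.*?)'"

# re.match only attempts a match at position 0. The string starts with
# "javascript:", not an apostrophe, so nothing matches and we get None.
anchored = re.match(pattern, s)

# re.search scans the string and matches at the first position that fits.
found = re.search(pattern, s)

# groups() returns all captured groups as a tuple of strings.
result = found.groups() if found else None

print(anchored)
print(result)
```

Calling .groups() directly on the result of re.match or re.search raises AttributeError when there is no match, so checking for None first is usually safer.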

You can use:

javascript:Add2ShopCart.*?,.*?,%20'(.*?)'.*?\$(\d+(?:\.\d+)?)

Groups 1 and 2 capture what you want.
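
Because this pattern starts with the literal prefix javascript:Add2ShopCart, it can be used with re.match directly. A minimal sketch of applying it to several of the sample inputs follows; the helper name `extract` and the names `PATTERN`, `samples`, and `results` are hypothetical, chosen only for this example.

```python
import re

# The pattern from the answer above, compiled once so it can be reused.
PATTERN = re.compile(
    r"javascript:Add2ShopCart.*?,.*?,%20'(.*?)'.*?\$(\d+(?:\.\d+)?)"
)

def extract(line):
    """Return (product, price) as strings, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

samples = [
    "javascript:Add2ShopCart(document.OrderItemAddForm_0,%20'85575',"
    "%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');",
    "javascript:Add2ShopCart(document.OrderItemAddForm,%20'8234551',"
    "%20'Mortein%20Naturgard%20Fly%20Spray%20Eucalyptus%20320g',%20'',%20'$7.58');",
]

results = [extract(line) for line in samples]
for r in results:
    print(r)
```

Unlike the Mortein-specific patterns, this one does not depend on the product name, only on the position of the arguments.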

Not a one-shot regex, but I hope it helps:

In [16]: s="""javascript:Add2ShopCart(document.OrderItemAddForm,%20'85575',%20'Mortein%20Mouse%20Trap%201%20pack',%20'',%20'$4.87');"""

In [17]: arr=s.split("',%20'")

In [18]: arr[1]
Out[18]: 'Mortein%20Mouse%20Trap%201%20pack'

In [19]: re.findall(r"(?<=\$)[^']*",arr[3])
Out[19]: ['4.87']
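
The same split-based idea could be wrapped in a function, roughly as below. This is only a sketch: the name `split_extract` is hypothetical, and it assumes every input has the same four-part layout as the samples in the question.

```python
import re

def split_extract(s):
    """Split on the argument separator, then pull the price out of the last piece."""
    parts = s.split("',%20'")
    # The sample inputs split into exactly four pieces:
    # prefix plus id, product name, empty string, "$price');"
    if len(parts) != 4:
        return None
    name = parts[1]
    # Lookbehind: start right after the dollar sign, stop before an apostrophe.
    price = re.search(r"(?<=\$)[^']*", parts[3])
    return (name, price.group(0)) if price else None

line = ("javascript:Add2ShopCart(document.OrderItemAddForm,%20'4220523',"
        "%20'Mortein%20Naturgard%20Outdoor%20Automatic%20Prime%201%20pack',"
        "%20'',%20'$32.54');")
pair = split_extract(line)
print(pair)
```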
