Regex matches only some elements of a list

Question

Here's a Python code snippet:

import re

VARS='Variables: "OUTPUTFOLDER=installers","SETUP_ORDER=Product 4,Product 4  Library","SUB_CONTENTS=Product 4 Library","SUB_CONTENT_SIZES=9364256","SUB_CONTENT_GROUPS=Product 4 Library","SUB_CONTENT_DESCRIPTIONS=","SUB_CONTENT_GROUP_DESCRIPTIONS=","SUB_DISCS=Product 4,Product Disc",SUB_FILENAMES='
comp = re.findall(r'\w+=".*?"', VARS)

for var in comp:
    print var

This is the current output:

SUB_CONTENT_DESCRIPTIONS="," 
SUB_CONTENT_GROUP_DESCRIPTIONS=","

However, I'd like it to extract all the elements, so that the output looks like this:

"OUTPUTFOLDER=installers"
"SETUP_ORDER=Product 4, Product 4 Library"
"SUB_CONTENTS=Product 4"
"SUB_CONTENT_SIZES=9364256"
...

What is wrong with my regex pattern?

Answer 1

Use this regex instead:

comp = re.findall(r'"\w+=.*?"', VARS)

Results:

>>> 
"OUTPUTFOLDER=installers"
"SETUP_ORDER=Product 4,Product 4  Library"
"SUB_CONTENTS=Product 4 Library"
"SUB_CONTENT_SIZES=9364256"
"SUB_CONTENT_GROUPS=Product 4 Library"
"SUB_CONTENT_DESCRIPTIONS="
"SUB_CONTENT_GROUP_DESCRIPTIONS="
"SUB_DISCS=Product 4,Product Disc"

In my opinion, you could go one step further and store your "vars" in a dictionary:

d = dict(var.strip('"').split('=') for var in re.findall(r'"\w+=.*?"', VARS))

To see the dictionary:

for k, v in d.items():
    print k, '=', (v if v else '<NONE>')

Results:

>>> 
SETUP_ORDER = Product 4,Product 4  Library
SUB_CONTENT_DESCRIPTIONS = <NONE>
SUB_DISCS = Product 4,Product Disc
SUB_CONTENT_GROUPS = Product 4 Library
SUB_CONTENT_SIZES = 9364256
SUB_CONTENT_GROUP_DESCRIPTIONS = <NONE>
OUTPUTFOLDER = installers
SUB_CONTENTS = Product 4 Library
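
As a variation on the dictionary idea above, here is a minimal Python 3 sketch that lets the regex do the splitting. It assumes a shortened copy of VARS (only four of the original fields), and the names pairs and settings are made up for illustration. With two capture groups, re.findall returns (name, value) tuples, so a value that itself contains "=" is kept whole instead of tripping up split('=').

```python
import re

# Shortened copy of the VARS string from the question (an assumption,
# kept small so the result is easy to check by eye).
VARS = ('Variables: "OUTPUTFOLDER=installers","SUB_CONTENT_SIZES=9364256",'
        '"SUB_CONTENT_DESCRIPTIONS=","SUB_DISCS=Product 4,Product Disc",'
        'SUB_FILENAMES=')

# Two capture groups: findall returns a list of (name, value) tuples.
pairs = re.findall(r'"(\w+)=(.*?)"', VARS)

# dict() accepts an iterable of 2-tuples directly.
settings = dict(pairs)

for name, value in settings.items():
    print(name, '=', value if value else '<NONE>')
```

Empty values come back as empty strings, which is why the loop substitutes <NONE> when printing, as in the answer above.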

Answer 2

Use this regex:

r'"\w+?=.*?"'

Compare my regex with yours:

r'"\w+?=.*?"' # mine
r'\w+=".*?"' # yours

The only difference is where one " sits: mine opens the match before the variable name, while yours expects it right after the = sign.

Output:

>>> regex = re.compile(r'"\w+?=.*?"')
>>> regex.findall(string)
[u'"OUTPUTFOLDER=installers"', u'"SETUP_ORDER=Product 4,Product 4 Library"',  
 u'"SUB_CONTENTS=Product 4 Library"', u'"SUB_CONTENT_SIZES=9364256"',
 u'"SUB_CONTENT_GROUPS=Product 4 Library"', u'"SUB_DISCS=Product 4,Product Disc"']
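
To make the effect of that one moved quote concrete, here is a small Python 3 sketch. The string sample is a made-up, shortened stand-in with the same shape as VARS, and the names original and fixed are illustrative only. The asker's pattern can only start a match where "=" is immediately followed by a quote, which happens only for fields with an empty value.

```python
import re

# Hypothetical, shortened stand-in for VARS: two normal fields,
# two fields with empty values, and an unquoted trailing field.
sample = 'Variables: "A=1","B=","C=","D=x,y",E='

# Asker's pattern: the quote must come right after '=', so only the
# empty-valued fields match, and each match swallows the following ,"
original = re.findall(r'\w+=".*?"', sample)
print(original)

# Corrected pattern: the quote opens the field, before the name.
fixed = re.findall(r'"\w+=.*?"', sample)
print(fixed)
```

This mirrors the two-line output in the question: only the fields whose value is empty satisfy the original pattern.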

Answer 3

You could try this:

 comp = re.findall(r'"(.*?)"', VARS)
 print [x for x in comp]

Roughly, this grabs whatever sits between each pair of double quotes, non-greedily.
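
A short Python 3 sketch of what this looser pattern returns, using a made-up sample string (the "plain" field is an assumption added to show the difference). Because the pattern does not require NAME= inside the quotes, any quoted text is returned, and the capture group strips the surrounding quotes.

```python
import re

# Hypothetical sample: two NAME=value fields plus one quoted field
# without '=', to show that the loose pattern accepts it as well.
sample = 'Variables: "A=1","B=","plain",E='

# The capture group returns only the text between the quotes.
fields = re.findall(r'"(.*?)"', sample)
print(fields)
```

If only NAME=value fields are wanted, the stricter patterns from the other answers are the safer choice.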

3 solutions

Solution 1
1 2013-04-23 12:06:32

Solution 2
1 (accepted) 2013-04-23 12:12:07

Solution 3
1 2013-04-23 12:15:17
