Python split list values based on condition
Given a Python list, I want to split its values based on some condition:
list = ['(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))',
'(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))']
Now list[0] will be
(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))
I want to split it so that, when iterating, I get:
#expected output
value(sam) = literal(abc)
value(like) = literal(music)
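The same applies wherever an expression starts with value and literal. At first I thought of splitting on and and or, but that does not work because and and or are sometimes missing.
I tried: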
for i in list:
    i.split()
    print(i)
#output ['((', 'value(abc)', '=', 'literal(12)', 'or' ....
I am also open to regex-based suggestions, but I know very little about regex and would rather not include it.
@Duck_dragon
In your opening post, the way the strings in the list are formatted causes a syntax error in Python. In my examples below I have edited them to use '''
>>> import re
>>> list = ['''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
'''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''']
# Plain findall whose result is not assigned to a variable: the REPL echoes each list of matches, but they are not kept for later use
# You can also use the *SIMPLER* but less flexible regex: '([a-zA-Z]+\([a-zA-Z]+\)[\s=]+[a-zA-Z]+\([a-zA-Z]+\))'
>>> for item in list:
...     re.findall(r'([a-zA-Z]+(?:\()[a-zA-Z]+(?:\))[\s=]+[a-zA-Z]+(?:\()[a-zA-Z]+(?:\)))', item)
...
['value(name) = literal(luke)', 'value(like) = literal(music)']
['value(sam) = literal(abc)', 'value(like) = literal(music)']
To take this one step further and collect the matches in a list, you can use:
>>> import re
>>> list = ['''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
'''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''']
# Declare an empty list, found_list, so the individual matches can be accessed later
>>> found_list = []
>>> for item in list:
...     for element in re.findall(r'([a-zA-Z]+(?:\()[a-zA-Z]+(?:\))[\s=]+[a-zA-Z]+(?:\()[a-zA-Z]+(?:\)))', item):
...         found_list.append(element)
...
>>> found_list
['value(name) = literal(luke)', 'value(like) = literal(music)', 'value(sam) = literal(abc)', 'value(like) = literal(music)']
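The same collection step can also be sketched as a script rather than a REPL session. This is only an illustration: it uses the simpler pattern mentioned in the comment above, compiles it once, and stores the data under the hypothetical name `conditions` instead of `list` so the built-in is not shadowed.

```python
import re

# Same sample strings as above; `conditions` is an illustrative name.
conditions = [
    '''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
    '''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''',
]

# Raw string so the backslashes reach the regex engine unchanged.
# The pattern has no capture group, so findall returns whole matches.
PAIR = re.compile(r'[a-zA-Z]+\([a-zA-Z]+\)[\s=]+[a-zA-Z]+\([a-zA-Z]+\)')

# Flatten the per-string match lists into a single list.
found_list = [match for item in conditions for match in PAIR.findall(item)]

for match in found_list:
    print(match)
```

Because only whitespace and = are allowed between the two halves, the `value(PRICELIST) in propval(valid)` part is skipped, which matches the output requested in the question.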
Given your comment, which I could not fully understand, is this what you want? I changed the list to add the other values you mentioned:
>>> import re
>>> list = ['''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
'''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''',
'''(value(PICK_SKU1) = propval(._sku)''', '''propval(._amEntitled) > literal(0))''']
>>> found_list = []
>>> for item in list:
...     for element in re.findall(r'([\w\.]+(?:\()[\w\.]+(?:\))[\s=<>(?:in)]+[\w\.]+(?:\()[\w\.]+(?:\)))', item):
...         found_list.append(element)
...
>>> found_list
['value(name) = literal(luke)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)', 'value(sam) = literal(abc)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)', 'value(PICK_SKU1) = propval(._sku)', 'propval(._amEntitled) > literal(0)']
Edit: Or is this what you want?
>>> import re
>>> list = ['''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
'''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''']
# Declare an empty list, found_list, so the individual matches can be accessed later
>>> found_list = []
>>> for item in list:
...     for element in re.findall(r'([a-zA-Z]+(?:\()[a-zA-Z]+(?:\))[\s=<>(?:in)]+[a-zA-Z]+(?:\()[a-zA-Z]+(?:\)))', item):
...         found_list.append(element)
...
>>> found_list
['value(name) = literal(luke)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)', 'value(sam) = literal(abc)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)']
Let me know if you need an explanation.
@费奥多尔·库特谢潘
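In your example, take out your_list_ and replace it with the OP's list to avoid confusion. Second, your for loop is missing a :, which produces a syntax error.
So, to avoid confusion, I will explain the solution in this comment. I hope that is OK.
Given your comment (which I could not fully understand), is this what you want? I changed the list to add the other values you mentioned: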
>>> import re
>>> list = ['''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
'''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''',
'''(value(PICK_SKU1) = propval(._sku)''', '''propval(._amEntitled) > literal(0))''']
>>> found_list = []
>>> for item in list:
...     for element in re.findall(r'([\w\.]+(?:\()[\w\.]+(?:\))[\s=<>(?:in)]+[\w\.]+(?:\()[\w\.]+(?:\)))', item):
...         found_list.append(element)
...
>>> found_list
['value(name) = literal(luke)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)', 'value(sam) = literal(abc)', 'value(like) = literal(music)', 'value(PRICELIST) in propval(valid)', 'value(PICK_SKU1) = propval(._sku)', 'propval(._amEntitled) > literal(0)']
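Explanation:
I changed [a-zA-Z0-9\._]+ to [\w\.]+ because they are essentially the same, but the latter is more concise. The steps below explain which characters each part covers.
1. ([\w\.]+ Note that the parenthesis is "unclosed": I am opening the capture group that will hold everything in the rest of the pattern. Inside it, I first capture one or more characters in a-z, A-Z, 0-9 and _, plus an escaped period (.).
2. (?:\() The captured text must then contain an escaped "opening" parenthesis, (.
3. [\w\.]+(?:\)) This is followed by the word characters outlined in step 1, and then (?:\)) requires an escaped "closing" parenthesis, ).
4. [\s=<>(?:in)]+ This part is a bit reckless, but for readability, and assuming your strings stay relatively consistent, it says that the closing parenthesis should be followed by whitespace, a =, a <, a >, or the letters of the word in, in any order and as many times as they occur consecutively. It is reckless because it also matches things like << < or = in > =. Note that inside square brackets (?:in) is not a group: it simply adds the literal characters (, ?, :, i, n and ) to the set. Making this part more specific, however, could easily lead to missed captures.
5. [\w\.]+(?:\()[\w\.]+(?:\)) Once again: the word characters from step 1, then an opening parenthesis, then more word characters, then a closing parenthesis.
6. ) Here I close the "unclosed" capture group (remember that the first capture group was opened in step 1), which tells the regex engine to capture the whole pattern I have outlined.
Hope this helps.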
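As the explanation admits, the character class used for the operator also accepts sequences such as << <. One possible tightening, offered only as a sketch, is to spell the operator out as an explicit alternation. The names `STRICT`, `LOOSE` and `extract` below are hypothetical, and the stricter pattern assumes that exactly one of =, <, > or the word in sits between the two halves.

```python
import re

# Looser pattern from the answer (outer capture group dropped for brevity).
LOOSE = re.compile(r'[\w.]+\([\w.]+\)[\s=<>(?:in)]+[\w.]+\([\w.]+\)')

# Hypothetical stricter variant: exactly one operator, optionally
# surrounded by whitespace. `in\b` stops "in" from matching the start
# of a longer word.
STRICT = re.compile(r'[\w.]+\([\w.]+\)\s*(?:[=<>]|in\b)\s*[\w.]+\([\w.]+\)')


def extract(text):
    """Return every 'name(arg) <op> name(arg)' fragment found in text."""
    return STRICT.findall(text)


samples = [
    'value(PICK_SKU1) = propval(._sku)',
    'propval(._amEntitled) > literal(0))',
    '(value(PRICELIST) in propval(valid))',
    'value(a) << < literal(b)',  # malformed operator sequence
]

for sample in samples:
    print(sample, '->', extract(sample), '| loose:', LOOSE.findall(sample))
```

The trade-off is the one the answer already names: the stricter form rejects malformed operator runs, but it would also miss operators that are not listed in the alternation, such as >= or !=.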
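First, I suggest you avoid naming variables after built-in functions. Second, you do not need a regex if you only want the output shown above.
For example: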
first, rest = your_list_[1].split(') and'):
for item in first[2:].split('or')
    print(item)
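A runnable sketch of the same regex-free idea follows. It rests on assumptions that hold for the sample data but are not guaranteed in general: every string starts with "((", the part of interest ends at ") and", and the sub-expressions are separated by " or ". The names `conditions` and `split_conditions` are illustrative, not taken from the original post.

```python
# Sample data from the question, using triple quotes so the line breaks are legal.
conditions = [
    '''(( value(name) = literal(luke) or value(like) = literal(music) )
and (value(PRICELIST) in propval(valid))''',
    '''(( value(sam) = literal(abc) or value(like) = literal(music) ) and
(value(PRICELIST) in propval(valid))''',
]


def split_conditions(text):
    """Return the 'x = y' parts that precede ') and' in text."""
    # Collapse newlines and repeated spaces so ') and' is always contiguous.
    flat = ' '.join(text.split())
    # Keep only what comes before the first ') and'.
    first, _, _ = flat.partition(') and')
    # Drop the leading '((' and split on ' or ' (with spaces, so that
    # words merely containing "or" are left alone).
    return [part.strip() for part in first[2:].split(' or ')]


for condition in conditions:
    for item in split_conditions(condition):
        print(item)
```

Normalizing the whitespace first is what lets this handle both strings, since in the first one the line break falls between ")" and "and".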
Not that you should, but you definitely could use a PEG parser here:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
data = ['(( value(name) = literal(luke) or value(like) = literal(music) ) and (value(PRICELIST) in propval(valid))',
'(( value(sam) = literal(abc) or value(like) = literal(music) ) and (value(PRICELIST) in propval(valid))']
grammar = Grammar(
r"""
expr = term (operator term)*
term = lpar* factor (operator needle)* rpar*
factor = needle operator needle
needle = word lpar word rpar
operator = ws? ("=" / "or" / "and" / "in") ws?
word = ~"\w+"
lpar = "(" ws?
rpar = ws? ")"
ws = ~r"\s*"
"""
)
class HorribleStuff(NodeVisitor):
    def generic_visit(self, node, visited_children):
        return node.text or visited_children

    def visit_factor(self, node, children):
        output, equal = [], False
        for child in node.children:
            if (child.expr.name == 'needle'):
                output.append(child.text)
            elif (child.expr.name == 'operator' and child.text.strip() == '='):
                equal = True
        if equal:
            print(output)

for d in data:
    tree = grammar.parse(d)
    hs = HorribleStuff()
    hs.visit(tree)
This yields
['value(name)', 'literal(luke)']
['value(sam)', 'literal(abc)']