
Python Regexp: find all words/phrases within a querystring containing OR and AND

I have a query string like this:

s = 'word1 AND word2 word3 OR "word4 word5" OR word6 AND word7 word8'

I need to find all the words or phrases separated by OR and AND, so the result is a list like this (preferably without the spaces between AND/OR and the word/phrase):

l = ['word1', 'word2 word3', '"word4 word5"', 'word6', 'word7 word8']

I've tried messing around with regular expressions but couldn't find a way to do this.

Thanks for the help.

If you want to use regexps, re.split should do it:

>>> import re
>>> re.split(' OR | AND ', 'word1 AND word2 word3 OR "word4 word5" OR word6 AND word7 word8')
['word1', 'word2 word3', '"word4 word5"', 'word6', 'word7 word8']
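If the spacing around the operators is not always a single space, one possible variation is to let the pattern absorb any run of whitespace. The sketch below is only an illustration: `split_terms` is a hypothetical helper name, not something from the question, and it assumes AND/OR always appear in upper case as separate tokens. It does not protect an AND or OR that appears inside a quoted phrase.

```python
import re


def split_terms(query):
    """Split a query on standalone AND/OR operators (hypothetical helper)."""
    # Requiring whitespace on both sides means AND/OR must be separate
    # tokens, so words such as "ANDROID" or "SAND" are left alone.
    parts = re.split(r'\s+(?:AND|OR)\s+', query.strip())
    return [part.strip() for part in parts]


s = 'word1 AND word2 word3 OR "word4 word5" OR word6 AND word7 word8'
print(split_terms(s))
```

Because the whitespace is consumed by the pattern itself, the resulting terms come back without leading or trailing spaces even when the input is spaced unevenly.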

If you need a bigger hammer, you could check out something like pyparsing: http://pyparsing.wikispaces.com/file/view/searchparser.py

IMO you should use str.split instead:

s.split(' AND ') 
s.split(' OR ')

or, if the spacing is irregular, use

s.split('AND') 
s.split('OR')

then loop over the results and call .strip() on each element.
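Put together, the split-then-strip idea might look like the sketch below. The helper name `split_terms_plain` is hypothetical, and the nesting (split on AND first, then split each piece on OR) is one assumed way of combining the two calls, since a single `split` only handles one operator.

```python
def split_terms_plain(query):
    """Split on AND, then on OR, stripping whitespace (hypothetical helper)."""
    terms = []
    for and_part in query.split('AND'):
        for or_part in and_part.split('OR'):
            # Remove the spaces left over around each operator.
            terms.append(or_part.strip())
    return terms


s = 'word1 AND word2 word3 OR "word4 word5" OR word6 AND word7 word8'
print(split_terms_plain(s))
```

One caveat with splitting on the bare strings 'AND' and 'OR': any upper-case word that contains them (for example 'SANDY') is split as well, so this variant is only safe when the search terms cannot contain those letter sequences.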
