Regex to split by comma, but ignore commas proceeding words near a colon
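I am trying to split a string on commas in Python, while still allowing users to include commas inside the values of certain key-value pairs. Here are two examples of the strings I am working with: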
title.search:The relation between visualization size, grouping, and user performance,publication_year:2020
author.id:c33432,title.search:The relation between visualization size, grouping, and user performance,publication_year:2020
I would like them to become:
["title.search:The relation between visualization size, grouping, and user performance", "publication_year:2020"]
["author.id:c33432", "title.search:The relation between visualization size, grouping, and user performance", "publication_year:2020"]
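One thing that helps is that the part before the colon (the key) is always written in one of three formats: a single word, two words separated by a period, or three words separated by periods.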
Any ideas on whether this is possible?
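As far as I can see, you are trying to split on the commas that sit directly between two word characters, in which case the regex is \w,\w.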
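One caveat worth illustrating: passed to re.split as-is, \w,\w also consumes the character on each side of the comma. The sketch below compares it with a lookaround variant, (?<=\w),(?=\w), which is an assumption added here and not part of the answer above. The variable names are illustrative only.

```python
import re

s = ("author.id:c33432,title.search:The relation between visualization size, "
     "grouping, and user performance,publication_year:2020")

# \w,\w matches three characters, so the split removes a letter on each side.
consumed = re.split(r"\w,\w", s)

# Lookbehind/lookahead only assert the neighbours; the comma alone is removed.
kept = re.split(r"(?<=\w),(?=\w)", s)

print(consumed)
print(kept)
```

Both variants rely on the commas inside values being followed by a space; a value such as "a,b" would still be split.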
Would you please try the following:
#!/usr/bin/python
import re

s = ['title.search:The relation between visualization size, grouping, and user performance,publication_year:2020',
     'author.id:c33432,title.search:The relation between visualization size, grouping, and user performance,publication_year:2020']

# Split only on commas that are followed by a key and a colon.
for line in s:
    m = re.split(r',(?=\s*[\w.]+:)', line)
    print(m)
Output:
['title.search:The relation between visualization size, grouping, and user performance', 'publication_year:2020']
['author.id:c33432', 'title.search:The relation between visualization size, grouping, and user performance', 'publication_year:2020']
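The regex ,(?=\s*[\w.]+:) matches a comma only when it is followed by optional whitespace, one or more word characters or periods, and then a colon. Because that part is a lookahead, it is checked but not consumed.
The string is then split on the commas that meet this condition.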
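Since the question states that a key is one, two, or three words separated by periods, the lookahead could be tightened to exactly that shape. The following is a minimal sketch under that assumption; KEY_SPLIT, split_pairs, and to_dict are hypothetical names introduced here, not part of the original answer.

```python
import re

# Assumed key shape: one to three \w+ words joined by periods, then a colon.
KEY_SPLIT = re.compile(r",(?=\s*\w+(?:\.\w+){0,2}:)")

def split_pairs(query):
    """Split on commas that are immediately followed by a key and a colon."""
    return KEY_SPLIT.split(query)

def to_dict(query):
    """Turn the split pairs into a dict, cutting each at its first colon."""
    result = {}
    for pair in split_pairs(query):
        key, _, value = pair.partition(":")
        result[key.strip()] = value
    return result

example = ("author.id:c33432,title.search:The relation between visualization "
           "size, grouping, and user performance,publication_year:2020")
print(split_pairs(example))
print(to_dict(example))
```

As with the original pattern, a value that itself contains a comma followed by a word and a colon (for example "size, note: see below") would still be split at that comma.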