Split string by first substring found
I want to split a sentence at the first occurrence of any of certain words. Let me illustrate:
message = 'I wish to check my python code for errors to run the program properly with fluency'
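I want to split the message above at the first occurrence of for/to/with, so the result for the message above would be: check my python code for errors to run the program properly with fluency
I also want to keep the word I split on, so my final result would be: to check my python code for errors to run the program properly with fluency
My code doesn't work: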
import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
result = message.split(r"for|to|with",1)[1]
print(result)
What can I do?
message = 'I wish to check my python code for errors to run the program properly with fluency'
array = message.split(' ')
number = 0
message_new = ''
# Find the index of the first word that is one of the delimiters.
for i in range(len(array)):
    if array[i] in ('to', 'for', 'with'):
        number = i
        break
# Rebuild the message from that word onward.
for j in range(number, len(array)):
    message_new += array[j] + ' '
print(message_new)
Output:
to check my python code for errors to run the program properly with fluency
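The same word-by-word idea can be written more compactly. This is only a sketch, not the answerer's code: the function name `split_from_first_word` and the `keywords` parameter are made up for illustration, and it assumes the words are separated by single spaces.

```python
def split_from_first_word(message, keywords=('for', 'to', 'with')):
    """Return the part of message starting at the first keyword, or None."""
    words = message.split(' ')
    for index, word in enumerate(words):
        if word in keywords:
            # Rejoin everything from the keyword onward.
            return ' '.join(words[index:])
    return None  # no keyword present in the message


message = 'I wish to check my python code for errors to run the program properly with fluency'
print(split_from_first_word(message))
```

Returning None when no keyword is found avoids silently printing the whole message, which is what a default index of 0 would do.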
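str.split does not take a regular expression as its argument (perhaps you are thinking of Perl).
The following does what you want:
import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
result = re.search(r'\b(for|to|with)\b', message)
# Slice the message from the start of the first matched word.
print(message[result.start(1):])
This uses no substitution, re-joining, or loops; it simply searches for the required word and uses the position of the match.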
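One thing the snippet above does not cover is a message that contains none of the words, in which case re.search returns None. A minimal sketch of a guarded version follows; the function name `from_first_keyword` and its `keywords` parameter are illustrative assumptions, not part of the original answer.

```python
import re


def from_first_keyword(message, keywords=('for', 'to', 'with')):
    """Return message from its first whole-word keyword onward, or None."""
    # Build \b(?:for|to|with)\b, escaping each keyword in case it
    # contains characters that are special in a regular expression.
    pattern = r'\b(?:' + '|'.join(map(re.escape, keywords)) + r')\b'
    match = re.search(pattern, message)
    if match is None:
        return None  # none of the keywords occurs as a whole word
    return message[match.start():]


message = 'I wish to check my python code for errors to run the program properly with fluency'
print(from_first_keyword(message))
```

Because of the \b word boundaries, a word such as "today" does not count as an occurrence of "to".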
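Part of this question has already been answered in "How to remove all characters before a specific character in Python", but that only works for one specific delimiter. With multiple delimiters you first have to find out which one occurs first; finding an occurrence is covered in "How can I find the first occurrence of a sub-string in a python string".
Start with a first guess. I don't have much imagination, so let's call it bestDelimiter = firstDelimiter. Find the position of its first occurrence and save it as bestPosition. Then find the positions of the remaining delimiters, and every time you find one that occurs before the current bestPosition, update both bestDelimiter and bestPosition. The one left standing at the end is the best delimiter, and you then use bestDelimiter to apply whatever operation you need.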
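The procedure described above can be sketched roughly as follows. The function name `split_at_earliest` is hypothetical, and the sketch assumes plain substring matching with str.find, so a delimiter inside a longer word (for example "to" in "today") would also count.

```python
def split_at_earliest(message, delimiters):
    """Return (best_delimiter, message from that delimiter onward)."""
    best_delimiter = None
    best_position = -1
    for delimiter in delimiters:
        position = message.find(delimiter)  # -1 when the delimiter is absent
        if position == -1:
            continue
        # Keep this delimiter if it is the first one found or occurs earlier.
        if best_position == -1 or position < best_position:
            best_delimiter, best_position = delimiter, position
    if best_delimiter is None:
        return None, message  # no delimiter present: leave the message as is
    return best_delimiter, message[best_position:]


message = 'I wish to check my python code for errors to run the program properly with fluency'
delimiter, rest = split_at_earliest(message, ['for', 'to', 'with'])
print(delimiter)
print(rest)
```

If whole-word matching is needed, the str.find call could be swapped for a regular-expression search with word boundaries.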
My guess is that this simple expression might do just that:
.*?(\b(?:to|for|with)\b.*)
and re.match is probably the fastest of these five methods:
Test with re.findall
import re
regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
print(re.findall(regex, test_str))
Test with re.sub
import re
regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
# Replace the whole match with the contents of capturing group 1.
subst = "\\1"
result = re.sub(regex, subst, test_str)
if result:
    print(result)
Test with re.finditer
import re
regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    # Full match
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))
    # Each capturing group, numbered from 1
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum), group=match.group(groupNum)))
Test with re.match
import re
regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
print(re.match(regex, test_str).group(1))
Test with re.search
import re
regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
print(re.search(regex, test_str).group(1))
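If you wish to explore or modify the expression further, it is explained in the top-right panel of this demo, and if you like you can see how it matches some sample inputs at this link.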
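The guess that re.match is the fastest of the five can be checked rather than assumed. Below is a rough sketch using the standard timeit module. The dictionary of lambdas and the iteration count are arbitrary choices for illustration, the pattern is compiled once so that each label refers to the corresponding method on the compiled pattern, and the timings will vary by machine and Python version.

```python
import re
import timeit

regex = re.compile(r".*?(\b(?:to|for|with)\b.*)")
test_str = "I wish to check my python code for errors to run the program properly with fluency"

# One callable per method; each returns the same extracted substring.
candidates = {
    "re.findall": lambda: regex.findall(test_str)[0],
    "re.sub": lambda: regex.sub(r"\1", test_str),
    "re.finditer": lambda: next(regex.finditer(test_str)).group(1),
    "re.match": lambda: regex.match(test_str).group(1),
    "re.search": lambda: regex.search(test_str).group(1),
}

# Collect each method's result so they can be compared for agreement.
results = {name: func() for name, func in candidates.items()}

# Time each method and print the timings from fastest to slowest.
timings = {name: timeit.timeit(func, number=2000) for name, func in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s")
```

Differences between the methods are likely to be small for a short string like this one, so the ordering may change from run to run.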
You can first find all instances of for, to, and with, split on those values, then splice and rejoin:
import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
vals, [_, *s] = re.findall(r"\bfor\b|\bto\b|\bwith\b", message), re.split(r"\bfor\b|\bto\b|\bwith\b", message)
# Pair each delimiter with the text that follows it, trimming leading whitespace.
result = ''.join('{} {}'.format(a, re.sub(r"^\s+", "", b)) for a, b in zip(vals, s))
Output:
'to check my python code for errors to run the program properly with fluency'
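Since only the first occurrence matters, a shorter variant is possible: re.split keeps the delimiter in its result when the pattern has a capturing group, and maxsplit=1 stops after the first split. The sketch below is an alternative to the answer's code, not a restatement of it, and the function name `keep_from_first_keyword` is made up for illustration.

```python
import re


def keep_from_first_keyword(message):
    """Split once at the first for/to/with and keep the keyword itself."""
    # With a capturing group, re.split returns [before, keyword, after].
    parts = re.split(r'\b(for|to|with)\b', message, maxsplit=1)
    if len(parts) == 1:
        return None  # no keyword found, so nothing was split
    _, keyword, rest = parts
    return keyword + rest


message = 'I wish to check my python code for errors to run the program properly with fluency'
print(keep_from_first_keyword(message))
```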