How to use regex to remove a particular pattern of words with numbers?
I have transcribed text from different audio files, and each transcript contains speaker labels that follow a similar pattern. I want to use a regex to find that pattern and remove it, leaving only the actual text. For example, I have the text below:
text = "Yeah Cool\nSpeaker 100:00:03Uh, you know, when you score three goals, you expect to win a game, you know, but, uh,"
All I want is a regex pattern that can detect Speaker 100:00:03 and other similar patterns. Depending on the audio file, I might instead get Speaker 100:00:01, which differs from the first one but has the same shape.
Is there a better way to do this?
I was using string replace, which is not a general solution:
new_text = text.replace('Speaker 000:00:00', '')
This is the result I expect after applying the regex:
text = "Yeah Cool Uh, you know, when you score three goals, you expect to win a game, you know, but, uh,"
Depending on the exact format of the timestamp, re.sub with the following pattern should work:
>>> re.sub(r'\nSpeaker \d{1,3}:\d{2}:\d{2}', ' ', text)
'Yeah Cool Uh, you know, when you score three goals, you expect to win a game, you know, but, uh,'
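If the same cleanup has to run over many transcripts, the substitution can be wrapped in a small helper. The following is only a sketch: the names SPEAKER_TAG and strip_speaker_tags are made up for illustration, and the pattern assumes every label looks like a newline, the word Speaker, one or more digits, then two colon-separated two-digit groups. Adjust it if your files differ.

```python
import re

# Assumed label shape: newline, "Speaker ", one or more digits, ":NN:NN".
# Compiled once so it can be reused across many transcripts.
SPEAKER_TAG = re.compile(r"\nSpeaker \d+:\d{2}:\d{2}")


def strip_speaker_tags(text: str) -> str:
    """Replace every speaker/timestamp label with a single space."""
    return SPEAKER_TAG.sub(" ", text)


first = "Yeah Cool\nSpeaker 100:00:03Uh, you know"
second = "Yeah Cool\nSpeaker 100:00:01Uh, you know"

print(strip_speaker_tags(first))   # Yeah Cool Uh, you know
print(strip_speaker_tags(second))  # Yeah Cool Uh, you know
```

Unlike text.replace with a fixed string, this handles any timestamp value, because the digits are matched by class rather than literally.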
A very simple regular expression:
import re
text = "Yeah Cool\nSpeaker 100:00:03Uh, you know, when you score three goals, you expect to win a game, you know, but, uh,"
re.sub(r'\nSpeaker \d\d\d:\d\d:\d\d', ' ', text)
# 'Yeah Cool Uh, you know, when you score three goals, you expect to win a game, you know, but, uh,'
The pattern is \nSpeaker \d{3}:\d{2}:\d{2}
Here \d matches a single digit and {3} means exactly three times, so \d{3} matches exactly three digits.
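To see how the quantifiers behave, here is a small sketch that checks a few strings against \d{3} and \d{1,3}. The helper name matches is hypothetical; it simply wraps re.fullmatch, which succeeds only when the whole string fits the pattern.

```python
import re


def matches(pattern: str, s: str) -> bool:
    """Return True only if the entire string s fits the pattern."""
    return re.fullmatch(pattern, s) is not None


print(matches(r"\d{3}", "100"))     # True: exactly three digits
print(matches(r"\d{3}", "10"))      # False: only two digits
print(matches(r"\d{1,3}", "10"))    # True: between one and three digits
print(matches(r"\d{1,3}", "1000"))  # False: four digits is too many
```

This is why \d{1,3} is the more forgiving choice if the number before the first colon can vary in length, while \d{3} insists on exactly three digits.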
Try regex101.com; it's a great site for experimenting with regex.