What's the proper way to exclude uppercase word(s) with a regex in Python?
Let's say I've scraped this from a website.
PARIS - Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua (2015). Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat 22/05/2015. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Here's my code to get the first string for my third question; assume that text is a variable that contains the text above.
location = re.findall(r'^\w+', text)
Use a regular expression that matches a sequence of uppercase letters and spaces followed by a hyphen at the beginning of the text, and replace the match with an empty string.
text = re.sub(r'^[A-Z\s]+\s-\s*', '', text)
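To see the substitution in context, here is a minimal sketch that pulls out the leading uppercase location and returns the remaining text. The helper name split_location and the shortened sample string are assumptions made for illustration; they are not part of the original question or answer.

```python
import re


def split_location(text):
    """Return (location, body) for text like 'PARIS - Lorem ipsum ...'.

    split_location is a hypothetical helper, not from the original post.
    """
    # Uppercase letters and spaces at the start, stopping before " -".
    match = re.match(r'^[A-Z\s]+(?=\s-)', text)
    location = match.group(0) if match else None

    # Same pattern as the answer: drop the prefix, the hyphen,
    # and any whitespace that follows it.
    body = re.sub(r'^[A-Z\s]+\s-\s*', '', text)
    return location, body


# Shortened sample text (assumed for the example).
sample = "PARIS - Lorem ipsum dolor sit amet (2015)."
location, body = split_location(sample)
print(location)
print(body)
```

Because the character class includes whitespace, the same pattern should also handle multi-word locations such as "NEW YORK - ...". Text that does not start with an uppercase prefix followed by a hyphen is returned unchanged.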