
Python Regex match only where every word is capitalized

I would like to match all strings in which every word is capitalized.

At the moment I have tried something like this:

import re

list = ["This sentence should Not Match", "This Should Only Match"]
match = []
for l in list:
    # Attempt: zero or more capitals followed by any one character
    x = re.search("^[A-Z]*.", l)
    if x:
        match.append(l)

For example, I would like the regex to match only something like "This Is A Good Example Here", but it should not match "Something like this Here", "HERE Is an example that Should NOT Match", "TiHiS SeNtEnEcE" or "This Should NOT Match.Foo".

I am looping over lots of news articles and trying to match all the titles. These titles usually have every word capitalized.

You can do this without regex:

l = ["This sentence should Not Match", "This Should Only Match"]
[s for s in l if s.istitle()]

Output:

['This Should Only Match']
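As a quick check, here is a minimal sketch that runs str.istitle() over the examples from the question. The list name `titles` and the extra string "This Should Match.Foo" are my own additions for illustration. Note that istitle() treats punctuation as a word break, so a string like "Match.Foo" still counts as title case; if that matters for your titles, a regex may fit better.

```python
# Examples taken from the question.
titles = [
    "This Is A Good Example Here",
    "Something like this Here",
    "HERE Is an example that Should NOT Match",
    "TiHiS SeNtEnEcE",
    "This Should NOT Match.Foo",
]

# str.istitle() is True when every run of letters starts with an
# uppercase letter and continues in lowercase only.
title_cased = [s for s in titles if s.istitle()]
print(title_cased)  # ['This Is A Good Example Here']

# Caveat (extra case, not from the question): "." is not a letter,
# so the "F" that follows it is accepted as the start of a new word.
print("This Should Match.Foo".istitle())  # True
```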

Try matching using re.search with the following pattern:

^[A-Z][a-z]*(?: [A-Z][a-z]*)*$

Script:

import re

list = ["This sentence should Not Match", "This Should Only Match"]
matches = []
for l in list:
    x = re.search("^[A-Z][a-z]*(?: [A-Z][a-z]*)*$", l)
    if x:
        matches.append(l)

print(matches)

This prints:

['This Should Only Match']
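If you run this over many articles, one possible variant is to compile the pattern once and use fullmatch(), which makes the ^ and $ anchors unnecessary. This is only a sketch: the names `TITLE_RE`, `is_title` and `examples` are hypothetical, and the test strings are the ones from the question. One difference worth knowing is that $ also matches just before a trailing newline, while fullmatch() does not accept it.

```python
import re

# Same pattern as above without ^ and $; fullmatch() already requires
# the whole string to match. (TITLE_RE and is_title are made-up names.)
TITLE_RE = re.compile(r"[A-Z][a-z]*(?: [A-Z][a-z]*)*")

def is_title(text):
    """Return True if every space-separated word is one capital plus lowercase letters."""
    return TITLE_RE.fullmatch(text) is not None

examples = [
    "This Is A Good Example Here",
    "Something like this Here",
    "HERE Is an example that Should NOT Match",
    "TiHiS SeNtEnEcE",
    "This Should NOT Match.Foo",
]
matches = [s for s in examples if is_title(s)]
print(matches)  # ['This Is A Good Example Here']

# $ matches before a trailing newline, so re.search accepts this string...
with_newline = "This Should Only Match\n"
print(bool(re.search(r"^[A-Z][a-z]*(?: [A-Z][a-z]*)*$", with_newline)))  # True
# ...while fullmatch() rejects it.
print(is_title(with_newline))  # False
```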

I support Chris' solution foremost, but here's a possible regex approach:

import re

sentences = ["This sentence should Not Match", "This Should Only Match"]
result = [x for x in sentences if re.match(r"^([A-Z][a-z]*\b\s*)+$", x)]
print(result) # => ["This Should Only Match"]

The regex only matches strings made up of one or more words, where each word is a single capital letter followed by zero or more lowercase letters, then a word boundary and optional whitespace.
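To make that breakdown easier to follow, the same pattern can be written with re.VERBOSE so each piece carries its own comment. This is just a sketch of the pattern above; the name `TITLE_RE` is hypothetical.

```python
import re

# The pattern from this answer, spelled out piece by piece.
TITLE_RE = re.compile(
    r"""
    ^                # start of the string
    (                # one word:
        [A-Z]        #   a single uppercase letter
        [a-z]*       #   zero or more lowercase letters
        \b           #   word boundary, so the word must end here
        \s*          #   optional whitespace before the next word
    )+               # repeated one or more times
    $                # end of the string
    """,
    re.VERBOSE,
)

sentences = ["This sentence should Not Match", "This Should Only Match"]
result = [x for x in sentences if TITLE_RE.match(x)]
print(result)  # ['This Should Only Match']
```

Because of the \s*, runs of several spaces or tabs between words are also accepted, which may or may not be what you want for titles.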

Note: try to avoid overwriting the builtin function list(), and it's a good habit to always make regex strings raw.
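To illustrate why raw strings matter, here is a small sketch using \b, which appears in the pattern above. The sample text "Foo Bar" and the variable names are made up for the demonstration.

```python
import re

# In a normal string literal, "\b" is the backspace character (\x08).
print("\b" == "\x08")  # True

# So this pattern looks for backspace characters around "Foo" and finds none.
plain = re.search("\bFoo\b", "Foo Bar")
print(plain)  # None

# In a raw string the backslash reaches the regex engine untouched,
# where \b means "word boundary".
raw = re.search(r"\bFoo\b", "Foo Bar")
print(raw.group())  # Foo
```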
