
Pattern within a pattern?

I want to capture "Alta, Utah, USA" from "asd Alta, Utah, USA qwe". Basically, I'm trying to capture places from a text. It won't be a perfect method, but a place must start with a capital letter and use a comma, followed by another word that starts with a capital letter.

So far, I have written:

\s[A-Z][a-z]+[,]?

I want to match multiple words, not just the first word, Alta. This is my attempt to use square brackets inside other square brackets:

[\s[A-Z][a-z]+[,]?]+

But that doesn't work, so it must be syntactically incorrect.

Updated as per OP's comment:

(?:\s*[A-Z][A-Za-z]+[,\s])+

Demo
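For illustration, here is a minimal Python sketch of how the updated pattern could be applied. The helper name first_place is hypothetical, and the stripping of surrounding whitespace and a trailing comma is an assumption about the desired output rather than part of the answer itself.

```python
import re

# Pattern from the updated answer: one or more capitalized words,
# each followed by a comma or a whitespace character.
PLACE_RUN = re.compile(r"(?:\s*[A-Z][A-Za-z]+[,\s])+")

def first_place(text):
    """Return the first run of capitalized words, or None if there is none."""
    match = PLACE_RUN.search(text)
    if match is None:
        return None
    # The raw match may carry surrounding whitespace and a trailing comma,
    # so tidy it up before returning (an assumption about the wanted output).
    return match.group(0).strip().rstrip(",")

print(first_place("asd Alta, Utah, USA qwe"))  # Alta, Utah, USA
```

Because every word must be followed by a comma or whitespace, a run that sits at the very end of the string would lose its last word. The pattern also accepts a single capitalized word, so it will pick up sentence-initial words as well.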

Original Answer:

\b([A-Z][a-zA-Z]+),?

Original Demo

Each match then gives you one place name in group 1.

I think this is what you need:

([A-Z][a-zA-Z]+)(,\s*([A-Z][a-zA-Z]+))*

Though the requirement pointed out by @Rizwan (in his comment) still needs to be clarified.


Debuggex Demo
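As a rough sketch of how this pattern might be used from Python: because it contains capturing groups, re.findall would return tuples of groups rather than the whole place string, so the example below reads group(0) from finditer instead. The function name comma_separated_places is hypothetical.

```python
import re

# Pattern from the answer above: a capitalized word, optionally followed
# by more capitalized words that are each introduced by a comma.
PLACES = re.compile(r"([A-Z][a-zA-Z]+)(,\s*([A-Z][a-zA-Z]+))*")

def comma_separated_places(text):
    """Return the full text of every match, e.g. 'Alta, Utah, USA'."""
    # group(0) is the whole match; the numbered groups only hold the
    # first word and the last repetition of the comma-prefixed part.
    return [m.group(0) for m in PLACES.finditer(text)]

print(comma_separated_places("asd Alta, Utah, USA qwe"))  # ['Alta, Utah, USA']
```

Unlike a pattern that requires a trailing comma or space, this one also matches a place at the very end of the string, at the cost of still accepting any lone capitalized word.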

Just joining the party:

import re

dirty = "asd Alta, Utah, USA qwe"
# Match every word that starts with a capital letter.
p = re.compile(r"([A-Z][a-zA-Z]+)")
print(p.findall(dirty))

Output:

['Alta', 'Utah', 'USA']
