
Regex - how to capture many words

I have a simple regex question:

Given a string like "test-class", what regex should I use to get ['test', 'class'] (in Python)?

You don't need a regex; just use str.split():

>>> 'test-class'.split('-')
['test', 'class']

If you do want a regex, the solution is still to split:

>>> import re
>>> re.split(r'-', 'test-class')
['test', 'class']
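
A regex split starts to pay off when the separator varies. The following is a minimal sketch, assuming the input may mix hyphens, underscores, and whitespace as separators; the `split_words` helper and the second sample string are hypothetical and not part of the question.

```python
import re

def split_words(text):
    """Split on runs of hyphens, underscores, or whitespace (assumed separators)."""
    # [-_\s]+ treats any run of separator characters as a single delimiter
    parts = re.split(r'[-_\s]+', text)
    # Drop empty strings produced by leading or trailing separators
    return [p for p in parts if p]

print(split_words('test-class'))         # ['test', 'class']
print(split_words('test-class_name x'))  # ['test', 'class', 'name', 'x']
```

For a single fixed separator such as '-', str.split() remains the simpler choice.
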
"(\w+)"g

Example here: http://regex101.com/r/mV9cE2

\w matches any alphanumeric character (or underscore); \w+ matches a run of them, and the parentheses return each run as a group.

g modifier: global. Return all matches (don't stop at the first match).
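
Python's re module has no g flag, so the pattern above does not carry over literally. A rough equivalent, as a sketch: re.findall already returns every non-overlapping match, and with exactly one capture group it returns that group's text for each match.

```python
import re

# re.findall scans the whole string, so no "global" flag is needed.
# With a single capture group, each result is the text of that group.
words = re.findall(r'(\w+)', 'test-class')
print(words)  # ['test', 'class']

# \w also matches digits and underscores, so they stay inside a "word".
mixed = re.findall(r'\w+', 'a_b-c1')
print(mixed)  # ['a_b', 'c1']
```

If digits or underscores should not count as part of a word, a narrower character class such as [a-zA-Z] is needed instead of \w.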

([a-zA-Z]*) is enough to capture every word in the string.

If you are intent on using regex:

In short, you define a regex which matches the things you want. Then you apply re.findall to the string, and you get back the matching parts.

>>> import re
>>> s = 'hello-world this 32'
>>> results = re.findall(r'[a-zA-Z]*', s)
>>> print(results)
['hello', '', 'world', '', 'this', '', '', '', '']
>>> # Now we can filter out the empty results.
>>> non_empty_results = [result for result in results if result]
>>> print(non_empty_results)
['hello', 'world', 'this']
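
The empty strings appear because * allows a match of zero letters at every position that is not a letter. One possible variation (not the answer's own code) is to use + instead, which requires at least one letter and makes the filtering step unnecessary:

```python
import re

s = 'hello-world this 32'
# + requires at least one letter, so the pattern can never match an empty string
words = re.findall(r'[a-zA-Z]+', s)
print(words)  # ['hello', 'world', 'this']
```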
