
Consecutive punctuation and alpha-numeric characters

Probably a simple question, but I don't have a lot of regex experience. I would like to take a string and select all runs of consecutive punctuation characters and all runs of consecutive alpha-numeric characters.

This is as close as I could get:

r="my9zza :)asax"
import re
re.findall(r'(\w+)|([^a-zA-Z0-9\s]+)', r)

returns

[('my9zza', ''), ('', ':)'), ('asax', '')]

but I would like

['my9zza', ':)', 'asax']

Simply use:

r = "my9zza :)asax"
import re
print(re.findall(r'\w+|[^a-zA-Z0-9\s]+', r))

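The problem was the two sets of parentheses in your original code. Each pair is a capturing group, and when a pattern contains more than one group, findall returns a tuple of the groups for each match (here a 2-tuple) instead of the matched string.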
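If the parentheses are still wanted for readability, one option is to make them non-capturing with (?:...). The sketch below compares the two forms on the string from the question; the variable names text, with_groups and without_groups are only illustrative.

```python
import re

text = "my9zza :)asax"

# Capturing groups: findall returns one tuple per match, with one slot per group.
with_groups = re.findall(r'(\w+)|([^a-zA-Z0-9\s]+)', text)

# Non-capturing groups (?:...) keep the grouping, but findall returns whole matches.
without_groups = re.findall(r'(?:\w+)|(?:[^a-zA-Z0-9\s]+)', text)

print(with_groups)     # [('my9zza', ''), ('', ':)'), ('asax', '')]
print(without_groups)  # ['my9zza', ':)', 'asax']
```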

If you want to keep the original regex, you can also easily transform its result into the desired output (result below is the list returned by findall):

[x[0] or x[1] for x in result]

You can try this:

s = [('my9zza', ''), ('', ':)'), ('asax', '')]
final_s = [[b for b in i if b][0] for i in s]

Output:

['my9zza', ':)', 'asax']
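Another possibility, not taken from the answers above, is to skip the post-processing and iterate over match objects with re.finditer, since group(0) is always the full match no matter how many capturing groups the pattern has. A minimal sketch, with text, pattern and tokens as illustrative names:

```python
import re

text = "my9zza :)asax"
pattern = re.compile(r'(\w+)|([^a-zA-Z0-9\s]+)')

# group(0) is the whole match, regardless of which capturing group matched.
tokens = [m.group(0) for m in pattern.finditer(text)]

print(tokens)  # ['my9zza', ':)', 'asax']
```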


 