Split a string on any special character using Python

Currently I can have many dynamic separators in a string, like:

new_123_12313131
new$123$12313131
new#123#12313131

and so on. I just want to check whether the string contains a special character and, if it does, get the value after the last separator. In this example I just want 12313131.

Just get the value after the last separator.

The more obvious way is to use re.findall:

from re import findall

text = 'new_123_12313131'
findall(r'\d+$', text)  # ['12313131']
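
Because findall returns a list, a small wrapper can hand back the single match or None when the string does not end in digits. This is only a sketch: the name last_number is made up for illustration, and it assumes the wanted value is always a trailing run of digits.

```python
import re

def last_number(text):
    """Return the trailing run of digits in text, or None if there is none."""
    match = re.search(r'\d+$', text)
    return match.group() if match else None

for sample in ['new_123_12313131', 'new$123$12313131', 'new#123#12313131']:
    print(last_number(sample))  # prints 12313131 each time
```

If the value after the last separator can contain letters, this pattern would not be enough and one of the split-based approaches is a better fit.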

This is a good use case for isdigit():

l = [
'new_123_12313131',
'new$123$12313131',
'new#123#12313131',
]

output = []
for s in l:
    temp = ''
    for char in s:
        if char.isdigit():  # keep digits, drop letters and separators
            temp += char
    output.append(temp)

print(output)

Result: ['12312313131', '12312313131', '12312313131']

Note that this keeps every digit in the string, so the digits before the last separator are included as well.
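
If only the value after the last separator is wanted, one possible variation on the isdigit() idea is to walk the string from the end and stop at the first non-digit. A minimal sketch, assuming the wanted value consists solely of digits; trailing_digits is a hypothetical helper name, not part of the answer above.

```python
def trailing_digits(text):
    """Collect digits from the end of text until a non-digit is reached."""
    collected = []
    for char in reversed(text):
        if not char.isdigit():
            break  # first non-digit from the right ends the value
        collected.append(char)
    # characters were gathered right to left, so restore the original order
    return ''.join(reversed(collected))

samples = ['new_123_12313131', 'new$123$12313131', 'new#123#12313131']
print([trailing_digits(s) for s in samples])
# ['12313131', '12313131', '12313131']
```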

Assuming you define a "special character" as anything that's not alphanumeric, you can use str.isalnum() to find the first special character, searching from the end of the string, and split on it, something like this:

def split_non_special(text: str) -> str:
    """
    Find the first special character starting from the end and return the last piece
    """
    for char in reversed(text):
        if not char.isalnum():
            return text.split(char)[-1]  # return as soon as a separator is found
    return ''  # no separator found

inputs = ['new_123_12313131', 'new$123$12313131', 'new#123#12313131', 'eefwfwrfwfwf3243']
outputs = [split_non_special(s) for s in inputs]
print(outputs)  # ['12313131', '12313131', '12313131', '']
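
The same idea can be written with slicing instead of split, which avoids building the intermediate list and makes the "no separator" result configurable. This is a sketch only: after_last_separator and its default parameter are illustrative names, and "special" is still assumed to mean "not alphanumeric".

```python
def after_last_separator(text, default=''):
    """Return everything after the last non-alphanumeric character in text."""
    # scan indexes from the right so the first hit is the last separator
    for index in range(len(text) - 1, -1, -1):
        if not text[index].isalnum():
            return text[index + 1:]
    return default  # no separator found

print(after_last_separator('new#123#12313131'))  # 12313131
print(after_last_separator('new_123$456'))       # 456 (mixed separators)
```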

Python provides what seem to be the characters you consider "special" in the string module, as string.punctuation. These are the characters:

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

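Using that in conjunction with the re module you can do this:

from string import punctuation
import re

my_string = 'new_123_12313131'
# re.escape keeps characters such as ] and \ literal inside the character class
re.split(f"[{re.escape(punctuation)}]", my_string)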

my_string being the string you want to split.

Results for your examples:

['new', '123', '12313131']
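
Since the question only asks for the value after the last separator, the last element of the split result is enough. A minimal sketch, assuming string.punctuation is the right definition of "special"; SEPARATORS and last_piece are names invented here, and re.escape is used so that characters such as ], \ and ^ stay literal inside the character class.

```python
import re
from string import punctuation

# compile once; every punctuation character counts as a separator
SEPARATORS = re.compile(f"[{re.escape(punctuation)}]")

def last_piece(my_string):
    """Return the part of my_string after the last punctuation character."""
    return SEPARATORS.split(my_string)[-1]

for my_string in ['new_123_12313131', 'new$123$12313131', 'new#123#12313131']:
    print(last_piece(my_string))  # prints 12313131 each time
```

A string without any punctuation comes back unchanged, because the split then yields a single element.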

To get just the digits, you can use:

re.findall(r"\d+", my_string)

Results:

['123', '12313131']
