I want to split a string on the first occurrence of a character that belongs to a list of characters. How can I do this in Python?
Basically, I have a list of special characters. I need to split a string by a character if it belongs to this list and exists in the string. Something along the lines of:
def find_char(string):
    # str.find returns -1 (which is truthy) when nothing is found, so test membership with `in`
    if "some_char" in string:
        pass  # do xyz with some_char
    elif "another_char" in string:
        pass  # do xyz with another_char
    else:
        return False
and so on. The way I think of doing it is:
def find_char_split(string):
    char_list = [",", "*", ";", "/"]
    for my_char in char_list:
        if string.find(my_char) != -1:
            my_strings = string.split(my_char)
            break
    else:
        # for/else: runs only when the loop finished without a break
        my_strings = False
    return my_strings
Is there a more Pythonic way of doing this? Or would the above procedure be fine? Please help, I'm not very proficient in Python.
(EDIT): I want it to split on the first occurrence of whichever character is encountered first. That is to say, if the string contains multiple commas and multiple stars, and a comma comes first, then I want it to split at the first occurrence of the comma. If a star comes first instead, then it should be split at that star.
I would favor using the re module for this because the expression for splitting on multiple arbitrary characters is very simple:
r'[,*;/]'
The brackets create a character class that matches any single character listed inside them. The code is like this:
import re
results = re.split(r'[,*;/]', my_string, maxsplit=1)
The maxsplit argument makes it so that the split only occurs once, at whichever of the listed characters appears first in the string.
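As a quick check of the behaviour described above, here is a minimal sketch with made-up sample strings. The helper name split_first and the inputs are assumptions for illustration only; the pattern is the one from the answer.

```python
import re

# Character class from the answer: matches any one of , * ; /
DELIMITERS = r'[,*;/]'

def split_first(text):
    """Split text once, at whichever delimiter appears earliest (hypothetical helper)."""
    return re.split(DELIMITERS, text, maxsplit=1)

print(split_first("a,b*c,d"))             # ['a', 'b*c,d']  (comma comes first)
print(split_first("a*b,c*d"))             # ['a', 'b,c*d']  (star comes first)
print(split_first("no delimiters here"))  # ['no delimiters here']
```

Note that when no delimiter is present, re.split returns a one-element list rather than False, so a caller that needs the "not found" case can check whether the result has length 1.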
If you are doing the same split many times, you can compile the regex and split on that same expression a little bit faster (but see Jon Clements' comment below):
c = re.compile(r'[,*;/]')
results = c.split(my_string, maxsplit=1)
If this speed-up is important (it probably isn't), you can use the compiled version in a function instead of having it recompile every time. To do that, make a separate function that stores the actual compiled expression:
def split_chars(chars, maxsplit=0, flags=0, string=None):
    # see note about the + symbol below
    c = re.compile('[{}]+'.format(''.join(chars)), flags=flags)
    def f(string, maxsplit=maxsplit):
        return c.split(string, maxsplit=maxsplit)
    return f if string is None else f(string)
Then:
special_split = split_chars(',*;/', maxsplit=1)
result = special_split(my_string)
But also (string has to be passed by keyword, because the second positional parameter is maxsplit):
result = split_chars(',*;/', string=my_string, maxsplit=1)
The purpose of the + character is to treat multiple adjacent delimiters as one if that is desired (thank you Jon Clements). If this is not desired, you can just use re.compile('[{}]'.format(''.join(chars))) above. Note that with maxsplit=1 the difference only shows up when the first delimiter is immediately followed by another one: 'a,,b' splits into ['a', 'b'] with the + and into ['a', ',b'] without it.
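To make the effect of the + concrete, here is a self-contained sketch of a similar splitter factory. The names make_splitter and collapse are hypothetical, and the use of re.escape is an addition that is not in the answer above: it is there only so that characters such as ], ^ or - stay literal inside the character class.

```python
import re

def make_splitter(chars, collapse=True, maxsplit=0):
    """Return a function that splits on any character in chars (hypothetical helper)."""
    # Assumption: escape each character so regex metacharacters are taken literally.
    char_class = '[{}]'.format(''.join(re.escape(c) for c in chars))
    # With collapse=True a run of adjacent delimiters counts as a single delimiter.
    pattern = re.compile(char_class + ('+' if collapse else ''))

    def split(text):
        return pattern.split(text, maxsplit=maxsplit)

    return split

collapsing = make_splitter(',*;/', collapse=True, maxsplit=1)
plain = make_splitter(',*;/', collapse=False, maxsplit=1)

print(collapsing("a,,b,c"))  # ['a', 'b,c']
print(plain("a,,b,c"))       # ['a', ',b,c']

# Delimiters that are special inside a character class still work thanks to re.escape.
print(make_splitter('-]^', maxsplit=1)("x^-y"))  # ['x', 'y']
```

A run of adjacent delimiters is the one case where the two patterns give different results for a single split; on input without such runs they behave the same.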
Finally: have a look at this talk for a quick introduction to regular expressions in Python, and this one for a much more information-packed journey.