Partition given string using regex
Trying to break the string into 2 parts.
#Need to get 'I1234' and 'I56/I78'
name1 = 'I1234/I56/I78'
#Need to get '\I1234 ' and 'I56/I78'
name2 = '\I1234 /I56/I78'
#Need to get '\I1234 ' and '\I56 /I78'
name3 = '\I1234 /\I56 /I78'
#Need to get '\I1234 ' and '\I56 /\I78 '
name4 = '\I1234 /\I56 /\I78 '
I tried this, and it worked:
import re
pat_a = re.compile(r'(.+)(/)(.+)')
result = re.findall(pat_a, name2[::-1])
Is there a better way?
There are more complicated strings possible, for example:
\I78_[0]/abcd_/efg_ /I1234/I56
Not sure if it's better, but you can use partition or split with maxsplit=1 to avoid importing the re module:
print('I1234/I56/I78'.partition("/")) # ('I1234', '/', 'I56/I78')
print('I1234/I56/I78'.split("/",1)) # ['I1234', 'I56/I78']
For partition you would need to look at index 0 and index 2 of the returned tuple (index 1 is the separator):
first, _ , last = 'I1234/I56/I78'.partition("/")
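One detail worth checking is what happens when the string contains no "/" at all. The sketch below wraps partition in a small helper; the name split_first and the choice to return None for a missing tail are assumptions made for illustration, not part of the original answer.

```python
def split_first(name, sep="/"):
    """Split name at the first occurrence of sep.

    split_first is a hypothetical helper name used only for this sketch.
    Returns (head, tail), or (name, None) when sep does not occur.
    """
    head, found, tail = name.partition(sep)
    if not found:
        # partition returns (name, '', '') when the separator is absent
        return name, None
    return head, tail


print(split_first("I1234/I56/I78"))  # ('I1234', 'I56/I78')
print(split_first("I1234"))          # ('I1234', None)
print("I1234".split("/", 1))         # ['I1234'] (a one-element list)
```

With split, a missing separator gives a one-element list, so unpacking into two names would raise a ValueError; partition always returns three items.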
Full example:
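# Raw strings (r'...') keep each backslash literal and avoid invalid-escape warnings.
name1 = 'I1234/I56/I78'
name2 = r'\I1234 /I56/I78'
name3 = r'\I1234 /\I56 /I78'
name4 = r'\I1234 /\I56 /\I78 '
for n in [name1, name2, name3, name4]:
    print(n.partition("/"))  # first iteration: ('I1234', '/', 'I56/I78')
    print(n.split("/", 1))   # first iteration: ['I1234', 'I56/I78']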
Output (the backslashes are shown escaped in the printed repr, which is why they appear doubled):
('I1234', '/', 'I56/I78') # using partition
['I1234', 'I56/I78'] # using split
('\\I1234 ', '/', 'I56/I78') # partition
['\\I1234 ', 'I56/I78'] # split .. etc.
('\\I1234 ', '/', '\\I56 /I78')
['\\I1234 ', '\\I56 /I78']
('\\I1234 ', '/', '\\I56 /\\I78 ')
['\\I1234 ', '\\I56 /\\I78 ']
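The doubling is only a display effect of printing a tuple or list, which shows the repr of each element. A minimal check, reusing name2 from the question (written here as a raw string, which is an assumption about how the input is defined):

```python
name2 = r'\I1234 /I56/I78'
first, _, rest = name2.partition('/')

print(repr(first))        # '\\I1234 '  (repr escapes the backslash)
print(first)              # \I1234 followed by one space
print(len(first))         # 7
print(first.count('\\'))  # 1, so the string holds a single backslash
```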
This answer uses str.split, which seems cleaner than a regex. I looked at using str.partition, but it returns a 3-tuple that includes the separator, so you have to index or unpack it to get the two parts. Its printed output also doesn't match the output that you requested.
This first example takes a single string and outputs a pair of strings based on your split request.
# Need to get '\I1234 ' and '\I56 /I78'
name3 = '\I1234 /\I56 /I78'
# The input name (name3) can be changed in a for loop linked to your input.
split_input = name3.split('/', 1) # maxsplit=1
print(split_input)
# outputs
#####################################################################
# NOTE: the backslashes are shown escaped (doubled), which doesn't match the requested output.
#####################################################################
['\\I1234 ', '\\I56 /I78']
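The doubled backslashes above appear only because printing a list shows the repr of each element; the strings themselves contain a single backslash. If you want the printed text without the doubling, this code converts the list to its string form and decodes the escapes.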
# Need to get '\I1234 ' and '\I56 /I78'
name3 = '\I1234 /\I56 /I78'
# The input name (name3) can be changed in a for loop linked to your input.
# Note: after this line split_input is a str (the list's text form), not a list.
split_input = str(name3.split('/', 1)).encode('utf-8').decode('unicode_escape')
print(split_input)
# outputs
['\I1234 ', '\I56 /I78'] # Do you need that trailing space?
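If the trailing space turns out not to be needed, one option is to strip each part after splitting. This is only a sketch under that assumption; the question as posted does ask for the space to be kept.

```python
name3 = r'\I1234 /\I56 /I78'

# str.strip() trims whitespace at both ends of each part only;
# the space inside the second part is left untouched.
parts = [part.strip() for part in name3.split('/', 1)]
for part in parts:
    print(part)
```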
I'm not sure where your input values originally come from (e.g., a file, a website, etc.), so I added the ones from your question to a list for faster testing. The next example uses a list comprehension and str.split.
my_strings = ['I1234/I56/I78', '\I1234 /I56/I78', '\I1234 /\I56 /I78', '\I1234 /\I56 /\I78', '\I78_[0]/abcd_/efg_ /I1234/I56']
# Uses a list comprehension and str.split to split each string at the first '/'
split_input = [x.split('/', 1) for x in my_strings]
# Printing the list shows escaped (doubled) backslashes, so this converts the
# list to text and decodes the escapes. decode_output is a str, not a list.
decode_output = str(split_input).encode('utf-8').decode('unicode_escape')
print(decode_output)
# outputs
[['I1234', 'I56/I78'], ['\I1234 ', 'I56/I78'], ['\I1234 ', '\I56 /I78'], ['\I1234 ', '\I56 /\I78'], ['\I78_[0]', 'abcd_/efg_ /I1234/I56']]
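If the split parts are needed for further processing, it may be simpler to keep the list as it is and print the elements individually, since the encode/decode round trip returns one string rather than a list. A minimal sketch, assuming every input contains at least one "/" (otherwise the two-name unpacking in the loop would fail); the variable names pairs and decoded are illustrative:

```python
my_strings = [
    r'I1234/I56/I78',
    r'\I1234 /\I56 /I78',
    r'\I78_[0]/abcd_/efg_ /I1234/I56',
]

pairs = [s.split('/', 1) for s in my_strings]
decoded = str(pairs).encode('utf-8').decode('unicode_escape')

print(type(pairs).__name__)    # list
print(type(decoded).__name__)  # str

# Printing each element directly shows single backslashes without any decoding.
for first, rest in pairs:
    print(first, '|', rest)
```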