Partitioning a string using a regular expression

Question

I'm trying to break a string into two parts.

# Need to get 'I1234' and 'I56/I78'
name1 = 'I1234/I56/I78'

# Need to get '\I1234 ' and 'I56/I78'
# (raw strings keep the backslash literal and avoid invalid-escape warnings)
name2 = r'\I1234 /I56/I78'

# Need to get '\I1234 ' and '\I56 /I78'
name3 = r'\I1234 /\I56 /I78'

# Need to get '\I1234 ' and '\I56 /\I78 '
name4 = r'\I1234 /\I56 /\I78 '

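I tried this, and it worked:

import re

pat_a = re.compile(r'(.+)(/)(.+)')

# The string is reversed so that the greedy first group ends at what was
# the first '/' in the original; the captured pieces come back reversed too.
result = re.findall(pat_a, name2[::-1])

Is there a better way?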

EDIT

More complicated strings are possible, for example:

\I78_[0]/abcd_/efg_ /I1234/I56

Answer 1

Not sure if it's better, but you can use partition, or split with maxsplit=1, to avoid importing the re module:

print('I1234/I56/I78'.partition("/"))   # ('I1234', '/', 'I56/I78')

print('I1234/I56/I78'.split("/",1))     # ['I1234', 'I56/I78']

With partition you need the items at index 0 and index 2 of the returned tuple:

first, _ , last = 'I1234/I56/I78'.partition("/")
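The two methods differ when the separator is missing, which matters if you unpack the result. A minimal sketch of that difference follows; the helper name split_head_tail is made up for illustration and is not part of the question.

```python
def split_head_tail(text, sep="/"):
    """Return (head, tail) split at the first sep; tail is '' if sep is absent."""
    head, _, tail = text.partition(sep)
    return head, tail

# With a separator present, both approaches agree.
print(split_head_tail("I1234/I56/I78"))   # ('I1234', 'I56/I78')
print("I1234/I56/I78".split("/", 1))      # ['I1234', 'I56/I78']

# Without a separator, partition still yields three items, while split
# returns a one-element list that cannot be unpacked into two names.
print(split_head_tail("I1234"))           # ('I1234', '')
print("I1234".split("/", 1))              # ['I1234']
```

So partition is the safer choice if some inputs might contain no "/" at all.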

Documentation: str.partition, str.split

Full example:

# Raw strings keep each backslash literal and avoid invalid-escape warnings.
name1 = 'I1234/I56/I78'
name2 = r'\I1234 /I56/I78'
name3 = r'\I1234 /\I56 /I78'
name4 = r'\I1234 /\I56 /\I78 '

for n in [name1, name2, name3, name4]:
    print(n.partition("/"))   # 3-tuple: (head, separator, tail)
    print(n.split("/", 1))    # 2-item list: [head, tail]

Output (the backslashes appear doubled because printing a tuple or list shows the escaped repr of each string; the strings themselves contain a single backslash):

('I1234', '/', 'I56/I78')           # using partition
['I1234', 'I56/I78']                # using split

('\\I1234 ', '/', 'I56/I78')        # partition
['\\I1234 ', 'I56/I78']             # split .. etc.

('\\I1234 ', '/', '\\I56 /I78')
['\\I1234 ', '\\I56 /I78']

('\\I1234 ', '/', '\\I56 /\\I78 ')
['\\I1234 ', '\\I56 /\\I78 ']
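To see that the doubling is only a display effect, you can compare printing a single string with printing its repr. This is a small sketch using name2 from the question, written as a raw string.

```python
name2 = r'\I1234 /I56/I78'          # raw string: one literal backslash

head, _, tail = name2.partition("/")

print(head)            # \I1234   (followed by a trailing space)
print(repr(head))      # '\\I1234 '
print(len(head))       # 7
print([head, tail])    # ['\\I1234 ', 'I56/I78']
```

The length of 7 confirms there is only one backslash in the string.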

Answer 2

This answer uses str.split, which seems cleaner than a regex. I looked at str.partition, but it returns a tuple, which then has to be indexed or unpacked. Also, the output of str.partition doesn't match the output that you requested.

This first example takes a single string and outputs a pair of strings based on your split request.

# Need to get '\I1234 ' and '\I56 /I78'
name3 = r'\I1234 /\I56 /I78'

# The input name (name3) can be changed in a for loop linked to your input.
split_input = name3.split('/', 1)  # maxsplit=1
print(split_input)
# outputs
#####################################################################
# NOTE: the backslashes are shown doubled (escaped in the list's repr),
# which doesn't match your requirement.
#####################################################################
['\\I1234 ', '\\I56 /I78']

The printed list above shows escaped (doubled) backslashes, so this code converts it to a display string without them. Note that split_input becomes a str here, not a list.

# Need to get '\I1234 ' and '\I56 /I78'
name3 = r'\I1234 /\I56 /I78'

# The input name (name3) can be changed in a for loop linked to your input.
# str() renders the list, then unicode_escape collapses each doubled backslash.
split_input = str(name3.split('/', 1)).encode('utf-8').decode('unicode_escape')
print(split_input)
# outputs
['\I1234 ', '\I56 /I78'] # Do you need that trailing space?
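If you would rather keep the result as a list and only change how it is displayed, one possible alternative is to quote each part by hand. This is a sketch under that assumption; the names parts and shown are illustrative.

```python
name3 = r'\I1234 /\I56 /I78'

parts = name3.split('/', 1)          # still a list of two str objects

# Quote each part by hand instead of relying on the list's repr.
shown = ', '.join(f"'{p}'" for p in parts)
print(f'[{shown}]')                  # ['\I1234 ', '\I56 /I78']
```

This leaves parts usable for further processing, and it does not reinterpret other escape sequences that might appear in the data.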

I'm not sure where your input values originally come from (e.g., file, website, etc.), so I added the ones from your question to a list for faster testing. The next example uses a list comprehension and str.split.

my_strings = ['I1234/I56/I78', r'\I1234 /I56/I78', r'\I1234 /\I56 /I78', r'\I1234 /\I56 /\I78', r'\I78_[0]/abcd_/efg_ /I1234/I56']

# Uses a list comprehension and str.split to split each string at its first '/'
split_input = [x.split('/', 1) for x in my_strings]

# The list's repr shows doubled backslashes, so this builds a display string without them.
decode_output = str(split_input).encode('utf-8').decode('unicode_escape')

print(decode_output)
# outputs
[['I1234', 'I56/I78'], ['\I1234 ', 'I56/I78'], ['\I1234 ', '\I56 /I78'], ['\I1234 ', '\I56 /\I78'], ['\I78_[0]', 'abcd_/efg_ /I1234/I56']]
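Since the question asked about regular expressions, here is a rough sketch checking that a regex anchored on the first '/' gives the same two parts as split('/', 1), without reversing the string. The pattern and the helper name split_first_slash are assumptions for illustration, not something taken from the question.

```python
import re

# Hypothetical pattern: everything up to the first '/', then the rest.
FIRST_SLASH = re.compile(r'([^/]*)/(.*)', re.DOTALL)

def split_first_slash(text):
    """Return [head, tail] split at the first '/', or [text] if there is none."""
    match = FIRST_SLASH.fullmatch(text)
    return list(match.groups()) if match else [text]

samples = [
    'I1234/I56/I78',
    r'\I1234 /I56/I78',
    r'\I1234 /\I56 /I78',
    r'\I1234 /\I56 /\I78 ',
    r'\I78_[0]/abcd_/efg_ /I1234/I56',
]

for s in samples:
    # The regex and str.split should agree on every sample.
    assert split_first_slash(s) == s.split('/', 1)
    print(split_first_slash(s))
```

Both give the same result on these inputs, so the regex adds nothing here beyond the extra import.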


2 answers

Answer 1
5 2019-03-31 21:37:55

Answer 2
1 2019-04-01 00:53:25
