
Python search / extract string before and after a character

I need help extracting the strings before and after a character using regex in Python.

string = "My City | August 5"

I would like to extract "My City" and "August 5":

string1 = "My City"
string2 = "August 5"

You don't need a regex here; just use str.partition(), splitting on the | plus the surrounding spaces:

string1, separator, string2 = string.partition(' | ')

Demo:

>>> string = "My City | August 5"
>>> string.partition(' | ')
('My City', ' | ', 'August 5')
>>> string1, separator, string2 = string.partition(' | ')
>>> string1
'My City'
>>> string2
'August 5'

str.partition() splits the string just once; if there are more | characters, they are left as part of string2.
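As a small sketch of that behaviour (the input strings here are made up for illustration and are not from the question), this shows what str.partition() returns when the separator occurs more than once, and when it does not occur at all:

```python
# Hypothetical inputs, chosen only to illustrate the edge cases.
first, sep, rest = "My City | August 5 | 2014".partition(" | ")
# Only the first ' | ' is used; the second one stays inside `rest`.

head, missing_sep, tail = "My City".partition(" | ")
# With no separator present, the whole string comes back as the first
# element and the other two elements are empty strings.
```

Checking whether the middle element is empty is a simple way to tell whether the separator was found.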

If you want to make it a little more robust and handle any number of spaces around the pipe symbol, you can split on just | and use str.strip() to remove arbitrary amounts of whitespace from the start and end of the two strings:

string1, separator, string2 = map(str.strip, string.partition('|'))
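If this is needed in more than one place, the same idea could be wrapped in a small helper. The function name split_pair and the choice to raise ValueError on a missing separator are assumptions made for this sketch, not part of the answer above:

```python
def split_pair(text, sep="|"):
    """Split text at the first sep and strip whitespace from both halves.

    split_pair is a hypothetical helper name used only for illustration.
    """
    before, found, after = text.partition(sep)
    if not found:
        # partition() returns an empty separator when sep is absent.
        raise ValueError(f"separator {sep!r} not found in {text!r}")
    return before.strip(), after.strip()


string1, string2 = split_pair("My City   |August 5")
```

Returning a two-element tuple keeps the unpacking at the call site short, since the separator itself is rarely needed.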

You don't need regular expressions here. Just write:

string = "My City | August 5"
string1, string2 = string.split("|")

If you want to remove the surrounding spaces from the results, you can use:

string1 = string1.strip(" ")
string2 = string2.strip(" ")
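One caveat with unpacking the result of str.split() into exactly two names: if the input contains more than one |, the unpacking raises ValueError. A possible variation, sketched here with a made-up input string, is to pass a maxsplit of 1 so that at most two parts come back:

```python
# Hypothetical input with two pipes, used only to illustrate maxsplit.
string = "My City | August 5 | 2014"

# split("|", 1) splits at the first pipe only, so there are at most two parts.
string1, string2 = (part.strip() for part in string.split("|", 1))
```

This still fails on unpacking if the string contains no | at all, so it assumes the separator is always present.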

Sure, here's a built-in method for splitting a string:

string = "My City | August 5"
delimiter = ' | '  # note the spaces are part of your delimiter
list_of_partial_strings = string.split(delimiter)

Excellent documentation for string methods is available in the Python Standard Library documentation.

With a regex, it would look like this:

>>> import re
>>> string = "My City | August 5"
>>> string1, string2 = re.split(r'\s+\|\s+', string)
>>> string1
'My City'
>>> string2
'August 5'

\s+ matches one or more whitespace characters, and \| matches a literal | symbol. You need to escape the | in your regex to match a literal | symbol, because the pipe is a special metacharacter in regex, usually called the alternation operator or logical OR operator.
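The pattern above requires at least one whitespace character on each side of the pipe. A slightly more lenient sketch, assuming the whitespace may be missing entirely, uses \s* instead and limits the split to the first pipe. The names PIPE and split_on_pipe are hypothetical and used only for illustration:

```python
import re

# \s* allows zero or more whitespace characters around the escaped pipe.
PIPE = re.compile(r"\s*\|\s*")


def split_on_pipe(text):
    """Split text at the first pipe, dropping whitespace around the pipe."""
    return PIPE.split(text, maxsplit=1)


tight = split_on_pipe("My City|August 5")
loose = split_on_pipe("My City   |   August 5 | 2014")
```

Note that this only removes whitespace next to the pipe; whitespace at the very start or end of the input is left untouched.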


 