
Python regex to match and exclude certain content

I'm trying to exclude some content from a string. Here is an example:

Sony Xperia Z2 m/Smartwatch 2

and:

Sony Xperia Z2 + headphones

I want to get only

Sony Xperia Z2

in both cases.

I have been able to match the part of the string I want to get rid of with the pattern below, but how do I select the inverse? What I have so far:

 m/([a-zA-Z 0-9]*)

Edit: I have added another case.

Using regex split

re.split(r" m/| \+ ", yourString)[0]

This will work with both of your examples:

import re

string1 = "Sony Xperia Z2 m/Smartwatch 2"
print(re.split(r" m/| \+ ", string1)[0])
# output: Sony Xperia Z2

string2 = "Sony Xperia Z2 + headphones"
print(re.split(r" m/| \+ ", string2)[0])
# output: Sony Xperia Z2

If you have more separators, you can add them as further alternatives in the pattern passed to re.split.
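One way to keep a growing set of separators manageable is to build the pattern from a list. This is only a sketch: the names SEPARATORS and base_name are made up for illustration, and " incl. " is an assumed extra separator that does not appear in the question.

```python
import re

# " m/" and " + " come from the question; " incl. " is a hypothetical
# extra separator added only to show how the list can grow.
SEPARATORS = [" m/", " + ", " incl. "]

# re.escape makes characters such as "+" and "." match literally.
SEPARATOR_PATTERN = re.compile("|".join(re.escape(sep) for sep in SEPARATORS))


def base_name(text):
    """Return the part of text before the first separator."""
    return SEPARATOR_PATTERN.split(text, maxsplit=1)[0]


print(base_name("Sony Xperia Z2 m/Smartwatch 2"))
print(base_name("Sony Xperia Z2 + headphones"))
```

Escaping each entry means new separators can be added as plain strings without worrying about regex metacharacters.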

You can also use re.split(...)[1] to retrieve the second part of your string:

string1 = "Sony Xperia Z2 m/Smartwatch 2"
print(re.split(r" m/| \+ ", string1)[1])
# output: Smartwatch 2
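Note that indexing with [1] raises an IndexError when the string contains no separator. A small sketch that returns both parts safely might look like this (split_bundle is a hypothetical helper name, not something from the answer):

```python
import re


def split_bundle(text):
    """Split text at the first " m/" or " + " separator.

    Returns (main, extra); extra is None when no separator is present.
    """
    # maxsplit=1 keeps everything after the first separator in one piece.
    parts = re.split(r" m/| \+ ", text, maxsplit=1)
    main = parts[0]
    extra = parts[1] if len(parts) > 1 else None
    return main, extra


print(split_bundle("Sony Xperia Z2 m/Smartwatch 2"))
print(split_bundle("Sony Xperia Z2"))
```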

You can use re.sub to delete the unwanted suffix:

>>> import re
>>> s = 'Sony Xperia Z2 m/Smartwatch 2'
>>> re.sub(r'\s*m/.*$', '', s)
'Sony Xperia Z2'
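The substitution above only covers the "m/" case. A possible variant that also handles the " + " case from the edited question is sketched below; strip_accessory is a hypothetical name, and requiring whitespace before the separator is an assumption made so that an "m/" inside a word is left alone.

```python
import re


def strip_accessory(text):
    """Remove a trailing " m/..." or " + ..." suffix from text."""
    # \s+ requires whitespace before the separator; (?:m/|\+\s) matches
    # either "m/" or a "+" followed by whitespace; .*$ drops the rest.
    return re.sub(r"\s+(?:m/|\+\s).*$", "", text)


print(strip_accessory("Sony Xperia Z2 m/Smartwatch 2"))
print(strip_accessory("Sony Xperia Z2 + headphones"))
```

Strings without either separator are returned unchanged, because re.sub leaves the input as-is when nothing matches.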

Using Regex

>>> import re
>>> re.findall(r"[a-zA-Z0-9 ]+(?= m/)", "Sony Xperia Z2 m/Smartwatch 2")
['Sony Xperia Z2']

>>> re.findall(r"[a-zA-Z0-9 ]+(?= m/)", "Sony Xperia Z2 m/Smartwatch 2")[0]
'Sony Xperia Z2'

Using Split

>>> "Sony Xperia Z2 m/Smartwatch 2".split(" m/")[0]
'Sony Xperia Z2'

Something like:

import re

test = 'Sony Xperia Z2 m/Smartwatch 2'
# Include the leading space in the pattern so no trailing space is left behind.
res = re.search(r' m/([a-zA-Z 0-9]*)', test)
cleanstr = test.replace(res.group(), '')
print(cleanstr)

And you get Sony Xperia Z2.
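Keep in mind that re.search returns None when the pattern is not found, so calling .group() on the result fails for strings without "m/". A guarded version could look like this sketch (remove_bundle is a hypothetical name; the character class is the one from the question, with leading whitespace added as an assumption):

```python
import re


def remove_bundle(text):
    """Cut the " m/..." part out of text, or return text unchanged."""
    match = re.search(r"\s+m/[a-zA-Z 0-9]*", text)
    if match is None:
        # No "m/" part present, so there is nothing to remove.
        return text
    # Keep what comes before and after the matched span.
    return text[:match.start()] + text[match.end():]


print(remove_bundle("Sony Xperia Z2 m/Smartwatch 2"))
print(remove_bundle("Sony Xperia Z2"))
```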

