Remove empty brackets ( ) from string
I am having trouble removing empty brackets from strings. I tried a few methods, but none of them worked. Kindly help.
Here is the DataFrame:
import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
print(df1)
The strings contain multiple occurrences of ( ), and I need to remove only the empty brackets.
Input:
disc
0 ( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1 ( ) ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-p
Expected output:
disc
0 -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1 ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane boc-epoxideide
Using replace is not helping, because it removes all the brackets in the string.
replace should work:
>>> a = "'( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
>>> a.replace("( )", "")
"' -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
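The same literal replacement can be applied to the whole column. This is a minimal sketch, assuming the empty brackets always appear exactly as "( )" with a single space; the name literal_only is only illustrative. Passing regex=False makes pandas treat the pattern as plain text, so the parentheses are not interpreted as a regex group.

```python
import pandas as pd

data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate',
                 '( ) ( s ) -isopropyl 2 ',
                 '( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)

# regex=False: "( )" is matched literally, character by character,
# so only the exact text "( )" is removed and filled brackets are kept.
literal_only = df1["disc"].str.replace("( )", "", regex=False)
print(literal_only.tolist())
```

Brackets with content, such as ( chloromethyl ), are left alone, but the whitespace that surrounded each removed "( )" stays in the string.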
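You can try using a regular expression:
df1["disc"] = df1["disc"].str.replace(r"\(\s+\)", "", regex=True)
\s+ matches one or more whitespace characters between the two brackets. Note that regex=True is required in recent pandas versions, where str.replace treats the pattern as a literal string by default.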
-2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
( s ) -isopropyl 2
( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane boc-epoxideide
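To illustrate what the quantifier changes, here is a small sketch using the standard re module on a few made-up sample strings (the samples and variable names are assumptions for illustration, not part of the question). With \s+ a bracket pair with no space inside, "()", is not matched; with \s* it is.

```python
import re

# Hypothetical samples: one space, no space, several spaces, and a filled bracket.
samples = ["( ) a", "() b", "(   ) c", "( s ) d"]

# \s+ requires at least one whitespace character between the brackets.
one_or_more = [re.sub(r"\(\s+\)", "", s) for s in samples]

# \s* also accepts brackets with nothing between them.
zero_or_more = [re.sub(r"\(\s*\)", "", s) for s in samples]

print(one_or_more)   # [' a', '() b', ' c', '( s ) d']
print(zero_or_more)  # [' a', ' b', ' c', '( s ) d']
```

In both cases the bracket that contains text, ( s ), is untouched.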
pandas.DataFrame.replace does support using regex, so you can do:
import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
df2 = df1.replace(r'\s*\(\s*\)\s*', '', regex=True)
print(df2)
Output:
disc
0 -2,4-dichloro-a- ( chloromethyl ) -benzenemeth...
1 ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylb...
Note that you have to tell replace to use regular expressions (regex=True). I used a so-called raw string to simplify escaping: ( and ) need to be escaped because they have special meaning in a pattern. As for the pattern itself, I also put zero or more whitespace characters (\s*) before and after the brackets, so that the leading and trailing whitespace around them is removed as well.
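One side effect of consuming whitespace on both sides is that, when an empty bracket sits in the middle of a string, the words around it are joined ("phenylbutane ( ) boc-epoxideide" becomes "phenylbutaneboc-epoxideide"). A possible variation, sketched here under the assumption that tokens should stay separated by a single space, removes only the brackets and then normalizes the whitespace in separate steps. The name cleaned is illustrative.

```python
import pandas as pd

data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate',
                 '( ) ( s ) -isopropyl 2 ',
                 '( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)

cleaned = (
    df1["disc"]
    .str.replace(r"\(\s*\)", "", regex=True)  # drop empty brackets only
    .str.replace(r"\s+", " ", regex=True)     # collapse runs of whitespace to one space
    .str.strip()                              # trim leading/trailing whitespace
)
print(cleaned.tolist())
```

With the sample data this keeps "phenylbutane boc-epoxideide" as two words, matching the expected output in the question.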