
Is there an easy way to remove unnecessary whitespace inside brackets that are in the middle of a string in Python?

I have strings of the form:

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."

and I would like to get a cleaned string of the form:

s = "Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors."

I tried to fix it with a regex:

import re

regex = r" (?=[^(]*\))"
s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are some errors."
re.sub(regex, "", s)

But I get faulty results like this:

Wow that is really nice, (2.1) shows that according to the drawings in (1.1)anda) there are some errors.

Does anyone know how to deal with this problem when the number of opening and closing brackets is not always the same?

I am not sure about this, but you can try the following:

s = s.replace('( ','(')
s = s.replace(' )',')')

Here, replace(old, new) is a standard string method that replaces every occurrence of old with new. I hope this helps.

If the only whitespace you want to remove is the whitespace directly after an opening bracket (or directly before a closing one), then a simple string replace might work:

>>> s.replace("( ", "(").replace(" )", ")")
'Wow that is really nice, (2.1) shows that according to the drawings in (1. 1) and a) there are errors.'
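If there can be more than one space (or a tab) next to a bracket, the same idea can be sketched with re.sub and \s+. This is only an illustration: the function name trim_paren_edges is made up for this example, and like the plain replace it does not check whether a bracket has a partner, and it leaves spaces in the middle of the brackets alone.

```python
import re

def trim_paren_edges(text: str) -> str:
    """Drop whitespace directly after "(" and directly before ")".

    Hypothetical helper: it does not check that the parentheses are
    balanced, and whitespace in the middle (as in "1. 1") is kept.
    """
    text = re.sub(r"\(\s+", "(", text)   # "(   x" -> "(x"
    return re.sub(r"\s+\)", ")", text)   # "x   )" -> "x)"

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
print(trim_paren_edges(s))
# Wow that is really nice, (2.1) shows that according to the drawings in (1. 1) and a) there are errors.
```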

You can match all the innermost parentheses with a simple regex, and then perform a substitution on each match to remove all of its spaces.

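import re

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
# Match a "(" ... ")" pair that contains no other parentheses (innermost pairs only).
regex = r"\([^()]*\)"
# For each match, remove every space inside it; m[0] is the whole matched text.
res = re.sub(regex, lambda m: m[0].replace(" ", ""), s)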

print(res)
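For reference, here is the same approach wrapped in a small function, together with what it produces on the sample string and on a nested input. The name clean_innermost_parens is only illustrative. As the second call shows, only the innermost pair is cleaned when brackets are nested.

```python
import re

# A "(" ... ")" pair with no other parentheses in between.
INNERMOST = re.compile(r"\([^()]*\)")

def clean_innermost_parens(text: str) -> str:
    """Remove every space inside each innermost parenthesised group."""
    return INNERMOST.sub(lambda m: m[0].replace(" ", ""), text)

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
print(clean_innermost_parens(s))
# Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors.

# With nesting, the outer pair is left untouched:
print(clean_innermost_parens("x ( 2.1 ( 1,3 ) ) y"))
# x ( 2.1 (1,3) ) y
```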

Try:

 r" (?=[^()]*\))"

This excludes 'close parenthesis' from the things that can be inside a pair of parentheses.

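Whether this works depends on whether you have nested brackets in your text. Note that the lookahead only checks for a closing bracket, so a space in front of an unmatched closing bracket (as in " and a)") is still removed.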

Nested brackets cannot be handled by Python's built-in re module; for those you need a parser (it may need to count the brackets).
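A minimal sketch of such a bracket-counting pass, using only the standard library, might look like the following. The function name and the two-pass layout are assumptions made for illustration, not part of the answer above. A stack pairs each ")" with the most recent unclosed "(", and an unmatched ")" (as in "a)") is simply ignored.

```python
def remove_spaces_in_balanced_parens(text: str) -> str:
    """Remove spaces that lie inside a matched "(" ... ")" pair.

    Hypothetical helper: unmatched brackets are left alone, and so is
    the text around them.
    """
    open_positions = []                 # indices of "(" not yet closed
    delta = [0] * (len(text) + 1)       # +1 where a matched pair starts, -1 just after it ends

    # Pass 1: pair up brackets with a stack.
    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            start = open_positions.pop()
            delta[start] += 1
            delta[i + 1] -= 1

    # Pass 2: drop spaces wherever we are inside at least one matched pair.
    out = []
    depth = 0
    for i, ch in enumerate(text):
        depth += delta[i]
        if ch == " " and depth > 0:
            continue
        out.append(ch)
    return "".join(out)

s = "Wow that is really nice, ( 2.1 (2.1 ( 1,3 ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors."
print(remove_spaces_in_balanced_parens(s))
# Wow that is really nice, (2.1(2.1(1,3)))shows that according to the drawings in (1.1) and a) there are errors.
```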

If you also want to match balanced parentheses and remove the spaces inside them, you can use the PyPI regex module and a recursive pattern:

\([^)(]*+(?:(?R)[^)(]*)*+\)


Note that this removes all spaces inside the matched parentheses.

import regex

pattern = r"\([^)(]*+(?:(?R)[^)(]*)*+\)"

s = ("Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors.\n"
"Wow that is really nice, ( 2.1 (2.1 ( 1,3 ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors.")

print(regex.sub(pattern, lambda m: m[0].replace(" ", ""), s))

Output

Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors.
Wow that is really nice, (2.1(2.1(1,3)))shows that according to the drawings in (1.1) and a) there are errors.

To remove only the spaces directly after the ( and directly before the ):

import regex

pattern = r"\([^)(]*+(?:(?R)[^)(]*)*+\)"

s = "Wow that is really nice, ( test in 2.1 (2.1 test( 1,3 test ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors."

print(regex.sub(pattern, lambda m: regex.sub(r"(?<=\() +| +(?=\))", "", m[0]), s))

Output

Wow that is really nice, (test in 2.1 (2.1 test(1,3 test)))shows that according to the drawings in (1. 1) and a) there are errors.
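If installing the third-party regex module is not an option, a roughly equivalent result can be sketched with the standard library by pairing brackets with a stack and trimming only the spaces at the edges of each matched pair. The function name below is hypothetical, and this is a sketch under the assumption that only the space character (not tabs or newlines) needs trimming.

```python
def trim_edges_in_balanced_parens(text: str) -> str:
    """Remove spaces directly after "(" and directly before ")" for matched pairs only."""
    open_positions = []   # indices of "(" not yet closed
    drop = set()          # indices of spaces to delete

    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            start = open_positions.pop()
            # Spaces right after the opening bracket.
            j = start + 1
            while j < i and text[j] == " ":
                drop.add(j)
                j += 1
            # Spaces right before the closing bracket.
            k = i - 1
            while k > start and text[k] == " ":
                drop.add(k)
                k -= 1

    return "".join(ch for i, ch in enumerate(text) if i not in drop)

s = "Wow that is really nice, ( test in 2.1 (2.1 test( 1,3 test ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors."
print(trim_edges_in_balanced_parens(s))
# Wow that is really nice, (test in 2.1 (2.1 test(1,3 test)))shows that according to the drawings in (1. 1) and a) there are errors.
```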
