
Removing from a string all the characters included between two specific characters in Python

What is a quick way in Python to remove from a string all the characters included between two specific characters?

You can use this regular expression: \(.*?\) . Demo here: https://regexr.com/3jgmd

Then you can remove the part with this code:

import re
test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
new_string = re.sub(r" \(.*?\)", "", test_string)

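This regular expression (regex) matches any text in parentheses that is preceded by a space. Because . does not match line breaks by default, the parenthesized text has to sit on a single line, and the non-greedy .*? stops at the first closing parenthesis.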
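The question asks about two arbitrary characters, not only parentheses. As a minimal sketch of how the same non-greedy idea could be generalized, the helper below (the name remove_between is hypothetical, not from the answer) escapes both delimiters with re.escape so that characters such as ( or [ are matched literally:

```python
import re

def remove_between(text, start, end):
    """Remove every shortest span from `start` to `end`, delimiters included."""
    # re.escape makes special characters such as "(" or "[" match literally.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, "", text)

print(remove_between("keep [drop] keep [drop too] end", "[", "]"))
# keep  keep  end
```

Unlike the pattern above, this sketch does not consume the space in front of the opening delimiter, so doubled spaces can remain in the result.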

You will most probably use a regular expression like

\s*\([^()]*\)\s*

for that (see a demo on regex101.com).
The expression matches a pair of parentheses, everything between them, and any surrounding whitespace, so substituting an empty string removes all of it.


In Python this could be:

import re

test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
new_string = re.sub(r'\s*\([^()]*\)\s*', '', test_string)
print(new_string)
# This is a string, and here is a text not to remove
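A short sketch of how this pattern behaves on other inputs may help here. The sample strings are made up for illustration. It shows that several parenthesized groups are removed in one call, and that consuming whitespace on both sides can glue neighbouring words together:

```python
import re

pattern = re.compile(r"\s*\([^()]*\)\s*")

# Several groups are removed in a single call.
multi = pattern.sub("", "a (x), b (y), c")
print(multi)   # a, b, c

# Whitespace on both sides is consumed, so neighbouring words get joined.
joined = pattern.sub("", "foo (bar) baz")
print(joined)  # foobaz

# Replacing with a single space instead keeps the words apart.
spaced = pattern.sub(" ", "foo (bar) baz")
print(spaced)  # foo baz
```

Substituting a space is a trade-off: it keeps words apart, but it would also leave a space before punctuation, as in "a , b". Which replacement fits depends on the input.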


However, for learning purposes, you could just as well use the built-in string methods:

test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
left = test_string.find('(')
right = test_string.find(')', left)
# str.find returns -1 when the character is missing, so compare explicitly
if left != -1 and right != -1:
    new_string = test_string[:left] + test_string[right+1:]
    print(new_string)
# This is a string , and here is a text not to remove

The drawback of the latter: it handles only the first occurrence and leaves the surrounding whitespace in place, but it is faster.
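If multiple occurrences matter, the same built-in approach could be extended with a loop. The following is only a sketch under that assumption. The function name strip_parenthesized is hypothetical, and like the snippet above it leaves surrounding whitespace untouched and does not handle nested parentheses:

```python
def strip_parenthesized(text):
    """Remove every '(...)' span using only str.find and slicing."""
    pieces = []
    pos = 0
    while True:
        left = text.find("(", pos)
        # str.find returns -1 when nothing is found.
        right = text.find(")", left) if left != -1 else -1
        if left == -1 or right == -1:
            # No complete pair left: keep the rest of the string as it is.
            pieces.append(text[pos:])
            break
        pieces.append(text[pos:left])
        pos = right + 1
    return "".join(pieces)

print(strip_parenthesized("a (x) b (y) c"))
# a  b  c
```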


Executing each version 100,000 times yields the following measurements:

0.578398942947 # regex solution
0.121736049652 # non-regex solution
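The answer does not show how the timings were taken. One way to reproduce a comparison of this kind is the standard timeit module, as sketched below. The wrapper functions with_regex and with_find are names made up for this example, and absolute numbers will differ by machine and Python version:

```python
import re
import timeit

test_string = 'This is a string (here is a text to remove), and here is a text not to remove'

def with_regex():
    # Regex solution from the answer.
    return re.sub(r'\s*\([^()]*\)\s*', '', test_string)

def with_find():
    # Non-regex solution: removes only the first parenthesized span.
    left = test_string.find('(')
    right = test_string.find(')', left)
    if left != -1 and right != -1:
        return test_string[:left] + test_string[right + 1:]
    return test_string

runs = 100_000
regex_seconds = timeit.timeit(with_regex, number=runs)
find_seconds = timeit.timeit(with_find, number=runs)
print(f"regex: {regex_seconds:.3f} s, find: {find_seconds:.3f} s")
```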

To remove all text between ( and ), you can collect the matches with the findall() method from re and then remove each one using replace():

import re
test_string = 'This is a string (here is a text to remove), and here is a (second one) text not to remove'
remove = re.findall(r" \(.*?\)",test_string)
for r in remove:
    test_string = test_string.replace(r,'')
print(test_string)
# result: This is a string, and here is a text not to remove

