Strip {…} or […] from string using TRegEx

I have the following functions, which should remove all occurrences of (...), [...] and {...} from a string:

function TCleanUp.DoStripBraces(const aInput: string): string; // works!
begin
  result := TRegEx.Replace(aInput, '\([^)]*\)', '');
end;

function TCleanUp.DoStripCurlyBraces(const aInput: string): string; // does not work
begin
  result := TRegEx.Replace(aInput, '\{[^\}]*}', '');
end;

function TCleanUp.DoStripSquareBrackets(const aInput: string): string; // does not work
begin
  result := TRegEx.Replace(aInput, '\[[^\]]*]', '');
end;

I'm testing the functions with these strings:

'foo (bar) baz (xyz)'
'foo [bar] baz [xyz]'
'foo {bar} baz {xyz}'

which should all return the following string:

'foo  baz '

When I use the same strings and expressions on http://www.regexr.com/, the expressions match the occurrences perfectly.

I also tried to not escape the bracket / curly brace in the character set, but that did not work either.

How can I make the expressions work?

You could use a single regex like this:

[([{].*?[)}\]]
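As a rough illustration, the single pattern could be wrapped in Delphi as shown here. This is only a sketch: it assumes Delphi XE2 or later (for the System.RegularExpressions unit name), and StripAllBrackets is a hypothetical helper name, not something from the question. Note that this pattern does not require the opening and closing characters to be of the same kind, so it would also remove a mismatched pair such as "(bar]".

```delphi
program StripAllBracketsDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

// Hypothetical helper: removes every run that starts with an opening
// bracket character and ends at the nearest closing bracket character.
function StripAllBrackets(const aInput: string): string;
begin
  // Delphi string literals do not process backslash escapes,
  // so the regex is written exactly as it appears in the answer.
  Result := TRegEx.Replace(aInput, '[([{].*?[)}\]]', '');
end;

begin
  // Square brackets are printed around the result to make
  // leading and trailing spaces visible.
  Writeln('[' + StripAllBrackets('foo (bar) baz (xyz)') + ']');
  Writeln('[' + StripAllBrackets('foo [bar] baz [xyz]') + ']');
  Writeln('[' + StripAllBrackets('foo {bar} baz {xyz}') + ']');
end.
```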


On the other hand, if you want three separate regexes, you can use:

\(.*?\)
\[.*?\]
\{.*?\}

Putting them all together, you can see what they match:

[Image: regular expression visualization]

These regexes are more readable than the negated-character-class versions, which, as you found, are error-prone:

\([^)]*?\)
\[[^\]]*?\]
\{[^}]*?\}

The price of this readability is a small performance cost: .*? is slower than [^...]*, but unless you have to parse really long strings you won't notice the difference.

You can see the difference visually:

[Image: regular expression visualization]

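You forgot to escape the last occurrences of ] and }.

Try '\{[^\}]*\}' and '\[[^\]]*\]'. Delphi string literals do not treat the backslash as an escape character, so each backslash is written only once.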
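For reference, here is a minimal sketch of the fully escaped negated-character-class patterns in Delphi, run against the test strings from the question. It assumes Delphi XE2 or later (for the System.RegularExpressions unit name), and the function names are hypothetical stand-ins for the question's TCleanUp methods. Each backslash is written once, because Delphi string literals pass backslashes through to the regex engine unchanged.

```delphi
program StripEscapedDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

// Hypothetical stand-in for TCleanUp.DoStripCurlyBraces.
function StripCurlyBraces(const aInput: string): string;
begin
  // Opening brace, any run of non-brace characters, closing brace.
  Result := TRegEx.Replace(aInput, '\{[^\}]*\}', '');
end;

// Hypothetical stand-in for TCleanUp.DoStripSquareBrackets.
function StripSquareBrackets(const aInput: string): string;
begin
  // Opening bracket, any run of non-bracket characters, closing bracket.
  Result := TRegEx.Replace(aInput, '\[[^\]]*\]', '');
end;

begin
  // Square brackets are printed around the result to make
  // leading and trailing spaces visible.
  Writeln('[' + StripCurlyBraces('foo {bar} baz {xyz}') + ']');
  Writeln('[' + StripSquareBrackets('foo [bar] baz [xyz]') + ']');
end.
```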

