
Strip and replace a text string with a regular expression in Java

I am trying to strip and replace a text string in the most elegant way possible.

The solution I have so far is the regex /element\\s*\\{"([^"]+)"\\}\\s*{text\\s*{\\s*}\\s*({[^}]*})/, which I call from Java like this:

text.replaceAll("element\\s*\\{\"([^\"]+)\"}\\s*\\{text\\s*\\{\\s*}\\s*(\\{[^}]*})", "<$1> $2");

Used on the text below:

element {"item"} {text { } {$i/child::itemno} text { } {$i/child::description} text { } element {"high_bid"} {{max($b/child::bid)}} text { }} 

it gives:

<item> {$i/child::itemno} text { } {$i/child::description} text { } element {"high_bid"} {{max($b/child::bid)}} text { }}

whereas I'm trying to achieve:

<item>{$i/child::itemno}{$i/child::description}<high_bid>{fn:max($b/child::bid)}</high_bid></item> 

After reviewing this, the problem is that the regex only matches once.

Your regex is looking for the shape element {"tag"} {text { } {text_here}

That shape occurs only once in your input:

element {"item"} {text { } {$i/child::itemno}

Nothing else matches:

text { } element {"high_bid"} {   => NO MATCH, text without element before it

element {"high_bid"} {{max($b/child::bid)}} text { }   => NO MATCH, text after braces

So either your input is bad, or you need something better than a one-shot regex.
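One way to confirm the single match is to count how often the pattern is found. The following is a minimal sketch, not part of the original question: the class name MatchCountDemo and the helper countMatches are made up for illustration, while the pattern and the input string are copied from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern and input come from the question.
class MatchCountDemo {
    // The pattern from the question, unchanged.
    static final Pattern ELEMENT = Pattern.compile(
        "element\\s*\\{\"([^\"]+)\"}\\s*\\{text\\s*\\{\\s*}\\s*(\\{[^}]*})");

    // Counts how many non-overlapping matches the pattern has in the input.
    static int countMatches(String input) {
        Matcher m = ELEMENT.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "element {\"item\"} {text { } {$i/child::itemno} text { } "
            + "{$i/child::description} text { } element {\"high_bid\"} "
            + "{{max($b/child::bid)}} text { }}";
        System.out.println(countMatches(input)); // prints 1
    }
}
```

Since replaceAll substitutes only where the pattern is found, one match means one substitution, and the rest of the string passes through untouched.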

That being said, I don't think a single regex will work here. You could start by removing all of the "text { }" elements, which seem to do nothing:

text.replaceAll("text\\s*\\{\\s*}", "");

Which gives you:

element {"item"} { {$i/child::itemno}  {$i/child::description}  element {"high_bid"} {{max($b/child::bid)}} }

But the problem here is that you have nesting. If you are simply matching on braces, how do you know how far to match? Your regex would need to know how many opening braces it has seen so that it can find the correct closing brace, and that is not really doable with regular expressions. You need a function that walks the string, adding one for each opening brace and subtracting one for each closing brace. When the count gets back to zero, you have found a complete set. Of course, that is no longer a regular expression.
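The brace counting described above could look something like the sketch below. The class name BraceMatcher and the method findClosingBrace are hypothetical names chosen for this example, and the code assumes that braces never appear inside quoted strings, which holds for the sample input but may not hold in general.

```java
// Hypothetical helper illustrating brace counting; not from the original post.
class BraceMatcher {
    // Returns the index of the '}' that closes the '{' at openIndex,
    // or -1 if openIndex is not a '{' or the braces are unbalanced.
    // Assumption: braces inside quoted strings are not treated specially.
    static int findClosingBrace(String s, int openIndex) {
        if (openIndex < 0 || openIndex >= s.length() || s.charAt(openIndex) != '{') {
            return -1;
        }
        int depth = 0;
        for (int i = openIndex; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;            // one level deeper
            } else if (c == '}') {
                depth--;            // one level back out
                if (depth == 0) {
                    return i;       // back at the starting level: matching brace
                }
            }
        }
        return -1;                  // ran out of input before the count reached zero
    }

    public static void main(String[] args) {
        // The body of the outer element after the "text { }" pieces were removed.
        String s = "{ {$i/child::itemno}  {$i/child::description}  "
            + "element {\"high_bid\"} {{max($b/child::bid)}} }";
        int inner = s.indexOf('{', 1);
        // Prints the first nested block: {$i/child::itemno}
        System.out.println(s.substring(inner, findClosingBrace(s, inner) + 1));
    }
}
```

With a helper like this, each element {"tag"} {...} body can be cut out at the right closing brace and processed recursively, which is the part a single regex cannot do.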
