Java REGEX - Not able to remove content inside tag
This is my input text:
[QUOTE=SynapseBreak;104047835]Armchio de dragon is satki dragon lai de leh
[URL="https://play.google.com/store/apps/details?id=com.shiportal.hwzreader&referrer=utm_source%3Dsignature%26utm_medium%3Dforum"]Sent from 權志-龍 using GAGT[/URL][/QUOTE]
why satki ? tell me :s13:
[QUOTE=articated;104047854]I not sad lah
U happy i happy kym
Just for fun loh :s12:
[ms]自從我變成了狗屎,就再也沒有人敢踩在我頭上了 HardwareZone Forums app[/ms][/QUOTE]
today arti jin sweet make me happy :s12:
[QUOTE=Iandao;104047967]Gg mbs now...[/QUOTE]
go there jiak simi ??
I am trying to remove all the content inside the [QUOTE] [/QUOTE] tags, along with the tags themselves.
I want the output to be:
why satki ? tell me :s13: today arti jin sweet make me happy :s12: go there jiak simi ??
The code I tried is:
string.replaceAll("\\[QUOTE.*\\[/QUOTE\\]", "")
Note that you may use the following fix for your pattern only if the input does not contain nested [QUOTE] tags.
A . in your regex does not match line breaks, and .* is too greedy, i.e. it will match up to the last occurrence of [/QUOTE] on a line/in a string.
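To make the two problems concrete, here is a small sketch on a made-up two-quote string (the class name, the `strip` helper and the sample text are my own, not from the question). It runs the original pattern, a DOTALL-but-greedy variant, and a DOTALL-and-lazy variant so the differences can be compared side by side:

```java
final class GreedyVsLazyDemo {
    // Hypothetical sample: the "\n" inside the first quote mimics the multi-line posts.
    static final String SAMPLE =
            "[QUOTE=a;1]first\nline[/QUOTE] keep1 [QUOTE=b;2]second[/QUOTE] keep2";

    // Removes every match of the given regex from SAMPLE.
    static String strip(String regex) {
        return SAMPLE.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // Original attempt: '.' stops at the line break, so the first quote survives.
        System.out.println(strip("\\[QUOTE.*\\[/QUOTE\\]"));
        // DOTALL but greedy: one match runs from the first [QUOTE to the last [/QUOTE].
        System.out.println(strip("(?s)\\[QUOTE.*\\[/QUOTE\\]"));
        // DOTALL and lazy: each quote block is matched and removed on its own.
        System.out.println(strip("(?s)\\[QUOTE.*?\\[/QUOTE\\]"));
    }
}
```

Only the third call keeps both `keep1` and `keep2` while dropping both quotes.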
Use lazy dot matching with the Pattern.DOTALL inline modifier (embedded flag option) (?s) that will force the . to match any char:
"(?s)\\[QUOTE=.*?\\[/QUOTE\\]"
 ^^^^         ^^^
See this regex demo.
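If you would rather not embed the flag in the pattern string, the same behaviour can be had by passing `Pattern.DOTALL` to `Pattern.compile`. A minimal sketch follows; the class and method names are illustrative, and the sample post is the last quote from the question with a line break added inside it so that DOTALL actually matters:

```java
import java.util.regex.Pattern;

final class DotallFlagDemo {
    // Same expression as "(?s)\\[QUOTE=.*?\\[/QUOTE\\]", with DOTALL given as a compile flag.
    private static final Pattern QUOTE_BLOCK =
            Pattern.compile("\\[QUOTE=.*?\\[/QUOTE\\]", Pattern.DOTALL);

    // Removes every [QUOTE=...]...[/QUOTE] block (assumes quotes are not nested).
    static String stripQuotes(String text) {
        return QUOTE_BLOCK.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String post = "[QUOTE=Iandao;104047967]Gg mbs\nnow...[/QUOTE]\ngo there jiak simi ??";
        System.out.println(stripQuotes(post));
    }
}
```

Compiling once and reusing the `Pattern` also avoids recompiling the regex on every `String.replaceAll` call, which may matter if many posts are processed.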
Or, unroll the lazy dot (to make the pattern find matches faster) as:
"\\[QUOTE=[^\\[]*(?:\\[(?!/QUOTE\\])[^\\[]*)*\\[/QUOTE\\]"
See this regex demo.
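The unrolled pattern is hard to read on one line, so here is the same expression laid out piece by piece using `Pattern.COMMENTS`, which lets whitespace and `#` comments appear inside the regex. This is only an annotated restatement; the class name and the test string are made up for illustration:

```java
import java.util.regex.Pattern;

final class UnrolledPatternDemo {
    // Pattern.COMMENTS ignores whitespace and '#' comments inside the regex.
    static final Pattern UNROLLED = Pattern.compile(
            "\\[QUOTE=            # literal opening prefix\n"
          + "[^\\[]*              # any run of characters other than an opening bracket\n"
          + "(?:                  # then, zero or more times:\n"
          + "  \\[(?!/QUOTE\\])   #   an opening bracket that does not begin the closing tag\n"
          + "  [^\\[]*            #   followed by another run without opening brackets\n"
          + ")*\n"
          + "\\[/QUOTE\\]         # literal closing tag",
            Pattern.COMMENTS);

    public static void main(String[] args) {
        // Hypothetical input with an inner [URL] tag and a line break inside the quote.
        String text = "a[QUOTE=x;1]b[URL]c[/URL]\nd[/QUOTE]e";
        System.out.println(UNROLLED.matcher(text).replaceAll(""));
    }
}
```

No DOTALL flag is needed here because the negated class `[^\[]` already matches line breaks.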
String pat = "\\[QUOTE=[^\\[]*(?:\\[(?!/QUOTE])[^\\[]*)*\\[/QUOTE]";
String str = "[QUOTE=SynapseBreak;104047835]Armchio de dragon is satki dragon lai de leh\n\n[URL=\"https://play.google.com/store/apps/details?id=com.shiportal.hwzreader&referrer=utm_source%3Dsignature%26utm_medium%3Dforum\"]Sent from 權志-龍 using GAGT[/URL][/QUOTE]\nwhy satki ? tell me :s13:\n[QUOTE=articated;104047854]I not sad lah\nU happy i happy kym\n\nJust for fun loh :s12:\n[ms]自從我變成了狗屎,就再也沒有人敢踩在我頭上了 HardwareZone Forums app[/ms][/QUOTE]\ntoday arti jin sweet make me happy :s12:\n\n[QUOTE=Iandao;104047967]Gg mbs now...[/QUOTE]\ngo there jiak simi ??'";
String res = str.replaceAll(pat, "");
System.out.println(res);
// => why satki ? tell me :s13:
//
// today arti jin sweet make me happy :s12:
//
//
// go there jiak simi ??'
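The result above still contains the line breaks that surrounded the removed quotes, whereas the question shows the wanted output on a single line. If that single-line form is really what is needed (an assumption on my part), one possible follow-up is to collapse runs of whitespace afterwards. A sketch with an illustrative class name and a shortened version of the input:

```java
final class CollapseWhitespaceDemo {
    // Strips non-nested quote blocks, then squeezes all whitespace runs into single spaces.
    static String stripAndFlatten(String text) {
        String withoutQuotes = text.replaceAll("(?s)\\[QUOTE=.*?\\[/QUOTE\\]", "");
        return withoutQuotes.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the posts in the question (quote bodies abbreviated).
        String posts = "[QUOTE=a;1]x[/QUOTE]\nwhy satki ? tell me :s13:\n"
                + "[QUOTE=b;2]y\nz[/QUOTE]\ntoday arti jin sweet make me happy :s12:\n\n"
                + "[QUOTE=c;3]w[/QUOTE]\ngo there jiak simi ??";
        System.out.println(stripAndFlatten(posts));
    }
}
```

Note that this also flattens any deliberate line breaks inside the replies themselves, so skip this step if those should be preserved.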
Your regex does not take line breaks into account. To fix that, add (?s) at the beginning so that . also matches newlines, and make the quantifier lazy (.*?) so that each match stops at the first [/QUOTE]:
string.replaceAll("(?s)\\[QUOTE.*?\\[/QUOTE\\]", "");
(?s)\\[QUOTE.*?\\[/QUOTE\\]
Try the above regex; it should work.
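As the first answer notes, these patterns assume that [QUOTE] blocks are never nested. If nesting can occur, one possible workaround (not part of either original answer) is to remove innermost quotes repeatedly until nothing changes. The sketch below uses a tempered dot, `(?:(?!\[/?QUOTE).)*`, so that a match cannot span another opening or closing QUOTE tag; the class name and sample are hypothetical:

```java
import java.util.regex.Pattern;

final class NestedQuoteStripper {
    // Matches an innermost quote: no further [QUOTE or [/QUOTE between the two tags.
    private static final Pattern INNERMOST =
            Pattern.compile("\\[QUOTE=(?:(?!\\[/?QUOTE).)*\\[/QUOTE\\]", Pattern.DOTALL);

    // Peels quotes from the inside out until a pass removes nothing.
    static String strip(String text) {
        String previous;
        do {
            previous = text;
            text = INNERMOST.matcher(text).replaceAll("");
        } while (!text.equals(previous));
        return text;
    }

    public static void main(String[] args) {
        String nested = "[QUOTE=a;1]outer [QUOTE=b;2]inner[/QUOTE] tail[/QUOTE] reply";
        System.out.println(strip(nested));
    }
}
```

Each pass rescans the whole string, so this is fine for forum posts but may be slow on very large or very deeply nested input; unbalanced tags are left in place.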