
Removing each string between two strings and occurring after a specific string

I have a String in Java:

style="hello World">One-time meetings< style=\\"Hello Again"> stop "Hello"

I want to remove everything between the double quotes that immediately follow each occurrence of the string "style".

So, after the removal, the above String should look like:

style="">One-time meetings< style=\\""> stop "Hello"

~Thanks

If you want to remove everything between the quotes of each style attribute, a simple replaceAll() with a reluctant quantifier should do the trick:

String input = "style=\"hello World\">One-time meetings< style=\"Hello Again\"> stop \"Hello\"";
input = input.replaceAll("style=\"(.*?)\"", "style=\"\"");
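The reluctant quantifier `.*?` is what keeps each match confined to a single attribute. As a minimal sketch, the class below runs the same replacement next to a greedy `.*` variant on the sample input. The class name `StyleQuoteDemo` and both helper names are made up for illustration and are not part of the original answer.

```java
public class StyleQuoteDemo {
    // Reluctant match: stops at the first closing quote after style="
    static String clearReluctant(String s) {
        return s.replaceAll("style=\"(.*?)\"", "style=\"\"");
    }

    // Greedy match: runs on to the last quote in the whole string
    static String clearGreedy(String s) {
        return s.replaceAll("style=\"(.*)\"", "style=\"\"");
    }

    public static void main(String[] args) {
        String input = "style=\"hello World\">One-time meetings< style=\"Hello Again\"> stop \"Hello\"";
        // prints: style="">One-time meetings< style=""> stop "Hello"
        System.out.println(clearReluctant(input));
        // prints: style=""
        System.out.println(clearGreedy(input));
    }
}
```

With the greedy variant the whole input collapses into one empty attribute, because the match extends to the closing quote of "Hello" at the end of the string.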

Update:

From inspecting your raw input, it appears that the quotes around the style attribute values are already escaped by a single backslash. If that is the case, then the following replacement should give you what you want:

String input = "style=\\\"hello World\\\">One-time meetings< style=\\\"Hello Again\\\"> stop \"Hello\"";
input = input.replaceAll("style=\\\\\"(.*?)\\\\\"", "style=\\\\\"\\\\\"");
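If the input mixes plain and backslash-escaped quotes, one possible approach is a single pattern that tolerates any number of backslashes before each quote and puts the delimiters back unchanged. This is only a sketch: the class name `StyleAttrCleaner` and the method `clearStyleValues` are hypothetical, and the pattern assumes the attribute value itself contains no quotes or backslashes.

```java
import java.util.regex.Pattern;

public class StyleAttrCleaner {
    // Group 1: "style=", optional backslashes, opening quote.
    // Middle:  the value, assumed to contain no quote or backslash.
    // Group 2: optional backslashes, closing quote.
    private static final Pattern STYLE_VALUE =
            Pattern.compile("(style=\\\\*\")[^\"\\\\]*(\\\\*\")");

    // Drops the value and keeps both delimiters exactly as they were written.
    static String clearStyleValues(String s) {
        return STYLE_VALUE.matcher(s).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        // First attribute uses plain quotes, second uses backslash-escaped quotes.
        String input = "style=\"hello World\">One-time meetings< style=\\\"Hello Again\\\"> stop \"Hello\"";
        // prints: style="">One-time meetings< style=\"\"> stop "Hello"
        System.out.println(clearStyleValues(input));
    }
}
```

Capturing the delimiters and reinserting them with `$1$2` avoids hard-coding the escaping in the replacement string, so the same call covers both variants shown in the answer.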

I think that parsing HTML with regex is a bad idea. Please use a parser such as JSoup.

Example code:

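// Assumes html holds the markup and that org.jsoup.Jsoup and org.jsoup.nodes.Document are imported
Document doc = Jsoup.parse(html);
// "[style]" selects every element that has a style attribute (".style" would select by class)
doc.select("[style]").attr("style", "");
String htmlWithoutStyle = doc.outerHtml();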
