Replace newlines within string except for ones within json

I'm using this post as a reference for this question: "How do I regex remove whitespace and newlines from a text, except for when they are in a json's string?"

I have the following string in a Java program:

"stuff\n blah\n--payload {'meh': 'kar\n'}"

I'm looking for a regex to replace the newline characters in the entire string except for the ones within the JSON string. The result I'm expecting is:

"stuff blah --payload {'meh': 'kar\n'}"

The regex referenced in that post works fine for most cases, but it replaces the \n within the JSON string as well. The end result I get is:

"stuff blah --payload {'meh': 'kar'}"

I've been experimenting with the following set of regexes:

^("[^"]*(?:""[^"]*)*")(\n+)  // I expected this to be a combination of newline and newline not within double quotes

[\n\r]\s*  //Match new lines, and then could possibly negate it to be within double quotes?

But I still can't handle the case where the newline character inside a JSON string value has to be preserved. Is there a possible solution?

I believe you're over-complicating this in two ways:

  1. Using regex for anything involving JSON.
  2. Trying to solve for the entire string at once.

JSON

Regex + JSON, like Regex + HTML, just don't mix.

Break the problem up

If the JSON is always at the end, and always delimited by a known string, you can:

  • Split the string at the last delimiter (--payload in your example).
  • Process the first string (strip the newlines).
  • Smush them back together.
  • Profit.
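
The steps above can be sketched roughly as follows. This is only an illustration, assuming the payload always starts at the last occurrence of the delimiter; the class name SplitAtDelimiter and the helper stripNewlinesBeforePayload are hypothetical names, not part of the question. It replaces each run of newlines (plus any surrounding whitespace) with a single space, which is an assumption chosen to reproduce the expected output in the question.

```java
class SplitAtDelimiter {
    // Hypothetical helper: collapses newlines before the last occurrence of
    // `delimiter` and leaves everything from the delimiter onward untouched.
    static String stripNewlinesBeforePayload(String input, String delimiter) {
        // Assumption: a run of newlines, with any whitespace around it,
        // becomes exactly one space.
        String newlineRun = "\\s*[\\r\\n]+\\s*";

        int idx = input.lastIndexOf(delimiter);
        if (idx < 0) {
            // No delimiter found: treat the whole string as plain text.
            return input.replaceAll(newlineRun, " ");
        }
        String head = input.substring(0, idx).replaceAll(newlineRun, " ");
        String tail = input.substring(idx); // payload, kept verbatim
        return head + tail;
    }

    public static void main(String[] args) {
        String input = "stuff\n blah\n--payload {'meh': 'kar\n'}";
        System.out.println(stripNewlinesBeforePayload(input, "--payload"));
    }
}
```

Because the payload is never passed through a regex, nothing inside it needs escaping and its newlines survive as they are.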

This might help:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PayloadNewlines {
    public static void main(String[] args) {
        String input = "stuff\n blah\n--payload {'meh': 'kar\n'}";
        // Wanted output: "stuff blah --payload {'meh': 'kar\n'}"

        // Matches "--payload" up to the first closing brace.
        // A negated class such as [^}] also matches newlines, so DOTALL is not needed.
        Matcher payloadMatcher = Pattern.compile("--payload\\s[^}]+\\}").matcher(input);

        String tag = "#PAYLOAD#";
        String tagged = input; // fall back to the whole input if no payload is found
        String payload = "";
        if (payloadMatcher.find()) {
            payload = " " + payloadMatcher.group();
            // Swap the payload for a placeholder tag so its newlines are protected.
            tagged = input.substring(0, payloadMatcher.start()) + tag
                    + input.substring(payloadMatcher.end());
        }

        // Strip newlines outside the payload, then put the payload back.
        // String.replace is literal, so the payload needs no regex escaping.
        String restored = tagged.replace("\n", "").replace(tag, payload);

        // Prints "stuff blah --payload {'meh': 'kar\n'}" (the payload's \n stays a real newline)
        System.out.println(restored);
    }
}
