Regex to identify JSON in a String to split using : separator

I have an input string that is split using : (colon) as the separator. The string may consist entirely of plain string values, or it may contain a JSON object as one of its parts.

For example:

Case#1: 123:HARRY_POTTER:ENGLAND:MALE

Case#2: 123:HARRY_POTTER:[{"key":"City", "value":"LONDON"}]:MALE

The existing code uses str.split(":"), which handles case#1. In case#2, however, the JSON part of the string itself contains : characters, which should be ignored when splitting, so the program breaks.

I need (1) a regex that identifies the JSON in the string and (2) a regex that does not split on a : that is preceded and followed by " (as in ":"), which is how it appears in the JSON string.

So if the string is identified as containing JSON, I can use str.split(<regex-to-split-string-with-json>).

I came up with these regexes to match a " preceding the :, but unfortunately none of them work:

Negative lookbehind: (?<!\"): and (?<!\")[:]

Positive lookbehind: (?<=\"): and (?<=\")[:]

Please suggest!

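Please try the regex (?<=[a-zA-Z0-9\]]): (written as "(?<=[a-zA-Z0-9\\]]):" in a Java string literal). It splits on a : only when the preceding character is a letter, a digit, or ], and it works as expected for both of your cases: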

import java.util.Arrays;

public class Solution {
    public static void main(String[] args) {
        String str = "123:HARRY_POTTER:[{\"key\":\"City\", \"value\":\"LONDON\"}]:MALE";
        // Split on ':' only when it is preceded by a letter, a digit, or ']'
        String regex = "(?<=[a-zA-Z0-9\\]]):";
        String[] arr = str.split(regex);
        System.out.println("Length: " + arr.length);
        System.out.println(Arrays.toString(arr));
    }
}

Output:

Length: 4
[123, HARRY_POTTER, [{"key":"City", "value":"LONDON"}], MALE]
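
To check the same pattern against both cases from the question, here is a minimal sketch. The class name SplitBothCases and the helper splitFields are hypothetical names chosen for illustration; only the regex comes from the answer above.

```java
import java.util.Arrays;

class SplitBothCases {
    // Hypothetical helper: split on ':' only when the character before it
    // is a letter, a digit, or ']' (the lookbehind from the answer above).
    static String[] splitFields(String input) {
        return input.split("(?<=[a-zA-Z0-9\\]]):");
    }

    public static void main(String[] args) {
        String plain = "123:HARRY_POTTER:ENGLAND:MALE";
        String withJson = "123:HARRY_POTTER:[{\"key\":\"City\", \"value\":\"LONDON\"}]:MALE";

        // prints [123, HARRY_POTTER, ENGLAND, MALE]
        System.out.println(Arrays.toString(splitFields(plain)));
        // prints [123, HARRY_POTTER, [{"key":"City", "value":"LONDON"}], MALE]
        System.out.println(Arrays.toString(splitFields(withJson)));
    }
}
```

This works for the sample data because every : inside the JSON part follows a ". It would likely misbehave if a field ended in some other character (for example an underscore), or if a JSON value contained a : directly after a letter or digit.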

For the example data, it may be enough to match either from [ to ] or any run of characters other than :

But because this data comes in mixed, it is not easy to determine what is actually valid JSON. You would still have to validate that afterwards.

Note that this is a brittle solution: it does not take any nesting of the square brackets into account.

\[[^]\[\r\n]+]|[^\r\n:]+
  • \[[^]\[\r\n]+] Match from [ to ], with 1+ characters in between that are not [, ], or a newline
  • | Or
  • [^\r\n:]+ Match 1+ times any char except : or a newline

See a regex demo and a Java demo.

Example

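import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Example {
    public static void main(String[] args) {
        // Either a [...] block without nested brackets, or a run of non-colon characters
        String regex = "\\[[^]\\[\\r\\n]+]|[^\\r\\n:]+";
        String string = "123:HARRY_POTTER:[{\"key\":\"City\", \"value\":\"LONDON\"}]:MALE";

        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(string);

        // Print each matched field on its own line
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}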

Output

123
HARRY_POTTER
[{"key":"City", "value":"LONDON"}]
MALE
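
As noted above, the pattern does not handle nested square brackets. If nesting matters, one possible alternative is to drop the regex and scan the string by hand, tracking bracket depth and quoted strings. The sketch below is a different technique from the regex in this answer, not part of it. The class name DepthAwareSplitter and its split method are hypothetical, and it assumes that plain (non-JSON) fields never contain ", [, ], { or }. It does not validate the JSON either.

```java
import java.util.ArrayList;
import java.util.List;

class DepthAwareSplitter {
    // Hypothetical helper: split on ':' only when the scanner is outside
    // any [...] or {...} and outside a double-quoted string.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // current nesting depth of [ and {
        boolean inQuotes = false; // true while inside "..."

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuotes) {
                if (c == '\\' && i + 1 < input.length()) {
                    // Keep the backslash and take the escaped char as-is
                    current.append(c);
                    c = input.charAt(++i);
                } else if (c == '"') {
                    inQuotes = false;
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == '[' || c == '{') {
                depth++;
            } else if (c == ']' || c == '}') {
                depth = Math.max(0, depth - 1);
            } else if (c == ':' && depth == 0) {
                // Top-level separator: close the current field
                parts.add(current.toString());
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        String s = "123:HARRY_POTTER:[{\"key\":\"City\", \"value\":\"LONDON\"}]:MALE";
        for (String part : split(s)) {
            System.out.println(part);
        }
    }
}
```

For the sample string this prints the same four fields as the regex version. Each extracted JSON part would still need to be validated afterwards with a real JSON parser.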
