Use regex to extract a JSON object from a large text input file
I have a large text file containing unrelated text and, somewhere in it, a JSON object. I know the JSON object contains a keyword that is unique within the whole file, so I search for that keyword. I also know that this keyword is always present in the object and always sits directly at the root level. Here is an example JSON string:
....
{
"key0":"value0",
"key1":"value0",
"key2":"value0",
"uniqueKey":"value0",
"key0":[
{"key0":"value0","key1":"value1"}
]
}
....
So I wrote this method to extract the JSON object. It works fine, but I wondered: could a regex do this?
// Requires com.google.gson.JsonObject and com.google.gson.JsonParser, plus a logger field named log.
private JsonObject parse(String text, String keywordInJsonFile) {
    int index = text.indexOf(keywordInJsonFile);
    if (index < 0) {
        // Without this check the backward scan below would start at index -1 and throw.
        throw new IllegalArgumentException("Keyword not found: " + keywordInJsonFile);
    }
    int lastIndex = text.lastIndexOf(keywordInJsonFile);
    if (index != lastIndex) {
        log.warn("The keyword '{}' isn't unique, please check your input file", keywordInJsonFile);
        log.warn("Continuing with the first match at index {}", index);
    }
    int indexJsonStart;
    int indexJsonStop;
    int currentIndex = index;
    int bracketCounter = 0;
    // Scan backwards for the '{' that opens the object containing the keyword.
    // Nested objects passed on the way raise and lower the counter, so it only
    // reaches -1 at the enclosing '{'.
    while (true) {
        currentIndex--;
        char c = text.charAt(currentIndex);
        if (c == '}') bracketCounter++;
        if (c == '{') bracketCounter--;
        if (c == '{' && bracketCounter == -1) {
            indexJsonStart = currentIndex;
            break;
        }
    }
    currentIndex = index;
    bracketCounter = 0;
    // Scan forwards for the '}' that closes the same object.
    while (true) {
        currentIndex++;
        char c = text.charAt(currentIndex);
        if (c == '}') bracketCounter++;
        if (c == '{') bracketCounter--;
        if (c == '}' && bracketCounter == 1) {
            indexJsonStop = currentIndex + 1;
            break;
        }
    }
    // Gson: the substring has to span exactly from '{' to '}'.
    return new JsonParser().parse(text.substring(indexJsonStart, indexJsonStop)).getAsJsonObject();
}
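One caveat with the bracket-counting method: it counts every '{' and '}', including ones inside string values, so a value such as "a } b" after the keyword would end the forward scan too early. Below is a minimal sketch of a string-aware variant, not a drop-in replacement. The names BraceScanSketch, findClosingBrace and extractObject are made up for illustration, and it returns the raw substring instead of a Gson JsonObject so that it runs without dependencies. It still assumes that no '{' sits inside a string value between the object's opening brace and the keyword, since such a brace could mislead the backward search.

```java
final class BraceScanSketch {

    // Returns the index just past the '}' that closes the '{' at openIndex,
    // or -1 if the text ends first. Braces inside double-quoted strings are skipped.
    static int findClosingBrace(String text, int openIndex) {
        int depth = 0;
        boolean inString = false;
        for (int i = openIndex; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character, e.g. \" inside a string
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
        }
        return -1;
    }

    // Tries each '{' before the keyword, nearest first, and returns the first
    // balanced {...} block that ends after the keyword, or null if there is none.
    static String extractObject(String text, String keyword) {
        int keywordIndex = text.indexOf(keyword);
        if (keywordIndex < 0) {
            return null;
        }
        int open = text.lastIndexOf('{', keywordIndex);
        while (open >= 0) {
            int end = findClosingBrace(text, open);
            if (end > keywordIndex) {
                return text.substring(open, end);
            }
            open = text.lastIndexOf('{', open - 1);
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "noise } noise\n"
                + "{\"key0\":{\"a\":\"x\"},\"uniqueKey\":\"has } brace\",\"key1\":[{\"b\":\"y\"}]}\n"
                + "more { noise";
        System.out.println(extractObject(text, "uniqueKey"));
    }
}
```

The nested object before the keyword is rejected because it closes before the keyword starts, so the search moves on to the enclosing brace.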
So I asked myself: is it possible to do this with a regex? One Saturday evening later, I don't think so. I can't figure out how to express "give me the first opening bracket that hasn't been closed yet" or "give me the first closing bracket that hasn't been opened yet". Can someone help me out?
Alternative using a regex:
"^\\{\n^\\s+\"[^\"]+\":\"[^\"]+\",\n.*?^\\}\n"
See the regex in context:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; the original snippet showed only main.
public class RegexExample {
    public static void main(String[] args) {
        String input = "dfga gsdgdf fdgdfsgfd asdfgf\n"
                + "AAAA SSSSSS ddddddddd ffffffff ggggggg\n"
                + "{\n"
                + " \"key0\":\"value0\",\n"
                + " \"key1\":\"value0\",\n"
                + " \"key2\":\"value0\",\n"
                + " \"uniqueKey\":\"value0\",\n"
                + " \"key0\":[\n"
                + " {\"key0\":\"value0\",\"key1\":\"value1\"}\n"
                + "\n"
                + " ]\n"
                + "}\n"
                + "dfga gsdgdf fdgdfsgfd asdfgf\n"
                + "BBBB cccccccc ZZZZZZZ xxxxxxxxxxx cccccccccccc\n";
        // MULTILINE lets ^ match at every line start; DOTALL lets .*? run across line breaks.
        Matcher matcher = Pattern
                .compile("^\\{\n^\\s+\"[^\"]+\":\"[^\"]+\",\n.*?^\\}\n",
                        Pattern.MULTILINE | Pattern.DOTALL)
                .matcher(input);
        while (matcher.find()) {
            String result = matcher.group();
            // Output
            System.out.println(result);
        }
    }
}
Output:
{
"key0":"value0",
"key1":"value0",
"key2":"value0",
"uniqueKey":"value0",
"key0":[
{"key0":"value0","key1":"value1"}
]
}
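Note that this pattern relies on the layout: the opening '{' and the closing '}' must each start a line, and the first entry must be a simple "key":"value" pair. java.util.regex has no recursive patterns, so "match the brace that balances this one" cannot be written for arbitrary nesting, but it can be written for a fixed maximum depth. The following is only a sketch under that assumption: NestedBraceRegexSketch, bracesUpToDepth, extractObject and the maxDepth parameter are hypothetical names, braces inside string values are not handled, and stray braces in the surrounding text that happen to balance around the object would be swallowed into the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestedBraceRegexSketch {

    // Builds a pattern for a {...} block whose braces nest at most maxDepth levels.
    // Each level wraps the previous one: '{', brace-free text, any number of
    // (inner block + brace-free text), then '}'.
    static Pattern bracesUpToDepth(int maxDepth) {
        String noBraces = "[^{}]*+"; // possessive, so failed attempts do not backtrack
        String block = "\\{" + noBraces + "\\}"; // depth 1: no nested braces
        for (int depth = 2; depth <= maxDepth; depth++) {
            block = "\\{" + noBraces + "(?:" + block + noBraces + ")*+\\}";
        }
        return Pattern.compile(block);
    }

    // Returns the first balanced block that contains the keyword, or null.
    static String extractObject(String text, String keyword, int maxDepth) {
        Matcher matcher = bracesUpToDepth(maxDepth).matcher(text);
        while (matcher.find()) {
            if (matcher.group().contains(keyword)) {
                return matcher.group();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "junk } junk\n"
                + "{\"k\":\"v\",\"uniqueKey\":\"x\",\"list\":[{\"a\":\"b\"}]}\n"
                + "trailing { junk";
        System.out.println(extractObject(text, "uniqueKey", 3));
    }
}
```

If maxDepth is smaller than the real nesting, the pattern only matches inner blocks, so the keyword check fails and the method returns null rather than a truncated object.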