
How to get the array of strings that are inside curly braces in the source string in Java

There is a String object in Java with the following contents:

String sourceString="This {is} a sample {string} that {contains} {substrings} inside curly {braces}";

I want an array of strings with the following contents: {is}, {string}, {contains}, {substrings}, {braces}

Below is the code I wrote to get this result, but the output I am getting is:

"{is} a sample {string} that {contains} {substrings} inside curly {braces}"

So, basically, it is taking all the characters between the first opening curly brace and the last closing curly brace.

// Source string
String sourceString="This {is} a sample {string} that {contains} {substrings} inside curly {braces}";

// Regular expression to get the values between curly braces (there is a mistake, I guess)
String regex="\\{(.*)\\}";
Matcher matcher = Pattern.compile(regex).matcher(sourceString);

while (matcher.find()) {
    System.out.println(matcher.group(0));
}

A little bit of Googling found this solution, which gave me ideas for the pattern.

Some time spent with Lesson: Regular Expressions showed me the libraries and functionality I would need to provide this example:

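import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String exp = "\\{(.*?)\\}";

String value = "This {is} a sample {string} that {contains} {substrings} inside curly {braces}";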

Pattern pattern = Pattern.compile(exp);
Matcher matcher = pattern.matcher(value);

List<String> matches = new ArrayList<String>(5);
while (matcher.find()) {
    String group = matcher.group();
    matches.add(group);
}

String[] groups = matches.toArray(new String[matches.size()]);
System.out.println(Arrays.toString(groups));

Which outputs:

[{is}, {string}, {contains}, {substrings}, {braces}]
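
If only the text inside the braces is wanted, the capturing group already present in that pattern can be read with matcher.group(1) instead of matcher.group(). A minimal sketch, assuming a hypothetical helper class InnerTextDemo that is not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the class and method names are assumptions.
class InnerTextDemo {
    // Group 1 is the reluctant (.*?) between the braces.
    private static final Pattern BRACES = Pattern.compile("\\{(.*?)\\}");

    static List<String> innerTexts(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = BRACES.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1)); // text without the surrounding braces
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(innerTexts("This {is} a sample {string}"));
        // prints [is, string]
    }
}
```

Returning a List here is only a convenience; it can be converted to an array with toArray in the same way as before.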

Here's a one-line solution:

String[] resultArray = str.replaceAll("^[^{]*|[^}]*$", "").split("(?<=\\})[^{]*");

This works by first stripping off the leading and trailing junk, then splitting on everything between } and {.

Here's some test code:

String str = "This {is} a sample {string} that {contains} {substrings} inside curly";
String[] resultArray = str.replaceAll("^[^{]*|[^}]*$", "").split("(?<=\\})[^{]*");
System.out.println(Arrays.toString(resultArray));

Output:

[{is}, {string}, {contains}, {substrings}]
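
To see what each half of the one-liner does, the two stages can be run separately. This is a rough sketch with hypothetical names (SplitStagesDemo, trim, extract) that are assumptions for illustration; it also shows an edge case that may matter: for input without any braces, split returns an array holding a single empty string rather than an empty array.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; names are assumptions, not from the answer.
class SplitStagesDemo {
    // Stage 1: drop everything before the first '{' and after the last '}'.
    static String trim(String s) {
        return s.replaceAll("^[^{]*|[^}]*$", "");
    }

    // Stage 2: split on the text that follows each '}' up to the next '{'.
    static String[] extract(String s) {
        return trim(s).split("(?<=\\})[^{]*");
    }

    public static void main(String[] args) {
        String str = "This {is} a sample {string} inside curly";
        System.out.println(trim(str));                        // {is} a sample {string}
        System.out.println(Arrays.toString(extract(str)));    // [{is}, {string}]
        // Edge case: no braces at all yields one empty string, not zero elements.
        System.out.println(extract("no braces here").length); // 1
    }
}
```
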
  • A pattern that matches {characters} can look like \\{[^}]*\\} .
  • Using the Pattern and Matcher classes, you can find each substring that matches this regex.
  • Place each found substring in a List<String>.
  • After the list is filled with all substrings, you can convert it to an array using the yourList.toArray(newStringArray) method.
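
The four steps above can be sketched as follows. The class name BraceExtractor and the method name extract are hypothetical, chosen only for this example; the pattern is the negated-class one from the first step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for illustration.
class BraceExtractor {
    static String[] extract(String source) {
        // Step 1: a '{', then any characters except '}', then a '}'.
        Pattern pattern = Pattern.compile("\\{[^}]*\\}");
        // Step 2: find every substring that matches.
        Matcher matcher = pattern.matcher(source);
        // Step 3: collect each match in a list.
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group());
        }
        // Step 4: convert the list to an array.
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String source = "This {is} a sample {string} that {contains} {substrings} inside curly {braces}";
        System.out.println(Arrays.toString(extract(source)));
        // prints [{is}, {string}, {contains}, {substrings}, {braces}]
    }
}
```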

EDIT after your update

The problem with your regex is that the * quantifier is greedy, which means it will try to find the longest possible match. So in the case of \\{(.*)\\} it will match

  • the first possible { ,
  • zero or more characters,
  • the last possible } , which in the case of

     This {is} a sample {string} that {contains} {substrings} inside curly {braces}

means it will start from the { in {is} and finish at the } in {braces}.
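
A small sketch can confirm this greedy behavior: the loop finds exactly one match that spans from the first { to the last }. The class GreedyDemo and its findAll helper are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions made for illustration.
class GreedyDemo {
    // Returns every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "This {is} a sample {string}";
        // Greedy .* runs to the end of the input, then backtracks to the last '}'.
        System.out.println(findAll("\\{(.*)\\}", input));
        // prints [{is} a sample {string}]
    }
}
```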

To make * find the minimal set of characters that can form a matching substring, you need to either

  • add ? after it, making *? a reluctant quantifier, or
  • describe your regex as I did originally and exclude } from the possible match between { and } : instead of . , which matches any character, use [^}] , which matches any character except } .
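
Both fixes can be compared side by side. The sketch below uses a hypothetical class QuantifierFixes (the name is an assumption); it also shows one case where the two forms differ: by default . does not match line terminators, so the reluctant version finds nothing when a brace pair spans two lines, while the [^}] version still matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions made for illustration.
class QuantifierFixes {
    // Returns every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "a {b} c {d}";
        System.out.println(findAll("\\{.*?\\}", input));   // prints [{b}, {d}]
        System.out.println(findAll("\\{[^}]*\\}", input)); // prints [{b}, {d}]

        // A brace pair that spans a line break.
        String multiLine = "{x\ny}";
        System.out.println(findAll("\\{.*?\\}", multiLine).size());   // prints 0
        System.out.println(findAll("\\{[^}]*\\}", multiLine).size()); // prints 1
    }
}
```

If multi-line content must be matched with the reluctant form, compiling the pattern with Pattern.DOTALL is one option.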
