
Java String regex: split and capture the split portion

The following is the string: Card 41: Slot Type : SFC Card 42: Slot Type : PFC Card 43: Slot Type : GFC Operational State : Empty Card 44: Slot Type : KFC Card 45: Slot Type : SFC

I want to split it so that I end up with a map of (41,SFC), (42,PFC), (43,GFC), (44,KFC), and so on. Currently I am using the regex "\\s*Card\\s*\\d+\\s*:". Is it possible to split and capture with the same regex? That is, I want to split with "\\s*Card\\s*(\\d+)\\s*:" and capture the (\\d+).

Here's an example of what you want to achieve.

String input = "Card 41: Slot Type : SFC Card 42: Slot Type : " +
                "PFC Card 43: Slot Type : GFC Operational State : Empty " +
                "Card 44: Slot Type : KFC Card 45: Slot Type : SFC";
//                           | literal "Card"
//                           |   | one or more whitespace characters
//                           |   |   | group 1: one or more digits
//                           |   |   |     | one or more of any character, reluctantly
//                           |   |   |     |  | group 2: exactly three uppercase letters
Pattern p = Pattern.compile("Card\\s+(\\d+).+?([A-Z]{3})");
Matcher m = p.matcher(input);
// key set of map will be ordered lexicographically
// if you want to retain insertion order instead, use LinkedHashMap
// for better performance, just a HashMap
Map<String, String> map = new TreeMap<String, String>();
// iterate over find
while (m.find()) {
    map.put(m.group(1), m.group(2));
}
System.out.println(map);

Output

{41=SFC, 42=PFC, 43=GFC, 44=KFC, 45=SFC}
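On the literal question of splitting and capturing with one regex: String.split returns only the text between delimiter matches, so digits matched inside the delimiter are not handed back. A minimal sketch of one way around that, assuming the asker's delimiter pattern with the digits wrapped in a group, is to walk the delimiter matches with Matcher.find() and slice the input between them. The class and method names (SplitAndCapture, splitKeepingNumbers) are hypothetical, chosen for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitAndCapture {
    // The asker's delimiter, with the card number in capturing group 1.
    private static final Pattern DELIMITER =
            Pattern.compile("\\s*Card\\s*(\\d+)\\s*:");

    // Maps each card number to the trimmed text that follows its delimiter,
    // up to the next delimiter (or the end of the input). Hypothetical helper.
    static Map<String, String> splitKeepingNumbers(String input) {
        Map<String, String> sections = new LinkedHashMap<>();
        Matcher m = DELIMITER.matcher(input);
        String pendingKey = null; // card number whose section is still open
        int sectionStart = 0;     // index just past the last delimiter
        while (m.find()) {
            if (pendingKey != null) {
                // Close the previous section at the start of this delimiter.
                sections.put(pendingKey,
                        input.substring(sectionStart, m.start()).trim());
            }
            pendingKey = m.group(1);
            sectionStart = m.end();
        }
        if (pendingKey != null) {
            // The last section runs to the end of the input.
            sections.put(pendingKey, input.substring(sectionStart).trim());
        }
        return sections;
    }

    public static void main(String[] args) {
        String input = "Card 41: Slot Type : SFC Card 42: Slot Type : "
                + "PFC Card 43: Slot Type : GFC Operational State : Empty "
                + "Card 44: Slot Type : KFC Card 45: Slot Type : SFC";
        // Print each card number with the raw section that followed it.
        splitKeepingNumbers(input)
                .forEach((card, section) -> System.out.println(card + " -> " + section));
    }
}
```

Each value here is the whole section (for example "Slot Type : SFC"), so a second step would still be needed to pull out the slot type; matching both parts in one pattern avoids that extra pass.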

To tokenize, use capture groups

This regex will parse your string:

Card (\d+): Slot Type : (\w+)

As you can see in the right pane of the Regex Demo, capture groups 1 and 2 contain the tuples you want.

Sample Java Code

Here is how to retrieve your tuples:

Pattern regex = Pattern.compile("Card (\\d+): Slot Type : (\\w+)");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // The Card
    System.out.println(regexMatcher.group(1));
    // The Slot Type
    System.out.println(regexMatcher.group(2));
} 

Of course, instead of printing the values, you can assign them to any data structure you like.
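As a rough sketch of that idea, the loop can fill a map instead of printing. This assumes the card numbers fit in an int and that insertion order should be kept (hence LinkedHashMap); the class name CardSlotMap and the method parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CardSlotMap {
    // Same pattern as in the sample above: group 1 = card number, group 2 = slot type.
    private static final Pattern CARD =
            Pattern.compile("Card (\\d+): Slot Type : (\\w+)");

    // Collects (card number, slot type) pairs in the order they appear.
    // Assumption: card numbers are small enough to parse as int.
    static Map<Integer, String> parse(String subjectString) {
        Map<Integer, String> slots = new LinkedHashMap<>();
        Matcher m = CARD.matcher(subjectString);
        while (m.find()) {
            slots.put(Integer.valueOf(m.group(1)), m.group(2));
        }
        return slots;
    }

    public static void main(String[] args) {
        String subjectString = "Card 41: Slot Type : SFC Card 42: Slot Type : "
                + "PFC Card 43: Slot Type : GFC Operational State : Empty "
                + "Card 44: Slot Type : KFC Card 45: Slot Type : SFC";
        System.out.println(parse(subjectString));
        // prints {41=SFC, 42=PFC, 43=GFC, 44=KFC, 45=SFC}
    }
}
```

Using Integer keys means a TreeMap would sort the cards numerically rather than lexicographically, should sorted order matter.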

Explanation

  • Card matches literal chars
  • (\\d+) captures the number to Group 1
  • : Slot Type : matches literal chars
  • (\\w+) captures the slot type to Group 2

Card (\\d+):.+?: ?(\\w+) should do the trick with the global modifier. Java has no such flag; calling Matcher.find() in a loop gives the same effect.

  • (\\d+) captures the digits after Card
  • (\\w+) captures the word characters after the second colon
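For illustration, here is one way the pattern above might be applied to every match in Java. This sketch assumes Java 9 or later, where Matcher.results() streams all matches and plays the role of a global modifier. The class name CardSlotStream and the method parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CardSlotStream {
    // Group 1 = digits after "Card ", group 2 = word characters after the second colon.
    private static final Pattern CARD =
            Pattern.compile("Card (\\d+):.+?: ?(\\w+)");

    // Streams every match (Java 9+) and collects the two groups into a map
    // that keeps the order in which the cards appear.
    static Map<String, String> parse(String input) {
        return CARD.matcher(input).results()
                .collect(Collectors.toMap(
                        r -> r.group(1),             // card number
                        r -> r.group(2),             // slot type
                        (first, second) -> second,   // on a repeated card, keep the later value
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        String input = "Card 41: Slot Type : SFC Card 42: Slot Type : "
                + "PFC Card 43: Slot Type : GFC Operational State : Empty "
                + "Card 44: Slot Type : KFC Card 45: Slot Type : SFC";
        System.out.println(parse(input));
    }
}
```

Because .+? skips everything up to the next colon, this pattern does not depend on the literal label "Slot Type", which makes it more tolerant but also less strict than spelling the label out.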

Demo on Regex101
