
How do I extract data between two characters in Java

String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";

String[] body = text.split("/|,");
String b1 = body[1];
String b2 = body[2];
String b3 = body[3];

Desired results:

b1 = 'Team1 = 6'
b2 = 'Team2 = 4'
b3 = 'Team3 = 2'

Use regex. Something like this:

String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
Matcher m = Pattern.compile("(\\w+\\s=\\s\\d+)").matcher(text); 
// \w+ matches the team name (eg: Team1). \s=\s matches " = " and \d+ matches the score.
while (m.find()){
    System.out.print(m.group(1)+"\n");
}

This prints:

Team1 = 6
Team2 = 4
Team3 = 2
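For reference, here is a minimal self-contained sketch of the same idea that collects the matches into a list instead of printing them. The class name TeamScoreMatcher and the method extract are made up for illustration; the pattern is the one from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class TeamScoreMatcher {
    // Same pattern as above: word characters, " = ", then digits.
    private static final Pattern SCORE = Pattern.compile("(\\w+\\s=\\s\\d+)");

    // Returns every "name = number" match found in the text, in order.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = SCORE.matcher(text);
        while (m.find()) {
            results.add(m.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        extract(text).forEach(System.out::println);
    }
}
```

Note that these matches do not include the surrounding single quotes shown in the desired results, because \w does not match a quote character.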

There are a few ways you can do this, but in your case I'd use regex.

I don't know Java, but I think something like this regex pattern should work:

Pattern.compile("\\/'(.*?)'")

A random regex tester site with this pattern is here: https://regex101.com/r/MCRfMm/1
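Since the answer above does not show the Java side, here is a rough sketch of how that pattern might be used. The class and method names are hypothetical, and the escape before the slash is dropped because Java does not need it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating the pattern suggested above.
class SlashQuoteExtractor {
    // A slash, an opening quote, a lazy capture, then the closing quote.
    private static final Pattern SEGMENT = Pattern.compile("/'(.*?)'");

    // Returns the text captured between each /' and the next '.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = SEGMENT.matcher(text);
        while (m.find()) {
            results.add(m.group(1)); // group(1) excludes the slash and quotes
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        extract(text).forEach(System.out::println);
    }
}
```

If the quotes should be kept, as in the desired results, one option is to move the capturing parentheses so that they enclose both quote characters.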

I'm going to say "friends don't let friends use regex" and recommend parsing this out. The built-in class StreamTokenizer will handle the job.

   private static void testTok( String in ) throws Exception {
      System.out.println( "Input: " + in );
      StreamTokenizer tok = new StreamTokenizer( new StringReader( in ) );
      tok.resetSyntax();
      tok.wordChars( 'a', 'z' );
      tok.wordChars( 'A', 'Z' );
      tok.wordChars( '0', '9' );
      tok.whitespaceChars( 0, ' ' );
      String prevToken = null;
      for( int type; (type = tok.nextToken()) != StreamTokenizer.TT_EOF; ) {
//         System.out.println( tokString( type ) + ":  nval=" + tok.nval + ", sval=" + tok.sval );
         if( type == '=' ) {
            tok.nextToken();
            System.out.println( prevToken + "=" + tok.sval );
         }
         prevToken = tok.sval;
      }
   }

Output:

Input: /'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'
Team1=6
Team2=4
Team3=2

One advantage of this technique is that the individual tokens like "Team1", "=" and "6" are all parsed separately, whereas the regex presented so far is already complex to read and would have to be made even more complex to isolate each of those tokens.
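To illustrate what having the tokens separately can buy you, here is a sketch that uses the same tokenizer setup to build a map from team name to numeric score. The class name TokenizerScores and the method parse are assumptions for this example, as is the choice to treat every score as an int.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class built on the tokenizer setup shown above.
class TokenizerScores {
    // Maps each word before an '=' to the integer word after it.
    static Map<String, Integer> parse(String in) {
        StreamTokenizer tok = new StreamTokenizer(new StringReader(in));
        tok.resetSyntax();             // every character starts as "ordinary"
        tok.wordChars('a', 'z');
        tok.wordChars('A', 'Z');
        tok.wordChars('0', '9');       // digits are word characters, not numbers
        tok.whitespaceChars(0, ' ');
        Map<String, Integer> scores = new LinkedHashMap<>();
        String prevToken = null;
        try {
            for (int type; (type = tok.nextToken()) != StreamTokenizer.TT_EOF; ) {
                if (type == '=' && prevToken != null
                        && tok.nextToken() == StreamTokenizer.TT_WORD) {
                    scores.put(prevToken, Integer.parseInt(tok.sval));
                }
                prevToken = tok.sval;  // null for ordinary characters
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(parse("/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'"));
    }
}
```

Integer.parseInt will throw a NumberFormatException if the word after '=' is not a valid integer, so real input may need extra checks.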

You can split on "a slash, optionally preceded by a comma followed by zero or more non-slash characters":

String[] body = text.split("(?:,[^/]*)?/");
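A small runnable sketch of that split follows, with a hypothetical class name SlashSplitDemo. It is here mainly to show which indexes the pieces end up at.

```java
import java.util.Arrays;

// Hypothetical demo class for the split expression above.
class SlashSplitDemo {
    // Splits on a slash, optionally preceded by a comma and non-slash text.
    static String[] splitOnSlash(String text) {
        return text.split("(?:,[^/]*)?/");
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        String[] body = splitOnSlash(text);
        System.out.println(Arrays.toString(body));
    }
}
```

Because the text starts with a slash, body[0] is an empty string, so the three quoted values sit at body[1], body[2] and body[3], which is the indexing used in the question.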

public class MyClass {
    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        char []textArr = text.toCharArray();
        char st = '/';
        char ed = ',';


        boolean lookForEnd = false;
        int st_idx =0;
        for(int i =0; i < textArr.length; i++){
            if(textArr[i] == st){
                st_idx = i+1;
                lookForEnd = true;
            }
            else if(lookForEnd && textArr[i] == ed){
                System.out.println(text.substring(st_idx,i));
                lookForEnd = false;
            }
        }

        // No ',' was found after the last '/', so print everything from st_idx to the end
        if(lookForEnd){
           System.out.println(text.substring(st_idx)); 
        }

    }  
}

/*

'Team1 = 6'
'Team2 = 4'
'Team3 = 2'

*/

You could use split with a regex built around an alternation: match either the start of the string followed by a forward slash, or a comma followed by one or more non-comma characters and then a forward slash. A positive lookahead after the alternation asserts that what follows is a '

(?:^/|,[^,]+/)(?=')

Explanation

  • (?: Start non-capturing group
    • ^/ Assert the start of the string, followed by a forward slash
    • | Or
    • ,[^,]+/ Match a comma, then one or more non-comma characters using a negated character class, then a forward slash
  • ) Close non-capturing group
  • (?=') Positive lookahead to assert that what follows is '

Regex demo - Java demo
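As a rough sketch of how this split might look in Java (the class name LookaheadSplitDemo and the step that drops empty pieces are my own additions, not part of the pattern):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for the lookahead-based split above.
class LookaheadSplitDemo {
    // Splits the text and discards the empty leading piece.
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        for (String part : text.split("(?:^/|,[^,]+/)(?=')")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        split(text).forEach(System.out::println);
    }
}
```

The empty-string filter is needed because the separator also matches at the very start of the text, which makes split return an empty first element.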

Getting a match instead of a split

If you want to match a pattern like 'Team1 = 6', you could use:

'[^=]+=[^']+'

Regex demo - Java demo
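A minimal sketch of the matching variant, assuming a hypothetical class QuotedPairMatcher. Here the whole match is kept, so the single quotes are part of each result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for the matching pattern above.
class QuotedPairMatcher {
    // A quote, text up to '=', the '=', text up to the closing quote.
    private static final Pattern PAIR = Pattern.compile("'[^=]+=[^']+'");

    // Returns each full match, quotes included.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        extract(text).forEach(System.out::println);
    }
}
```

For the sample text this gives the three quoted values from the desired results without any empty element to skip.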
