
How do I extract data between two characters in Java?

String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";

String[] body = text.split("/|,");
String b1 = body[1];
String b2 = body[2];
String b3 = body[3];

Desired results:

b1 = 'Team1 = 6'
b2 = 'Team2 = 4'
b3 = 'Team3 = 2'

Use regex. Something like this:

String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Matcher m = Pattern.compile("(\\w+\\s=\\s\\d+)").matcher(text);
// \w+ matches the team name (e.g. Team1), \s=\s matches " = ", and \d+ matches the score.
while (m.find()) {
    System.out.println(m.group(1));
}

This prints:

Team1 = 6
Team2 = 4
Team3 = 2

There are a few ways you can do this, but in your case I'd use regex.

I don't know Java, but I think a regex pattern like this should work:

Pattern.compile("\\/'(.*?)'")

A random regex tester site with this pattern is here: https://regex101.com/r/MCRfMm/1
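
A minimal Java sketch of how that pattern could be applied, assuming the goal is the text between /' and the next single quote. The class name BetweenSlashQuote and the method extract are made up for illustration. The forward slash has no special meaning in a Java regex, so it is written without a backslash here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenSlashQuote {
    // Lazily capture everything between /' and the next single quote.
    private static final Pattern QUOTED = Pattern.compile("/'(.*?)'");

    // Hypothetical helper: returns the captured text of every match, in order.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = QUOTED.matcher(text);
        while (m.find()) {
            results.add(m.group(1)); // group 1 excludes the slash and both quotes
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        for (String s : extract(text)) {
            System.out.println(s);
        }
    }
}
```

Unlike the desired results in the question, group 1 drops the surrounding quotes; m.group(0) would keep the leading slash as well, so move the parentheses if the quotes should stay.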

I'm going to say "friends don't let friends use regex" and recommend parsing this out. The built-in class StreamTokenizer will handle the job.

   // Requires: import java.io.StreamTokenizer; import java.io.StringReader;
   private static void testTok( String in ) throws Exception {
      System.out.println( "Input: " + in );
      StreamTokenizer tok = new StreamTokenizer( new StringReader( in ) );
      tok.resetSyntax();               // make every character "ordinary"
      tok.wordChars( 'a', 'z' );
      tok.wordChars( 'A', 'Z' );
      tok.wordChars( '0', '9' );       // digits are word characters, so "6" lands in sval
      tok.whitespaceChars( 0, ' ' );
      String prevToken = null;
      for( int type; (type = tok.nextToken()) != StreamTokenizer.TT_EOF; ) {
         if( type == '=' ) {
            tok.nextToken();           // advance to the value after '='
            System.out.println( prevToken + "=" + tok.sval );
         }
         prevToken = tok.sval;         // null for ordinary characters such as '/'
      }
   }

Output:

Input: /'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'
Team1=6
Team2=4
Team3=2
BUILD SUCCESSFUL (total time: 0 seconds)

One advantage of this technique is that the individual tokens like "Team1", "=" and "6" are all parsed separately, whereas the regex presented so far is already complex to read and would have to be made even more complex to isolate each of those tokens separately.
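
To illustrate the point about separately parsed tokens, here is a sketch that reuses the same tokenizer setup to collect each name and score into a map. The class name TeamScores and the method parse are hypothetical, and the code assumes every '=' is preceded by a name and followed by an integer.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class TeamScores {
    // Hypothetical helper: maps each name before '=' to the integer after it.
    static Map<String, Integer> parse(String in) {
        StreamTokenizer tok = new StreamTokenizer(new StringReader(in));
        tok.resetSyntax();           // make every character "ordinary"
        tok.wordChars('a', 'z');
        tok.wordChars('A', 'Z');
        tok.wordChars('0', '9');     // digits are word characters, so "6" arrives in sval
        tok.whitespaceChars(0, ' ');

        Map<String, Integer> scores = new LinkedHashMap<>(); // keeps insertion order
        String prevWord = null;
        try {
            int type;
            while ((type = tok.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == '=' && prevWord != null
                        && tok.nextToken() == StreamTokenizer.TT_WORD) {
                    // Assumption: the word after '=' is an integer.
                    scores.put(prevWord, Integer.parseInt(tok.sval));
                }
                prevWord = tok.sval; // null for ordinary characters such as '/' or ','
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return scores;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        parse(text).forEach((team, score) -> System.out.println(team + " -> " + score));
    }
}
```

Because the name and the score are already separate tokens, turning the score into an int takes one call, with no extra capture groups.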

You can split on "a slash, optionally preceded by a comma followed by zero or more non-slash characters":

String[] body = text.split("(?:,[^/]*)?/");
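
A runnable sketch of this split, for checking which index holds which value. The class name SplitOnSlash is made up for illustration. Because the input begins with a slash, Java keeps an empty string as the first array element, so the wanted values sit at indexes 1 to 3.

```java
import java.util.Arrays;

class SplitOnSlash {
    static String[] split(String text) {
        // A slash, optionally preceded by a comma and any run of non-slash characters.
        return text.split("(?:,[^/]*)?/");
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        String[] body = split(text);
        System.out.println(Arrays.toString(body));
        System.out.println("b1 = " + body[1]);
        System.out.println("b2 = " + body[2]);
        System.out.println("b3 = " + body[3]);
    }
}
```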

public class MyClass {
    public static void main(String args[]) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        char []textArr = text.toCharArray();
        char st = '/';
        char ed = ',';


        boolean lookForEnd = false;
        int st_idx =0;
        for(int i =0; i < textArr.length; i++){
            if(textArr[i] == st){
                st_idx = i+1;
                lookForEnd = true;
            }
            else if(lookForEnd && textArr[i] == ed){
                System.out.println(text.substring(st_idx,i));
                lookForEnd = false;
            }
        }

        // No ',' followed the last '/', so print everything after that '/'.
        if(lookForEnd){
           System.out.println(text.substring(st_idx)); 
        }

    }  
}

/*

'Team1 = 6'
'Team2 = 4'
'Team3 = 2'

*/

You could use split with a regex built on an alternation: match either a forward slash at the start of the string, or a comma followed by one or more non-comma characters and then a forward slash. A positive lookahead after the alternation asserts that what follows is a single quote ('):

(?:^/|,[^,]+/)(?=')

Explanation

  • (?: Start non-capturing group
    • ^/ Assert the start of the string, then match a forward slash
    • | Or
    • ,[^,]+/ Match a comma, then one or more characters other than a comma (negated character class), then a forward slash
  • ) Close non-capturing group
  • (?=') Positive lookahead to assert that what follows is '
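
A short Java sketch of the split described above. The class name SplitWithLookahead is hypothetical. As with any split on input that starts with a delimiter, the first array element is an empty string.

```java
import java.util.Arrays;

class SplitWithLookahead {
    static String[] split(String text) {
        // Delimiter: a leading "/", or ", ... /", but only when a single quote follows.
        return text.split("(?:^/|,[^,]+/)(?=')");
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        String[] body = split(text);
        // Skip index 0, which is the empty string before the first slash.
        for (int i = 1; i < body.length; i++) {
            System.out.println("b" + i + " = " + body[i]);
        }
        System.out.println(Arrays.toString(body));
    }
}
```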


Getting a match instead of split

If you want to match a pattern like 'Team1 = 6', you could use:

'[^=]+=[^']+'
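
A sketch of how this matching pattern might be used from Java. The class name QuotedAssignments and the method findAll are assumptions for illustration. Each whole match includes the surrounding quotes, which lines up with the desired results in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedAssignments {
    // A quote, text without '=', an '=', text without a quote, and a closing quote.
    private static final Pattern ASSIGNMENT = Pattern.compile("'[^=]+=[^']+'");

    // Hypothetical helper: returns every whole match, quotes included.
    static List<String> findAll(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = ASSIGNMENT.matcher(text);
        while (m.find()) {
            matches.add(m.group()); // group() is the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "/'Team1 = 6', while /'Team2 = 4', and /'Team3 = 2'";
        List<String> found = findAll(text);
        for (int i = 0; i < found.size(); i++) {
            System.out.println("b" + (i + 1) + " = " + found.get(i));
        }
    }
}
```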

