
Java using regex to match a pattern for quizzes

I am trying to do one of the 100 mega list projects. One of them is about a quiz maker that parses through a file of quiz questions, picks some of them out at random, creates a quiz and also grades quizzes.

I am trying to do the part that simply loads the quiz questions and parses them out individually (i.e. one question and its multiple-choice answers as an entity).

The format of the quiz is as follows:

Intro to Computer Science


    1. Which of the following accesses a variable in structure b?
    A. b->var
    B. b.var
    C. b-var
    D. b>var

    2. Which of the following accesses a variable in a pointer to a structure, *b?
    A. b->var
    B. b.var
    C. b-var
    D. b>var

    3. Which of the following is a properly defined struct?
    A. struct {int a;}
    B. struct a_struct {int a;}
    C. struct a_struct int a
    D. struct a_struct {int a;}

    4. Which properly declares a variable of struct foo?
    A. struct foo
    B. foo var
    C. foo
    D. int foo

Of course there are many of these questions, but they are all in the same format. I used a BufferedReader to load the questions into a string and am attempting to use regex to parse them, but I am unable to match on any specific part. Below is my code:

    package myPackage;

    import java.io.*;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class QuizMaker {

        public static void main(String args[])
        {
            String file = "myfile/QuizQuestions.txt";
            StringBuilder quizLine = new StringBuilder();
            String line = null;

            try {
                FileReader reader = new FileReader(file);
                BufferedReader buffreader = new BufferedReader(reader);

                while ((line = buffreader.readLine()) != null)
                {
                    quizLine.append(line);
                    quizLine.append("\n");
                }

                buffreader.close();
            } catch (FileNotFoundException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e1) {
                e1.printStackTrace();
            }

            System.out.println(quizLine.toString());

            Pattern pattern = Pattern.compile("^[0-9]{1}.+\\?");
            Matcher matcher = pattern.matcher(quizLine.toString());

            boolean didmatch = matcher.lookingAt();
            System.out.println(didmatch);

            String mystring = quizLine.toString();

            int start = matcher.start();
            int end = matcher.end();

            System.out.println(start + " " + end);

            char a = mystring.charAt(0);
            char b = mystring.charAt(6);

            System.out.println(a + " " + b);
        }
    }

At this point, I am simply trying to match on the questions themselves and leave the multiple choice answers till I solve this part. Is it due to my regex pattern being wrong? I tried to even match on a simple number itself and even that was failing (via "^[0-9]{1}").

Am I doing something completely wrong? One other question I had was that this simply was returning one match, not all of them. How exactly would you iterate through the string to find all matches? Any help would be appreciated.

Personally, I wouldn't use a regex. I would just use a StringTokenizer on "\n" and check whether the first character of each line is a digit (no other lines seem to start with a number).
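
A minimal sketch of that regex-free approach, assuming the whole file has already been read into one string. The class name LineScanner and the method questionLines are hypothetical names chosen for illustration, and the call to trim() is an added assumption so that indented lines are handled too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class LineScanner {

    // Returns every line whose first non-blank character is a digit.
    static List<String> questionLines(String text) {
        List<String> questions = new ArrayList<>();
        // StringTokenizer skips empty tokens, so consecutive "\n" never produce a line.
        StringTokenizer lines = new StringTokenizer(text, "\n");
        while (lines.hasMoreTokens()) {
            // trim() is an assumption: it drops indentation and a trailing "\r".
            String line = lines.nextToken().trim();
            if (!line.isEmpty() && Character.isDigit(line.charAt(0))) {
                questions.add(line);
            }
        }
        return questions;
    }

    public static void main(String[] args) {
        String text = "Intro to Computer Science\n\n"
                + "    1. Which of the following accesses a variable in structure b?\n"
                + "    A. b->var\n    B. b.var\n\n"
                + "    2. Which properly declares a variable of struct foo?\n"
                + "    A. struct foo\n    B. foo var\n";
        for (String question : questionLines(text)) {
            System.out.println(question);
        }
    }
}
```

The StringTokenizer Javadoc describes it as a legacy class and recommends String.split or java.util.regex for new code, so text.split("\n") would be a reasonable substitute here.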

But to answer your question more specifically: you need to specify the MULTILINE flag on your pattern for ^ and $ to match the start and end of each line.

    Pattern pattern = Pattern.compile("^[0-9]{1}.+\\?", Pattern.MULTILINE);

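This should allow your pattern to match lines within the text. Otherwise ^ and $ just match the start and end of the string as a whole. Note that lookingAt() still only attempts the match at the very start of the input, so use find() to locate matches on later lines.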
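
To iterate over all matches rather than just the first, the usual idiom is a while loop over Matcher.find(). The sketch below is one way to do it under a few assumptions: the class name QuestionFinder and the method findQuestions are hypothetical, the pattern uses [0-9]+ instead of [0-9]{1} so that question numbers above 9 also match, and [ \t]* is added to tolerate indented lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuestionFinder {

    // With MULTILINE, ^ matches at the start of every line, not only at the start of the input.
    // Group 1 holds the question without its leading indentation.
    private static final Pattern QUESTION =
            Pattern.compile("^[ \\t]*([0-9]+\\..+\\?)", Pattern.MULTILINE);

    static List<String> findQuestions(String text) {
        List<String> questions = new ArrayList<>();
        Matcher matcher = QUESTION.matcher(text);
        // Each call to find() resumes after the previous match, so the loop visits every question.
        while (matcher.find()) {
            questions.add(matcher.group(1));
        }
        return questions;
    }

    public static void main(String[] args) {
        String text = "Intro to Computer Science\n\n"
                + "1. Which of the following accesses a variable in structure b?\n"
                + "A. b->var\nB. b.var\n\n"
                + "2. Which properly declares a variable of struct foo?\n"
                + "A. struct foo\nB. foo var\n";
        for (String question : findQuestions(text)) {
            System.out.println(question);
        }
    }
}
```

Calling matcher.start() or matcher.end() is only valid after a successful match, so reading them inside the find() loop also avoids the IllegalStateException those calls throw when no match has been found.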

Description

This expression will capture the entire question followed by all of its possible answers, provided the string is roughly formatted like your sample text.

    ^\\s*(\\d+\\.\\s+.*?)(?=[\\r\\n]+^\\s*\\d+\\.|\\Z)


Example

Live Example: http://www.rubular.com/r/dcetgPsz5w

Given Sample Text

Intro to Computer Science


    1. Which of the following accesses a variable in structure b?
    A. b->var
    B. b.var
    C. b-var
    D. b>var

    2. Which of the following accesses a variable in a pointer to a structure, *b?
    A. b->var
    B. b.var
    C. b-var
    D. b>var



    3. Which of the following is a properly defined struct?
    A. struct {int a;}
    B. struct a_struct {int a;}
    C. struct a_struct int a
    D. struct a_struct {int a;}

    4. Which properly declares a variable of struct foo?
    A. struct foo
    B. foo var
    C. foo
    D. int foo

Capture Group 1 Matches

[0] => 1. Which of the following accesses a variable in structure b?
A. b->var
B. b.var
C. b-var
D. b>var
[1] => 2. Which of the following accesses a variable in a pointer to a structure, *b?
A. b->var
B. b.var
C. b-var
D. b>var
[2] => 3. Which of the following is a properly defined struct?
A. struct {int a;}
B. struct a_struct {int a;}
C. struct a_struct int a
D. struct a_struct {int a;}
[3] => 4. Which properly declares a variable of struct foo?
A. struct foo
B. foo var
C. foo
D. int foo
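
The expression relies on ^ matching at line starts and on . matching line breaks, so in Java it needs both Pattern.MULTILINE and Pattern.DOTALL. The following is a sketch of how it might be used with those flags; the class name QuizBlocks and the method splitQuestions are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuizBlocks {

    // MULTILINE: ^ matches at each line start.
    // DOTALL: . also matches line breaks, so the lazy .*? can span a question's answer lines.
    private static final Pattern BLOCK = Pattern.compile(
            "^\\s*(\\d+\\.\\s+.*?)(?=[\\r\\n]+^\\s*\\d+\\.|\\Z)",
            Pattern.MULTILINE | Pattern.DOTALL);

    // Returns one string per question, each containing the question line and its answers.
    static List<String> splitQuestions(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK.matcher(text);
        while (matcher.find()) {
            blocks.add(matcher.group(1));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String text = "Intro to Computer Science\n\n"
                + "1. Which of the following accesses a variable in structure b?\n"
                + "A. b->var\nB. b.var\n\n"
                + "2. Which properly declares a variable of struct foo?\n"
                + "A. struct foo\nB. foo var\n";
        for (String block : splitQuestions(text)) {
            System.out.println("[" + block + "]");
        }
    }
}
```

Each captured block keeps the answer lines exactly as they appear in the input, including any indentation, so a later split on line breaks can separate the question from its choices.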

If you use String.matches(), you need only a fraction of the code you are currently attempting to use.

To test if a line is a question:

    if (line.matches("\\s*\\d+\\..*"))

To test if a line is an answer:

    if (line.matches("\\s*[A-Z]\\..*"))
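
Putting the two checks together, a sketch of grouping each question with its answers could look like this. The QuizParser and Question names are hypothetical, and two details are assumptions rather than part of the answer above: each line is trimmed first (so the leading \\s* is not needed), and \\d+ is used so that question numbers above 9 are recognised.

```java
import java.util.ArrayList;
import java.util.List;

class QuizParser {

    // Hypothetical holder for one question and its answer choices.
    static final class Question {
        final String text;
        final List<String> answers = new ArrayList<>();

        Question(String text) {
            this.text = text;
        }
    }

    static List<Question> parse(String quizText) {
        List<Question> questions = new ArrayList<>();
        Question current = null;
        for (String raw : quizText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.matches("\\d+\\..*")) {
                // A numbered line starts a new question.
                current = new Question(line);
                questions.add(current);
            } else if (current != null && line.matches("[A-Z]\\..*")) {
                // A lettered line belongs to the most recent question.
                current.answers.add(line);
            }
            // Anything else (title, blank lines) is ignored.
        }
        return questions;
    }

    public static void main(String[] args) {
        String text = "Intro to Computer Science\n\n"
                + "    1. Which of the following accesses a variable in structure b?\n"
                + "    A. b->var\n    B. b.var\n\n"
                + "    2. Which properly declares a variable of struct foo?\n"
                + "    A. struct foo\n    B. foo var\n";
        for (Question question : parse(text)) {
            System.out.println(question.text + " -> " + question.answers);
        }
    }
}
```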

  1. In the code, quizLine holds the whole file as a single string, something like "Intro to Computer Science\n\n1. Which of the following accesses a variable in structure b?\nA. b->var\nB. b.var\n...". The pattern "^[0-9]{1}.+\\?" is applied to that whole string at once rather than to each line, which is not what you want.
  2. The simple way is to call quizLine.split and then match line by line.
  3. Another way, as @Denomales and @Chase described, is to use a multiline match and read the match groups.
  4. As @Bohemian said, String#matches is a good shortcut for checking whether a string matches, but it does not give you the match groups. If you need a Matcher, note that Matcher#lookingAt is a little different from Matcher#matches: lookingAt only requires the pattern to match at the start of the input, while matches requires it to match the entire input. Matcher#matches may be the better fit when you match line by line.
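
The difference between the three Matcher methods mentioned in this thread can be seen in a small sketch. The class name MatcherMethods and its helper methods are hypothetical, and the pattern is a simplified version of the one in the question.

```java
import java.util.regex.Pattern;

class MatcherMethods {

    private static final Pattern QUESTION = Pattern.compile("[0-9]+\\..+\\?");

    // matches(): the pattern must cover the entire input.
    static boolean wholeMatch(String input) {
        return QUESTION.matcher(input).matches();
    }

    // lookingAt(): the pattern must match starting at the beginning, but may stop early.
    static boolean prefixMatch(String input) {
        return QUESTION.matcher(input).lookingAt();
    }

    // find(): the pattern may match anywhere in the input.
    static boolean foundAnywhere(String input) {
        return QUESTION.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"1. First?", "1. First? (easy)", "Quiz: 1. First?"};
        for (String sample : samples) {
            System.out.println(sample
                    + " matches=" + wholeMatch(sample)
                    + " lookingAt=" + prefixMatch(sample)
                    + " find=" + foundAnywhere(sample));
        }
    }
}
```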
