Java Regex Capture Text Between """ or '''

I am trying to parse a document with Java regex. It contains text delimited by either """ or ''', for example:

""" Bla, you're not very nice! """ or:

''' Bla, this 1 isn't a great example '''

I have been trying something along the lines of: ["""|''']([\\p{Alnum}|\\p{Blank}]+)[\\"""|''']

Assumptions:

  1. The text will start and end with either """ or '''.
  2. The text could include numbers, letters, blanks and punctuation.
  3. The body of the text will not include a sequence of three " or three '.

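Try this pattern: ("""|''').*?\\1

The \\1 (written as it appears in a Java string literal, so the regex engine sees \1) is a backreference to group 1, which means the closing delimiter must be the same as the opening one.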

Given:

"""Hello, World!""" some unquoted text """ lorem ipsum ''" dolor """ some more unquoted text '''single quotes'''
''' Bla, this 1 isn't a great example '''

It will match:

  1. """Hello, World!"""
  2. """ lorem ipsum ''" dolor """
  3. '''single quotes'''
  4. ''' Bla, this 1 isn't a great example '''
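
For reference, here is a minimal sketch of how the pattern above might be applied in Java with Matcher.find(). The class name TripleQuoteFinder and the method findQuoted are made up for illustration, and the second capturing group is an addition of this sketch (not part of the pattern as given) so that the body can be read without the delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, written only to illustrate the pattern.
class TripleQuoteFinder {
    // Group 1 captures the opening delimiter; \1 requires the same delimiter to close.
    // Group 2 (added for this sketch) captures the text between the delimiters.
    private static final Pattern QUOTED = Pattern.compile("(\"\"\"|''')(.*?)\\1");

    // Returns the trimmed body of every quoted block found in the input.
    static List<String> findQuoted(String input) {
        List<String> bodies = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            bodies.add(m.group(2).trim());
        }
        return bodies;
    }

    public static void main(String[] args) {
        String input = "\"\"\"Hello, World!\"\"\" some unquoted text "
                + "\"\"\" lorem ipsum ''\" dolor \"\"\" some more unquoted text '''single quotes'''\n"
                + "''' Bla, this 1 isn't a great example '''";
        for (String body : findQuoted(input)) {
            System.out.println(body);
        }
    }
}
```

Note that . does not match line terminators by default, so if a quoted block can span several lines the pattern would need to be compiled with Pattern.DOTALL.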

You can probably be more specific than .*?, but I wasn't sure which characters you meant by "punctuation".

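Something like this worked for me:

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        // First alternative: groups 1-2 (""" delimiters); second alternative: groups 3-4 (''' delimiters).
        Pattern p = Pattern.compile("(\"{3}(.*?)\"{3})|('{3}(.*?)'{3})");
        String s1 = "\"\"\" Bla, you're not very nice! \"\"\"";
        String s2 = "''' Bla, this 1 isn't a great example '''";

        Matcher m1 = p.matcher(s1);
        Matcher m2 = p.matcher(s2);

        // matches() succeeds only if the whole string is a single quoted block.
        if (m1.matches())
        {
            // Group 2 is the text between the """ delimiters, including the blanks next to them.
            System.out.println(m1.group(2));
        }

        if (m2.matches())
        {
            // Group 4 is the text between the ''' delimiters.
            System.out.println(m2.group(4));
        }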

It would, however, be simpler to just use two regular expressions. The above code yielded the following:

Bla, you're not very nice!

Bla, this 1 isn't a great example
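
The two-expression variant mentioned above could look roughly like the following sketch. The class TwoPatterns and the method extract are hypothetical names chosen for this example, and the trim() call is an assumption about the desired output rather than something the original code did.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class illustrating one pattern per delimiter type.
class TwoPatterns {
    private static final Pattern DOUBLE_QUOTED = Pattern.compile("\"{3}(.*?)\"{3}");
    private static final Pattern SINGLE_QUOTED = Pattern.compile("'{3}(.*?)'{3}");

    // Returns the trimmed body if the whole string is one quoted block, otherwise null.
    static String extract(String s) {
        for (Pattern p : new Pattern[] {DOUBLE_QUOTED, SINGLE_QUOTED}) {
            Matcher m = p.matcher(s);
            if (m.matches()) {
                // With one pattern per delimiter, the body is always group 1.
                return m.group(1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extract("\"\"\" Bla, you're not very nice! \"\"\""));
        System.out.println(extract("''' Bla, this 1 isn't a great example '''"));
    }
}
```

The benefit of this layout is that there is no need to remember which group number belongs to which alternative.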

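One of the issues with your regular expression is that square brackets define a character class, which matches any single one of the characters listed inside it. The pipe character is therefore treated as a literal | rather than as an OR operator. To alternate between """ and ''', you will need to replace your square brackets with round ones (a group).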
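
To make the difference concrete, here is a small sketch comparing a character class with a group. The class name CharClassVsGroup and the helper matchesWhole are invented for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: character class versus group with alternation.
class CharClassVsGroup {
    // True if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Character class: matches exactly ONE character out of ", | and '.
        String charClass = "[\"\"\"|''']";
        // Group with alternation: matches the three-character sequence """ or '''.
        String group = "(\"\"\"|''')";

        System.out.println(matchesWhole(charClass, "|"));      // true: | is a literal here
        System.out.println(matchesWhole(charClass, "\"\"\"")); // false: three characters, not one
        System.out.println(matchesWhole(group, "\"\"\""));     // true
        System.out.println(matchesWhole(group, "|"));          // false
    }
}
```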
