
Matching smallest possible group java regexp

I am trying to get this regex to work the way I need it to. I have a collection of songs with their lyrics. I loop through each song's lyrics to see whether they match the search phrase, and return the length of the matched string as a kind of match rating.

For example, here is part of the lyrics of one song:

 "you gave up the love you got & that is that
 she loves me now she loves you not and that where its at"

I am using this regexp to find a match:

 (?mi)(\bShe\b).*(\bloves\b).*(\byou\b)

However, it captures this group:

 "she loves me now she loves you"

I want to capture the smallest group possible, which would be just "she loves you".

How can I make this capture the smallest group possible?

Some of my code is below. I take the phrase and split it into an array of words, then check that the lyrics contain each word; otherwise we can bail out. I then build a string that becomes the regex.

 static int rankPhrase(String lyrics, String lyricsPhrase){
    //This takes in song lyrics and the phrase we are searching for

    //Split the phrase up into separate words
    String[] phrase = lyricsPhrase.split("[^a-zA-Z]+");

    //Start to build the regex
    StringBuilder regex = new StringBuilder("(?im)(\\b" + phrase[0] + "\\b)");

    //loop through each word in the phrase
    for(int i = 1; i < phrase.length; i++){

        //Check to see if this word exists in the lyrics first
        if(lyrics.contains(phrase[i])){

            //add this to the regex we will search for
            regex.append(".*(\\b" + phrase[i] + "\\b)");

        }else{
            //if the word isn't found return a rank of -1;
            //this means the song doesn't contain the phrase
            return -1;
        }

    }

    //Create the pattern
    Pattern p = Pattern.compile(regex.toString());
    Matcher m = p.matcher(lyrics);


    //Check to see if it can find a match
    if(m.find()){

        //Store this match in a string
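        String match = m.group();

 (\bShe\b)(?:(?!\b(?:she|loves|you)\b).)*(\bloves\b)(?:(?!\b(?:she|loves|you)\b).)*(\byou\b)

You can use a negative lookahead here: each gap between words may only consume characters that do not start one of the phrase words, so the match cannot run past an earlier "she" or "loves". See the demo:

https://regex101.com/r/hE4jH0/11

For Java, escape the backslashes and keep the case-insensitive flag (for example the (?im) prefix from the question), since the lyrics are lowercase:

 (\\bShe\\b)(?:(?!\\b(?:she|loves|you)\\b).)*(\\bloves\\b)(?:(?!\\b(?:she|loves|you)\\b).)*(\\byou\\b)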
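As a rough sketch, the same pattern could be generated for any phrase instead of being written by hand. The class name TemperedPhrase and the method buildPattern are hypothetical names used only for illustration; the sketch assumes the phrase has already been split into words, as in the question, and uses Pattern.quote so each word is matched literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemperedPhrase {

    // Builds \bw1\b GAP \bw2\b GAP ... where GAP refuses to cross any phrase word.
    // (Hypothetical helper, not part of the original post.)
    static Pattern buildPattern(String[] words) {
        // Alternation of all phrase words, each quoted so it is treated literally.
        StringBuilder alt = new StringBuilder();
        for (String w : words) {
            if (alt.length() > 0) {
                alt.append('|');
            }
            alt.append(Pattern.quote(w));
        }
        // Consumes one character at a time, but never at the start of a phrase word.
        String gap = "(?:(?!\\b(?:" + alt + ")\\b).)*";

        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                regex.append(gap);
            }
            regex.append("\\b").append(Pattern.quote(words[i])).append("\\b");
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        String lyrics = "you gave up the love you got & that is that\n"
                + "she loves me now she loves you not and that where its at";
        String[] words = "She loves you".split("[^a-zA-Z]+");
        Matcher m = buildPattern(words).matcher(lyrics);
        if (m.find()) {
            System.out.println(m.group()); // prints "she loves you"
        }
    }
}
```

One consequence of this design is that the gap rejects any occurrence of a phrase word between two matched words, so a phrase that legitimately repeats a word in between would need a different gap expression.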

Java's regex matcher only works in the forward direction. What you would need to do is iterate over the set of all matches found and choose the shortest one.
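A minimal sketch of that idea, assuming reluctant quantifiers (.*?) between the words so that each candidate is as short as possible for its starting position. The class name ShortestMatch and the method shortest are hypothetical; Matcher.find(int) is used to restart the search one character past each match start, so overlapping candidates are also considered.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShortestMatch {

    // Returns the shortest match of the pattern in the text, or null if there is none.
    // (Hypothetical helper, not part of the original post.)
    static String shortest(Pattern p, String text) {
        Matcher m = p.matcher(text);
        String best = null;
        int from = 0;
        // find(int) resets the matcher and searches from the given index,
        // so restarting just after each match start also sees overlapping matches.
        while (from <= text.length() && m.find(from)) {
            String candidate = m.group();
            if (best == null || candidate.length() < best.length()) {
                best = candidate;
            }
            from = m.start() + 1;
        }
        return best;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\bshe\\b.*?\\bloves\\b.*?\\byou\\b",
                Pattern.CASE_INSENSITIVE);
        String lyrics = "she loves me now she loves you not and that where its at";
        System.out.println(shortest(p, lyrics)); // prints "she loves you"
    }
}
```

If the phrase is allowed to span several lines of lyrics, the pattern would also need Pattern.DOTALL, because . does not match line terminators by default.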

You need to use a negative lookahead here:

 Pattern.compile("\\bShe\\b(?:(?!\\bshe\\b).)*?\\bloves\\b(?:(?!\\b(?:you|loves)\\b).)*\\byou\\b", Pattern.CASE_INSENSITIVE);

