Regex “reluctant” quantifier misbehaves

I've got this Java code, part of a LaTeX songbook project.

Pattern p = Pattern.compile("\\\\retitle\\{(.*?)\\}",Pattern.DOTALL);
Matcher m = p.matcher(in);
System.out.println(m.matches());
System.out.println(m.group(1));

Given this input:

\retitle{Livin' on a prayer}{Bon Jovi}
\begin{song}\begin{vers}[Em]Johnie used to work on the docks\newline
Saving up his money I don't know these l[C]yrics\newline
l[D]ol.\newline
\end{vers}
\end{song}

I'm expecting this output:

true
Livin' on a prayer

But I actually get this:

true
Livin' on a prayer}{Bon Jovi}
\begin{song}\begin{vers}[Em]Johnie used to work on the docks\newline
Saving up his money I don't know these l[C]yrics\newline
l[D]ol.\newline
\end{vers}
\end{song

In other words, the *? quantifier is not as "reluctant" as I expect. What am I doing wrong?

The problem is not in your regex but in the method you're calling: Matcher.matches() tries to match the pattern against the entire input. The reluctance of the quantifier never gets a chance to matter, because the whole input can match the pattern in only one way: with DOTALL set, .*? has to keep expanding, across the newlines, until the closing \} of the pattern lines up with the very last } of the input.

Instead, use Matcher.find(), which looks for a substring of the input that matches the pattern.
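
As a minimal sketch of that change, the snippet below reuses the pattern from the question and calls find() instead of matches(). The class name RetitleDemo, the helper firstTitle, and the shortened input string are made up for illustration; they are not part of the original project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RetitleDemo {
    // Same pattern as in the question: \retitle{ ... } with a reluctant group.
    static final Pattern RETITLE =
            Pattern.compile("\\\\retitle\\{(.*?)\\}", Pattern.DOTALL);

    // Hypothetical helper: returns the first \retitle argument, or null if none is found.
    static String firstTitle(String in) {
        Matcher m = RETITLE.matcher(in);
        // find() accepts a match on a substring, so .*? can stop at the first '}'.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened version of the input from the question.
        String in = "\\retitle{Livin' on a prayer}{Bon Jovi}\n"
                + "\\begin{song}\\begin{vers}[Em]Johnie used to work on the docks\\newline\n"
                + "\\end{vers}\n"
                + "\\end{song}";
        System.out.println(firstTitle(in));
    }
}
```

Checking the boolean returned by find() before calling group(1) also avoids the IllegalStateException that group() throws when no match has been found.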

See the Javadoc for Matcher for more information.
