Java - match specific URL in string

There must be something very simple that I'm missing here. I'm trying to match an exact URL in a given string. Here's the code:

String pattern = "\\b.*"+"\"http://fonts.googleapis.com/css?family=Montserrat:400,700\""+"\\b";
Pattern p=Pattern.compile(pattern);
Matcher m=p.matcher("<link href=\"http://fonts.googleapis.com/css?family=Montserrat:400,700\"");
System.out.println(m.find()); // returns false

But the same code works when I try it with a local resource:

pattern = "\\b.*"+"style.css"+"\\b";
p=Pattern.compile(pattern);
m=p.matcher("<link href=\"css/style.css\"");
System.out.println(m.find()); // returns true

You are missing the fact that the URL you are trying to match contains a question mark. The question mark is a quantifier, so the regex engine treats it specially: it means "zero or one of the previously recognized atom". In your pattern, css?family therefore matches csfamily or cssfamily, but never the literal text css?family.
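To see the effect of the unescaped question mark in isolation, here is a minimal sketch. The class name QuestionMarkDemo and the helper found are made up for illustration and do not come from the question; only the fragment css?family does.

```java
import java.util.regex.Pattern;

class QuestionMarkDemo {
    // Hypothetical helper: true if the (unquoted) regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // In this regex the '?' makes the second 's' optional.
        String regex = "css?family";

        System.out.println(found(regex, "csfamily"));   // true: the optional 's' is absent
        System.out.println(found(regex, "cssfamily"));  // true: the optional 's' is present
        System.out.println(found(regex, "css?family")); // false: a literal '?' is never matched
    }
}
```

The third call is the situation in the question: the input contains a literal ?, but the pattern has no way to consume it.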

You do not want that question mark to be interpreted, which means your regex should be built differently. And there is a way, Pattern.quote():

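final String quotedUrl
    = Pattern.quote("http://fonts.googleapis.com/css?family=Montserrat:400,700");
// No \b around the quotes: '"' is not a word character, so \b cannot match
// between '=' and '"', nor after a closing '"' at the end of the input.
final String regex = "\"" + quotedUrl + "\"";
final Pattern pattern = Pattern.compile(regex);
// work with the regex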

Note that Pattern.quote() essentially just surrounds your input with the regex special sequences \Q and \E (with extra handling if the input itself already contains \E). Those were borrowed from Perl, unsurprisingly, since Perl regexes have been the lingua franca of successful regex engines so far.
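As a rough check of how quoting behaves with the string from the question, the sketch below prints the quoted form and tries two patterns against the input. The class name QuoteDemo and the helper found are illustrative names, not part of the original code. The last call shows that a trailing \b after the closing double quote prevents a match when the input ends with that quote, because a word boundary needs a word character on one side.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    static final String URL = "http://fonts.googleapis.com/css?family=Montserrat:400,700";
    static final String HTML = "<link href=\"" + URL + "\"";

    // Hypothetical helper: true if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String quoted = Pattern.quote(URL);

        // The URL contains no \E, so the result is simply \Q + URL + \E.
        System.out.println(quoted.equals("\\Q" + URL + "\\E")); // true

        // Quoted URL between literal double quotes: found.
        System.out.println(found("\"" + quoted + "\"", HTML)); // true

        // Same pattern with a trailing \b: not found, since the input ends with '"'.
        System.out.println(found("\"" + quoted + "\"\\b", HTML)); // false
    }
}
```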

Unless you have some other intention for this, the word boundary assertions and the use of a regex seem unnecessary here. I would suggest a non-regex solution with contains or indexOf.

String url = "http://fonts.googleapis.com/css?family=Montserrat:400,700";
String src = "<link href=\"http://fonts.googleapis.com/css?family=Montserrat:400,700\"";
System.out.println(src.contains(url));
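If the position of the URL is also needed, indexOf can be used the same way. This is only a sketch: the class name IndexOfDemo and the method position are hypothetical, and the strings are the ones from the question.

```java
class IndexOfDemo {
    // Hypothetical helper: index of the first occurrence of url in src, or -1 if absent.
    static int position(String src, String url) {
        return src.indexOf(url);
    }

    public static void main(String[] args) {
        String url = "http://fonts.googleapis.com/css?family=Montserrat:400,700";
        String src = "<link href=\"" + url + "\"";

        // The prefix <link href=" is 12 characters long, so the URL starts at index 12.
        System.out.println(position(src, url)); // 12

        // A string that is not present yields -1; no regex escaping is involved.
        System.out.println(position(src, "style.css")); // -1
    }
}
```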
