I use this code to fetch a page's HTML source and pull out the information I want. As a first test, I was just checking whether it would return < and ! for the first line. However, it doesn't work!
import java.io.*;
import java.net.URL;
import java.util.regex.*;

public class url
{
    public static BufferedReader read(String url) throws Exception {
        return new BufferedReader(
            new InputStreamReader(
                new URL(url).openStream()));
    }

    public static void main (String[] args) throws Exception{
        BufferedReader reader = read(args[0]);
        String line = reader.readLine();
        while(line != null) {
            System.out.println(line);
            line = reader.readLine();
            regex("//<//!",line);
        }
    }

    public static void regex(String regex, String check){
        Pattern checkregex =Pattern.compile(regex);
        Matcher regexMatcher = checkregex.matcher(check);
        if(regexMatcher.find()==false)
            return;
        while(regexMatcher.find()){
            if(regexMatcher.group().length() !=0) {
                System.out.println(regexMatcher.group().trim());
            }
        }
    }
}
That's because you've confused backslashes (\) with forward slashes (/). The former are what's used for escaping special characters. So, change this:
regex("//<//!",line);
to this:
regex("\\<\\!",line);
That said, < and ! don't actually have any special meaning in this context, so you can just write:
regex("<!",line);
if you prefer.
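To see the difference for yourself, here is a minimal sketch that runs all three patterns against one sample line. The class name SlashDemo, the helper found, and the sample input are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

public class SlashDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String line = "<!DOCTYPE html>"; // assumed sample first line of a page

        // Forward slashes are ordinary characters, so this looks for the
        // literal text //<//! and does not find it.
        System.out.println(found("//<//!", line)); // false

        // Escaped and unescaped forms both look for the literal text <!
        System.out.println(found("\\<\\!", line)); // true
        System.out.println(found("<!", line));     // true
    }
}
```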
Also, note that the above regex matches the two-character substring <!. Something about your question makes me think that you might actually want to match the one-character substrings < and ! separately? If so, you can either use the ...|... syntax for specifying multiple alternative patterns:
regex("<|!",line); // matches whatever matches < or matches !
or the [...] syntax for specifying a class of characters:
regex("[<!]",line); // matches a character that is either < or !
(in this circumstance, these two syntaxes are equivalent).
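One more thing to watch for once the pattern is fixed: every call to find() advances the matcher, so the if(regexMatcher.find()==false) check in your regex method consumes the first match before the while loop starts, and a line with a single match prints nothing. Your main loop also calls readLine() before regex(...), so the first line is never checked and the final null is passed to the matcher. Here is a rough sketch of one way to collect the matches instead. The class name MatchAll, the helper allMatches, and the sample input are my own, not part of your code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchAll {
    // Hypothetical helper: returns every match of regex in input.
    // find() is called only in the loop condition, so no match is skipped.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        if (input == null) {
            return out; // readLine() returns null at end of stream
        }
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "<!DOCTYPE html>"; // assumed sample input
        System.out.println(allMatches("<!", line));   // [<!]
        System.out.println(allMatches("<|!", line));  // [<, !]
        System.out.println(allMatches("[<!]", line)); // [<, !]
    }
}
```

In your main, the equivalent change would be to call regex(...) on the current line before reading the next one.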