I don't have much experience with regex in Java, but I think this can be solved with a regular expression, and more simply than in my attempts below. I have text containing double-pipe (||) separators. The text can look like:
1) aaa||bbb||ccc
2) aaa||||ccc
3) ||bbb||ccc
4) || ||cccc
etc.
I want to extract the text after the first || (bbb) and the text after the second || (ccc). I tried:
Pattern p = Pattern.compile("||", Pattern.DOTALL);
String types[] = p.split(stringToParse);
but this does not work when the string doesn't have 3 parts.
My second idea is:
Pattern p = Pattern.compile("||", Pattern.DOTALL);
Matcher m= p.matcher(strToParse);
while (m.find()) {
System.out.println(m.group() + " " + m.start() + " " + m.end());
}
Then I know where each || occurs, and I can extract the parts with substring. Is there an easier and simpler way to solve this problem?
As people said above, don't use regex as an HTML parser.
Pattern p = Pattern.compile("(<br>)\\w*(<br>)");
Matcher m= p.matcher(c);
while (m.find()) {
System.out.println(m.group().replace("<br>", ""));// replace <br>.
}
This:
String[] data = {
"aaa||bbb||ccc",
"aaa||||ccc",
"||bbb||ccc",
"|| ||cccc"
};
for (String string : data) {
String[] split = string.split(Pattern.quote("||"));
System.out.println("0:"+split[0] + ", 1:" + split[1] + " 2:" + split[2]);
}
gives:
0:aaa, 1:bbb 2:ccc
0:aaa, 1: 2:ccc
0:, 1:bbb 2:ccc
0:, 1:  2:cccc
Note the escaping of the separator with Pattern.quote(), since | is a special character in regular expressions.
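To illustrate why the quoting matters, here is a minimal sketch (the class name QuoteDemo and its two helper methods are made up for this example). Without quoting, the pattern "||" is read as an alternation of empty alternatives, so it matches the empty string and the input is no longer split only at the separators.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Split on the literal two-character separator "||".
    static String[] splitQuoted(String text) {
        return text.split(Pattern.quote("||"));
    }

    // Split on the raw regex "||", which matches the empty string.
    static String[] splitUnquoted(String text) {
        return text.split("||");
    }

    public static void main(String[] args) {
        String text = "aaa||bbb||ccc";
        // Three fields, split at the separators.
        System.out.println(Arrays.toString(splitQuoted(text)));
        // Many more pieces, because the empty pattern matches everywhere.
        System.out.println(Arrays.toString(splitUnquoted(text)));
        // Escaping each pipe by hand behaves like quoting here.
        System.out.println(Arrays.toString(text.split("\\|\\|")));
    }
}
```

Pattern.quote() is probably the safer habit when the separator comes from a variable, since it does not require knowing which characters are special.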
You've misunderstood the docs for split. This splits the string between, using stringToParse as the delimiter:
String types[] = between.split(stringToParse);
You probably want to split the string stringToParse on the sentinel between:
String types[] = stringToParse.split(between);
For example:
String s = "a:b:c";
String letters[] = s.split(":");
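One caveat with split relates to the "fewer than 3 parts" problem in the question: by default it drops trailing empty strings, so an input such as aaa||bbb|| yields only two elements and indexing the third throws ArrayIndexOutOfBoundsException. Below is a hedged sketch of one way around this, passing a negative limit and bounds-checking the index. The class SplitFields and the helper field are hypothetical names, not taken from the posts above.

```java
import java.util.regex.Pattern;

public class SplitFields {
    // The literal separator, quoted so that | is not treated as regex alternation.
    private static final String SEPARATOR = Pattern.quote("||");

    // Returns the field at the given (non-negative) index,
    // or "" if the input has fewer fields than that.
    static String field(String text, int index) {
        // A negative limit keeps trailing empty strings in the result.
        String[] parts = text.split(SEPARATOR, -1);
        return index < parts.length ? parts[index] : "";
    }

    public static void main(String[] args) {
        String[] data = { "aaa||bbb||ccc", "aaa||bbb||", "aaa" };
        for (String s : data) {
            System.out.println("1:" + field(s, 1) + " 2:" + field(s, 2));
        }
    }
}
```

Returning an empty string for a missing field is only one possible choice; returning null or an Optional may fit better if a missing field needs to be told apart from an empty one.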