
Regex with Pattern and Matcher in Java

I have a text/HTML file and need to extract some information from it, so I use a regex as in the code below.

My problem is that I want the results of pattern p and pattern l to come from the same matcher, because the order of the results is very important. With my code the output is in the wrong order: it prints all the results of pattern p first and then all the results of pattern l.

How can I solve this?

        String pattern1 = "<img class=\"galleryElement shown\" data-src=\"";
        String pattern2 = "\" src=\"\" />";
        String pattern3 = "<img class=\"galleryElement shown\" src=\"";
        String pattern4 = "\" />";
        Pattern p = Pattern.compile(Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2));
        Pattern l = Pattern.compile(Pattern.quote(pattern3) + "(.*?)" + Pattern.quote(pattern4));
        Matcher m = p.matcher(res.toString());
        while (m.find()) {
            System.out.println(m.group(1));
        }

        Matcher n = l.matcher(res.toString());
        while (n.find()) {
            System.out.println(n.group(1));
        }

Since you have not spelled out the exact rules, this is only an attempt. Let me know if the pattern needs any modification:

<img class="galleryElement shown" (data-)?src="([^"]*?)"


Pattern explanation: (data-)?src="([^"]*?)"

  (                        group and capture to \1 (optional):
    data-                    'data-'
  )?                       end of \1
  src="                    'src="'
  (                        group and capture to \2:
    [^"]*?                   any character except '"' (0 or more times, as few as possible)
  )                        end of \2
  "                        '"'

Sample code:

String pattern = "<img class=\"galleryElement shown\" (data-)?src=\"([^\"]*?)\"";
Pattern p = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("<img class=\"galleryElement shown\" data-src=\"abc\" />");
while (m.find()) {
    System.out.println(m.group(2));
}
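
To show that a single matcher really does return both kinds of tag in document order, here is a minimal, self-contained sketch. It reuses the combined pattern from above; the class name GalleryImageSources, the method extractSources, and the sample HTML (including the file names) are made up for illustration and are not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GalleryImageSources {
    // Same combined pattern as above: group 1 is the optional "data-" prefix,
    // group 2 is the attribute value.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img class=\"galleryElement shown\" (data-)?src=\"([^\"]*?)\"",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the src/data-src values in the order
    // the tags appear in the input.
    static List<String> extractSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(2));
        }
        return sources;
    }

    public static void main(String[] args) {
        // Invented sample input that mixes both tag variants.
        String html =
                "<img class=\"galleryElement shown\" src=\"first.jpg\" />"
              + "<img class=\"galleryElement shown\" data-src=\"second.jpg\" src=\"\" />"
              + "<img class=\"galleryElement shown\" src=\"third.jpg\" />";

        for (String source : extractSources(html)) {
            System.out.println(source);
        }
    }
}
```

Because find() scans the input once from left to right, the values come out in the order the tags occur, regardless of which variant each tag uses. The empty src="" that follows a data-src attribute is not reported, since the pattern only matches directly after the <img class="galleryElement shown" prefix.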

