
How to collect different parts of log with regex

I need to collect information from logs. Unfortunately, the pieces of information are not placed together; there are other entries between them.

For example, I would like to know who the parents of a newborn child are. The log looks like this:

[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.
[Mar-27-2019 20:17:32]*** Started adoption of Ninon Caron for Jacqueline Leduc and Don Lothario.
[Mar-27-2019 20:17:32]*** Started adoption of Emile François for Marion Boyer and Paolo Rocca.
[Mar-27-2019 20:17:32]Started 4 pregnancies
[Mar-27-2019 20:17:32]*** Started pet pregnancy for Josie with Bartholomiaou A. Bittlebun Senior.
[Mar-27-2019 20:17:32]*** Started pet pregnancy for Blue with Tempête Romeo.
[Mar-27-2019 20:17:32]Started 2 pet pregnancies
[Mar-27-2019 20:17:32]Checking for random marriage
(...)
[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Zélie Landgraab
[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Jessica Goth
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.

So what I need to collect together is:

[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Jessica Goth
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.

Is there a simple way to do that in Java?

One option is to design an expression that checks, with a lookahead, whether a line contains one of the keywords we are interested in, such as:

^(?=.*(?:delivered|\*\*\*\s+Started\s+pregnancy)).*$

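and then collect the matching lines. Note that this keeps the pregnancy and delivery lines of every parent, not only Bella Goth, and it skips the indented baby-name lines (such as "* Jessica Goth"), because they contain neither keyword.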
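Since the expected output concerns a single parent, the same line-filtering idea can be narrowed to one name. The following is only a minimal sketch: the class `ParentLineFilter` and the helper `linesFor` are hypothetical names, and it assumes the parent's name follows the closing `]` of the timestamp directly, as in the sample log. The indented baby lines do not name the parent, so this sketch does not return them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ParentLineFilter {

    // Hypothetical helper: returns the log lines that start a pregnancy
    // for `parent` or report a delivery by `parent`.
    static List<String> linesFor(String log, String parent) {
        // Pattern.quote escapes any regex metacharacters in the name.
        String name = Pattern.quote(parent);
        Pattern wanted = Pattern.compile(
                "\\*\\*\\*\\s+Started\\s+pregnancy\\s+for\\s+" + name + "\\s+with\\b"
                + "|\\]" + name + "\\s+delivered\\b");
        List<String> result = new ArrayList<>();
        // \R matches any line break, so both \n and \r\n logs work.
        for (String line : log.split("\\R")) {
            if (wanted.matcher(line).find()) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String log = "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
                + "[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.\n"
                + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.";
        // Prints the first and the third line of the log above.
        for (String line : linesFor(log, "Bella Goth")) {
            System.out.println(line);
        }
    }
}
```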

Demo

The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineFilter {
    public static void main(String[] args) {
        // Keep every line that contains "delivered" or "*** Started pregnancy".
        final String regex = "^(?=.*(?:delivered|\\*\\*\\*\\s+Started\\s+pregnancy)).*$";
        final String string = "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
             + "[Mar-27-2019 20:17:32]*** Started adoption of Ninon Caron for Jacqueline Leduc and Don Lothario.\n"
             + "[Mar-27-2019 20:17:32]*** Started adoption of Emile François for Marion Boyer and Paolo Rocca.\n"
             + "[Mar-27-2019 20:17:32]Started 4 pregnancies\n"
             + "[Mar-27-2019 20:17:32]*** Started pet pregnancy for Josie with Bartholomiaou A. Bittlebun Senior.\n"
             + "[Mar-27-2019 20:17:32]*** Started pet pregnancy for Blue with Tempête Romeo.\n"
             + "[Mar-27-2019 20:17:32]Started 2 pet pregnancies\n"
             + "[Mar-27-2019 20:17:32]Checking for random marriage\n"
             + "(...)\n"
             + "[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]   Female delivered:\n"
             + "[Mar-28-2019 09:54:54]   * Zélie Landgraab\n"
             + "[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]   Female delivered:\n"
             + "[Mar-28-2019 09:54:54]   * Jessica Goth\n"
             + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.";

        // MULTILINE makes ^ and $ match at the start and end of every line.
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // The expression has no capturing groups, so group(0) is the whole line.
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
        }
    }
}
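To actually link each baby to its parents, the lines have to be read in order while remembering some state. The following is a rough sketch under stated assumptions, not a definitive parser: the class `BirthLogParser`, the method `parentsOfBabies`, and the plural form "babies" are hypothetical, and it assumes (as in the sample log) that each delivery block opens and closes with the same "delivered" line. A parent whose "Started pregnancy" line is missing from the log is reported with the partner "unknown".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BirthLogParser {

    // "[timestamp]*** Started pregnancy for <mother> with <partner>."
    private static final Pattern STARTED = Pattern.compile(
            "\\[[^\\]]*\\]\\*\\*\\*\\s+Started pregnancy for (?<mother>.+?) with (?<partner>.+)\\.");
    // "[timestamp]<mother> delivered <n> baby." ("babies" is an assumed plural)
    private static final Pattern DELIVERED = Pattern.compile(
            "\\[[^\\]]*\\](?<mother>\\S.*?) delivered \\d+ bab(?:y|ies)\\.");
    // "[timestamp]   * <baby>"
    private static final Pattern BABY = Pattern.compile(
            "\\[[^\\]]*\\]\\s+\\*\\s+(?<baby>.+)");

    // Returns one entry per baby, formatted as "<baby>: <mother> and <partner>".
    static List<String> parentsOfBabies(String log) {
        Map<String, String> partnerByMother = new HashMap<>();
        List<String> result = new ArrayList<>();
        String currentMother = null; // mother of the delivery block being read
        for (String line : log.split("\\R")) {
            Matcher m;
            if ((m = STARTED.matcher(line)).matches()) {
                partnerByMother.put(m.group("mother"), m.group("partner"));
            } else if ((m = DELIVERED.matcher(line)).matches()) {
                String mother = m.group("mother");
                // The first "delivered" line opens a block, the repeated one closes it.
                currentMother = mother.equals(currentMother) ? null : mother;
            } else if (currentMother != null && (m = BABY.matcher(line)).matches()) {
                String partner = partnerByMother.getOrDefault(currentMother, "unknown");
                result.add(m.group("baby") + ": " + currentMother + " and " + partner);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String log = "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
                + "[Mar-27-2019 20:17:32]Started 4 pregnancies\n"
                + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.\n"
                + "[Mar-28-2019 09:54:54]   Female delivered:\n"
                + "[Mar-28-2019 09:54:54]   * Jessica Goth\n"
                + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.";
        // Prints: Jessica Goth: Bella Goth and Vladimir Goth
        parentsOfBabies(log).forEach(System.out::println);
    }
}
```

Because "Started pregnancy" is matched literally, the "Started pet pregnancy" and "Started adoption" lines are ignored; separate patterns would be needed to handle them.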

RegEx Circuit

jex.im visualizes regular expressions:

[image: diagram of the regular expression]


 