Hello, I'm kind of new to regex and have a small, maybe simple question.
I have the given text:
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
My current regex
(\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?(.*)
matches only up to "sleeping", but it correctly creates 3 matches. I also need the "Additional test" text in the second group. I tried something like
(\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?([,.:\w\s]*)
but now I get only one huge match, because the second group takes everything until the end.
How can I match everything until a new line starting with a date begins, and start a new match from there?
If you are sure there is only one additional line to be matched you can use
(?m)^(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})\s*(.*(?:\n.*)?)
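As a quick check, the pattern above can be run against the sample text from the question. This is only a minimal sketch in Python (the question does not name a language, so that choice and the variable names are assumptions); the pattern itself is unchanged.

```python
import re

# Sample text from the question: each record is a dated line plus one extra line.
text = (
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test"
)

# (?m) lets ^ match at every line start; (?:\n.*)? pulls in at most one extra line.
pattern = r"(?m)^(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})\s*(.*(?:\n.*)?)"

# findall returns one (group 1, group 2) tuple per match.
single_line_matches = re.findall(pattern, text)

for stamp, body in single_line_matches:
    print(repr(stamp), repr(body))
```

Each tuple should hold the datetime in the first slot and both text lines, joined by a newline, in the second.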
See the regex demo. Details:
(?m) - a multiline modifier
^ - start of a line
(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2}) - Group 1: a datetime string
\s* - zero or more whitespaces
(.*(?:\n.*)?) - Group 2: any zero or more chars other than a newline char, as many as possible, and then an optional line: a newline followed with any zero or more chars other than a newline char, as many as possible.
If there can be any amount of lines, you may consider
(?m)^(\d{2}\.\d{2}\.\d{4}[\p{Zs}\t]\d{2}:\d{2})[\p{Zs}\t]*(?s)(.*?)(?=\n\d{2}\.\d{2}\.\d{4}|\z)
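The pattern above relies on syntax that not every engine accepts. As a rough sketch, here is how it might be adapted to Python's re module, which has no \p{Zs} and rejects a mid-pattern (?s). The substitutions are assumptions made for this illustration: [^\S\r\n] stands in for the horizontal-whitespace class (it is slightly broader), the scoped group (?s:...) replaces the inline (?s), and \Z is used as the end-of-string anchor. The "Another line" row is a hypothetical addition to the question's sample.

```python
import re

# Question's sample, with one hypothetical extra line in the first record.
text = (
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "Another line\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test"
)

# Python adaptation (assumed equivalents, see the lead-in):
#   [\p{Zs}\t] -> [^\S\r\n]     whitespace except line breaks
#   (?s)(.*?)  -> ((?s:.*?))    dot matches newlines only inside this group
#   \z         -> \Z            end of string
lazy_pattern = (
    r"(?m)^(\d{2}\.\d{2}\.\d{4}[^\S\r\n]\d{2}:\d{2})[^\S\r\n]*"
    r"((?s:.*?))(?=\n\d{2}\.\d{2}\.\d{4}|\Z)"
)

# The lazy group grows until the lookahead sees "\n" + date, or the end of the text.
lazy_matches = re.findall(lazy_pattern, text)

for stamp, body in lazy_matches:
    print(repr(stamp), repr(body))
```

The first record should pick up both extra lines, and the second record should run to the end of the string.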
See this regex demo. Here,
(?m)^(\d{2}\.\d{2}\.\d{4}[\p{Zs}\t]\d{2}:\d{2}) - matches the same as above, just \s is replaced with [\p{Zs}\t] that only matches horizontal whitespace
[\p{Zs}\t]* - 0+ horizontal whitespace chars
(?s) - now, . will match any chars including a newline
(.*?) - Group 2: any zero or more chars, as few as possible
(?=\n\d{2}\.\d{2}\.\d{4}|\z) - up to the leftmost occurrence of a newline followed with a date string, or up to the end of string.
You are repeating \s by applying the * quantifier to the character class [,.:\w\s]*, and since \s also matches newlines, the class will match too much.
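The over-matching described above is easy to reproduce. The following is a small Python sketch (the language and variable names are assumptions, not part of the question) that runs the asker's second attempt on the sample text.

```python
import re

# Sample text from the question.
text = (
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test"
)

# The asker's second attempt: \s inside the class also matches "\n", and
# digits, dots and colons are allowed too, so nothing stops the second group.
greedy_pattern = r"(\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?([,.:\w\s]*)"

greedy_matches = re.findall(greedy_pattern, text)
print(len(greedy_matches))

# \s really does match a line break on its own.
newline_is_whitespace = re.fullmatch(r"\s", "\n") is not None
print(newline_is_whitespace)
```

Because every character in the sample belongs to the class, the second group runs through the later date lines to the end of the text, which leaves a single match.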
You can just match the rest of the line and the next one using (.*\r?\n.*): the first .* matches the rest of the line (the dot does not match a newline), \r?\n matches the line break, and the second .* matches the next line, all in the same group.
^(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})\s?(.*\r?\n.*)
If multiple lines can follow, match all following lines that do not start with a date-like pattern.
^(\d{2}\.\d{2}\.\d{4})\s*(.*(?:\r?\n(?!\d{2}\.\d{2}\.\d{4}).*)*)
Explanation
^ - Start of the string (or of each line, when the multiline flag is enabled)
( - Capture group 1
\d{2}\.\d{2}\.\d{4} - Match a date-like pattern
) - Close group 1
\s* - Match 0+ whitespace chars (or use [^\S\r\n]* to match whitespace chars without newlines)
( - Capture group 2
.* - Match the whole line
(?:\r?\n(?!\d{2}\.\d{2}\.\d{4}).*)* - Optionally repeat matching the whole line if it does not start with a date-like pattern
) - Close group 2
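The negative-lookahead pattern explained above can be sketched in Python as follows. The language, the variable names and the "Another line" row are assumptions added for illustration; the multiline flag is passed explicitly so that ^ matches at each line start.

```python
import re

# Question's sample, with one hypothetical extra line in the first record.
text = (
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test\n"
    "Another line\n"
    "17.11.2020 15:32 typical Pat. seems sleeping\n"
    "Additional test"
)

# Group 2 takes the rest of the dated line, then keeps adding lines
# as long as the next line does not start with a date-like pattern.
pattern = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4})\s*(.*(?:\r?\n(?!\d{2}\.\d{2}\.\d{4}).*)*)",
    re.MULTILINE,
)

# One (date, list of lines) pair per match.
records = [(m.group(1), m.group(2).splitlines()) for m in pattern.finditer(text)]

for date, lines in records:
    print(date, lines)
```

Note that group 1 in this pattern holds only the date, so the time ends up at the start of group 2. Adding \s\d{2}:\d{2} inside the first group, as in the earlier patterns, would move it back.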