I'm trying to figure out how to come up with one regex that supports the following 2 use cases:
Use Case 1:
-- File 1 (input) --
keepthis
junkhere:
this should be removed
Use Case 2:
-- File 2 (input) --
keepthis
------------
junkhere:
this should be removed
Essentially I'm building one regex to remove everything from "junkhere:" down. However, in use case 2 a line of dashes ("------------") sometimes appears on the line before "junkhere:" (I'm not sure of the exact number of -'s), and that line should be removed too.
Output should be:
-- File 3 (output) --
keepthis
I have the following regex and it works for use case 1 but not for use case 2:
Pattern JUNKHERE_REGEX = Pattern.compile("^(((-+)(.*))?junkhere:(.*))$", Pattern.MULTILINE | Pattern.DOTALL);
Matcher m = JUNKHERE_REGEX.matcher(text); // text holds the input from either file1 or file2
if (m.find() || n.find() || o.find()) { // n and o are other matchers; I'd like to keep the replaceAll code below the same so I don't have to create a new if statement
text = m.replaceAll("");
text = text.replaceAll("[\n]+$", ""); // delete any trailing newlines
}
System.out.println(text); // should echo "keepthis"
I'm not that good with regexes, but what do I need to change to make this work for use case 2 (and still for use case 1)?
Thanks!
Replace the match of
[\n\r]+(?:[-]+[\n\r]+)?\s*junkhere:\s*[\n\r][\s\S]*
with an empty string.
Test it here: http://regexr.com?37edu and here: http://regexr.com?37ee1
In a Java string literal you have to double the backslashes:
text = text.replaceAll("[\\n\\r]+(?:[-]+[\\n\\r]+)?\\s*junkhere:\\s*[\\n\\r][\\s\\S]*", "");
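To check the pattern against both use cases, here is a minimal self-contained sketch. The class name JunkStripper and the method strip are hypothetical names made up for this illustration, and the sample strings assume the files use plain \n line endings. The pattern requires a line break before the junk block and another one after "junkhere:", so it would not match if "junkhere:" were the first line or the last line of the input.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the names are not from the original post.
class JunkStripper {
    // Same pattern as in the answer, compiled once and reused.
    // The optional group (?:[-]+[\n\r]+)? absorbs the dashed line when it is present.
    private static final Pattern JUNK = Pattern.compile(
            "[\\n\\r]+(?:[-]+[\\n\\r]+)?\\s*junkhere:\\s*[\\n\\r][\\s\\S]*");

    // Removes everything from the line break before the junk block to the end of the text.
    static String strip(String text) {
        return JUNK.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed contents of File 1 and File 2 from the question.
        String file1 = "keepthis\njunkhere:\nthis should be removed";
        String file2 = "keepthis\n------------\njunkhere:\nthis should be removed";

        System.out.println(strip(file1)); // prints "keepthis"
        System.out.println(strip(file2)); // prints "keepthis"
    }
}
```

Because the match starts at the line break before the junk block, the separate replaceAll("[\n]+$", "") step from the question is not needed for these inputs.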