I want to replace "&" with a random word "$d" in a given sentence. Can we replace only those words which start with & and are followed by a single character and a space?
Example:-
Input:-
Two literals are &a and &b and also check &abc and &bac here.
Output:-
Two literals are $da and $db and also check &abc and &bac here.
In the input above, the only words that should be changed are &a and &b, because these two start with & and are followed by a single character and a space. The whole word should not be replaced, only the '&' in each of them.
In the case of the replaceAll() function, it replaces the entire word when I used regex:-
String str="Two literals are &a and &b and also check &abc and &bac here.";
str = str.replaceAll("\\&[a-zA-Z]{1}\\s", "\\$d");
System.out.println(str);
//output for this:-Two literals are $d and $d and also check &abc and &bac here.
//expected output:-Two literals are $da and $db and also check &abc and &bac here.
The correct code for this would be
str.replaceAll("&([a-zA-Z]\\s)", "\\$d$1")
This is an example of backreferencing captured groups in regex, and here is a nice reference for it. Additionally, here's a relevant StackOverflow question about it.
Essentially, the part of the pattern inside the parentheses, [a-zA-Z]\\s, matches a single letter followed by a whitespace character. The value of this match can be referenced with $1, since it is capturing group 1. So we replace &(a ) with $d(a ) (brackets here to demonstrate what is captured). Credit to u/rzwitserloot for reminding me that OP wants $ not &.
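To see the capturing-group version end to end, here is a minimal sketch using the pattern from this answer. The class and method names (CaptureGroupDemo, replaceWithGroup) are made up for illustration.

```java
class CaptureGroupDemo {
    // Replaces '&' with "$d" when it is followed by exactly one letter and a whitespace character.
    // Group 1 captures the letter plus the whitespace, and $1 puts them back into the result.
    static String replaceWithGroup(String input) {
        return input.replaceAll("&([a-zA-Z]\\s)", "\\$d$1");
    }

    public static void main(String[] args) {
        String str = "Two literals are &a and &b and also check &abc and &bac here.";
        // prints: Two literals are $da and $db and also check &abc and &bac here.
        System.out.println(replaceWithGroup(str));
    }
}
```

Because the pattern consumes the whitespace after the letter, an &a at the very end of the input, or one followed by punctuation, is left untouched.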
You presumably want a concept called look-ahead: you can match on things being there without 'consuming' them. You can even match on things NOT being there. That's what you want here: match the & only if it is followed by a letter ([a-zA-Z]) and, looking ahead past that letter, we do NOT see another letter:
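// needs: import java.util.List;
for (String test : List.of("Two literals are &a and &bcd", "A literal is &a", "How about &a?")) {
    // call replaceAll on the loop variable, not on an unrelated 'str'
    System.out.println(test.replaceAll("&(?=[a-zA-Z](?![a-zA-Z]))", "\\$d"));
}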
Perhaps instead you want the single letter to be followed by any word break (i.e. &z00 should NOT turn into $dz00, even though there is no letter after the z). Then I suggest:
"&(?=[a-zA-Z]\\b)"
That's a lot simpler to read!
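The difference between the two patterns only shows up on inputs like &z00, so here is a small side-by-side sketch. The class and method names (LookaheadDemo, noLetterAfter, wordBreakAfter) are hypothetical; the two regexes are the ones given above.

```java
import java.util.List;

class LookaheadDemo {
    // '&' followed by one letter that is NOT followed by another letter.
    static String noLetterAfter(String input) {
        return input.replaceAll("&(?=[a-zA-Z](?![a-zA-Z]))", "\\$d");
    }

    // '&' followed by one letter that sits directly before a word break.
    static String wordBreakAfter(String input) {
        return input.replaceAll("&(?=[a-zA-Z]\\b)", "\\$d");
    }

    public static void main(String[] args) {
        for (String test : List.of("Two literals are &a and &bcd", "How about &a?", "Check &z00")) {
            System.out.println(noLetterAfter(test) + " | " + wordBreakAfter(test));
        }
        // The first two inputs give the same result with both patterns.
        // "Check &z00" becomes "Check $dz00" with the first pattern and stays unchanged with the second,
        // because a digit is not a letter but still counts as a word character.
    }
}
```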
A few notes:
(?=x) is 'positive lookahead'. It doesn't itself match anything but makes the match fail if x is not immediately following the match.
(?!x) is 'negative lookahead'. It doesn't itself match anything but makes the match fail if x is immediately following the match.
$ has special meaning in the replacement part so we need to escape it.
\\b is regexpese for 'word break': Doesn't match any characters, but fails if we aren't on a 'word break'. Spaces, dots, end-of-input, end-of-line, a dash, an ampersand - many things are word breaks.
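On the point about escaping $: if the replacement text is not a fixed literal, one option is to let Matcher.quoteReplacement from java.util.regex do the escaping instead of writing the backslashes by hand. A minimal sketch, with a hypothetical class and method name (QuoteReplacementDemo, replaceAmpersand):

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces '&' before a single letter with the given text, taken literally.
    // quoteReplacement escapes '$' and '\' so they are not read as group references.
    static String replaceAmpersand(String input, String literalReplacement) {
        String safe = Matcher.quoteReplacement(literalReplacement);
        return input.replaceAll("&(?=[a-zA-Z]\\b)", safe);
    }

    public static void main(String[] args) {
        // prints: How about $da?
        System.out.println(replaceAmpersand("How about &a?", "$d"));
    }
}
```

Passing "$d" straight to replaceAll without escaping would be treated as a reference to a group named by 'd', which is not valid, so the explicit escape (or quoteReplacement) is needed.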