I'm trying to split a string on whitespace and on non-word characters, keeping each non-word character as its own token, except that apostrophes should not be treated as split points.
I can already split on whitespace and keep the non-word characters, but I can't figure out how to exclude apostrophes from the non-word characters.
This is my current Regex...
str.split("\\s|(?=\\W)");
...which when run on this code sample:
program p;
begin
write('x');
end.
...produces this result:
program
p
;
begin
write
(
'x <!-- This is the problem.
'
)
;
end
.
Which is almost correct, but my goal is to skip the apostrophes so that this is the result:
program
p
;
begin
write
(
'x' <!-- This is the wanted result.
)
;
end
.
UPDATE
As suggested I've tried:
str.split("\\s|(?=\\W)(?<=\\W)");
Which almost works, but does not split all of the special characters correctly:
program
p;
begin
write(
'x'
)
;
end.
Have you tried...
[^\w']
This will match any character that is neither a word character nor an apostrophe. May be simple enough to work depending on your inputs.
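If you run a replace operation using ([^\\w'])
as your regex and \n$1\n
as your replacement string, it should get you close to where you'd like to be. (The parentheses are needed so that $1 has a group to refer to; Java replacement strings use $1 rather than \\1.)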
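A minimal sketch of the replace-based idea, assuming Java and assuming a follow-up split on whitespace to collect the pieces (that second step is not part of the answer). The class and method names are hypothetical. Note that Java needs a capture group and `$1` in the replacement for this to work.

```java
import java.util.Arrays;

class ReplaceThenSplit {
    // Hypothetical helper: isolate punctuation with newlines, then split on whitespace.
    static String[] tokenize(String source) {
        // Wrap every character that is neither a word character nor an
        // apostrophe in newlines, so it ends up on a line of its own.
        String spaced = source.replaceAll("([^\\w'])", "\n$1\n");
        // Whitespace is now the only separator; trim() avoids an empty
        // first token when the input begins with a delimiter.
        return spaced.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String sample = "program p;\nbegin\nwrite('x');\nend.";
        System.out.println(Arrays.toString(tokenize(sample)));
    }
}
```

On the sample from the question, this keeps 'x' together as a single token, because the apostrophe is excluded from the character class.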
Treat the apostrophe separately, requiring a preceding non-word character:
str.split("\\s+|(?=[^\\w'])|(?<=\\W)(?=')");
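To see how the three alternatives in that pattern interact, here is a small sketch that runs it on the sample from the question. The class and method names are made up for illustration; only the regex comes from the answer.

```java
import java.util.Arrays;

class ApostropheSplit {
    // Pattern from the answer, as a Java string literal:
    //   \s+             consume runs of whitespace
    //   (?=[^\w'])      split before any non-word character except an apostrophe
    //   (?<=\W)(?=')    split before an apostrophe only if a non-word character precedes it
    static final String DELIMITERS = "\\s+|(?=[^\\w'])|(?<=\\W)(?=')";

    static String[] tokenize(String source) {
        return source.split(DELIMITERS);
    }

    public static void main(String[] args) {
        String sample = "program p;\nbegin\nwrite('x');\nend.";
        // Print one token per line, like the listings in the question.
        for (String token : tokenize(sample)) {
            System.out.println(token);
        }
    }
}
```

The closing apostrophe in 'x' is preceded by the word character x, so the lookbehind fails there and the literal stays in one piece, while the opening apostrophe follows ( and is split off from it.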
As an alternative, you can scan the string for matches of \\b[\\w']+\\b
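A rough sketch of the scanning approach, using a Matcher to collect tokens instead of splitting. The pattern suggested above only matches runs of word characters and apostrophes, so punctuation would be dropped. This sketch therefore swaps in a modified pattern, which is an assumption and not part of the answer: it drops the \b anchors and adds a second alternative for single punctuation characters. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenScanner {
    // Assumed variant of the suggested pattern:
    //   [\w']+      a run of word characters and apostrophes (e.g. 'x')
    //   [^\w'\s]    any single remaining non-whitespace character (e.g. ; or '(')
    private static final Pattern TOKEN = Pattern.compile("[\\w']+|[^\\w'\\s]");

    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        // Each successful find() yields the next token; whitespace is never matched.
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("program p;\nbegin\nwrite('x');\nend."));
    }
}
```

Scanning describes what a token is rather than where to cut, which can be easier to extend later, for instance if quoted strings need to contain spaces.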