Splitting strings & Pattern matching in Java
I have the following String:
MYLMFILLAAGCSKMYLLFINNAARPFASSTKAASTVVTPHHSYTSKPHHSTTSHCKSSD
I want to split this string every time a K or R is encountered, except when it is followed by a P.
Therefore, I want the following output:
MYLMFILLAAGCSK
MYLLFINNAARPFASSTK
AASTVVTPHHSYTSKPHHSTTSHCK
SSD
At first, I tried the simple .split() function in Java, but I couldn't get the desired result, because I don't know how to tell .split() not to split when a P comes right after the K or R.
I've looked at other similar questions, and they suggest using pattern matching, but I don't know how to use it in this context.
You can use split:
String[] parts = str.split("(?<=[KR])(?!P)");
Because you want to keep the characters you're splitting on, you must use look-arounds, which assert without consuming. There are two look-arounds here:
(?<=[KR]) is a look-behind that means "the previous char is either K or R"
(?!P) is a negative look-ahead that means "the next char is not a P"
This regex matches between the characters where you want to split.
Some test code:
String str = "MYLMFILLAAGCSKMYLLFINNAARPFASSTKAASTVVTPHHSYTSKPHHSTTSHCKSSD";
Arrays.stream(str.split("(?<=[KR])(?!P)")).forEach(System.out::println);
Output:
MYLMFILLAAGCSK
MYLLFINNAARPFASSTK
AASTVVTPHHSYTSKPHHSTTSHCK
SSD
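The question also asks about pattern matching. As a minimal sketch, the same rule can be expressed with Pattern and Matcher instead of split: each match is the shortest run that ends in a K or R not followed by P, and a second alternative picks up whatever is left at the end. The class name KrSplitter and the method name pieces are made up for this illustration, and the sketch assumes the input contains no line breaks (the dot does not match them by default).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this sketch.
class KrSplitter {
    // First alternative: shortest run ending in K or R that is not followed by P.
    // Second alternative: the remaining tail when no further split point exists.
    private static final Pattern PIECE = Pattern.compile(".*?[KR](?!P)|.+");

    static List<String> pieces(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PIECE.matcher(input);
        while (m.find()) {
            result.add(m.group()); // each match is one output piece
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "MYLMFILLAAGCSKMYLLFINNAARPFASSTKAASTVVTPHHSYTSKPHHSTTSHCKSSD";
        pieces(str).forEach(System.out::println);
    }
}
```

For the string from the question this prints the same four lines as the split version; split is shorter, but the Matcher loop can be handy if you also need the match positions (m.start() and m.end()).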
Just try this regexp:
([KR])([^P]|$)
and substitute each match with
\1\n\2
as illustrated in the linked demo. No negative lookahead is needed. But you cannot use it directly with split, because split would also remove the non-P character that follows the K or R.
You can do a first transform like the one above, and then .split("\n");
so it should be (in Java, replaceAll performs the substitution, and group references in the replacement are written $1 and $2):
"MYLMFILLAAGCSKMYLLFINNAARPFASSTKAASTVVTPHHSYTSKPHHSTTSHCKSSDK"
.replaceAll("([KR])([^P]|$)", "$1\n$2").split("\n");
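One difference between the two approaches is worth checking before choosing between them: the replace-based pattern consumes the character after the K or R, so that character cannot itself start a new match. The sketch below compares both on a made-up input with two adjacent K's. The class and method names are illustrative only, and it assumes String.replaceAll as the Java form of the substitution step, with [KR] in the first group so that R is handled as well.

```java
import java.util.Arrays;

// Hypothetical comparison class, named only for this sketch.
class SplitComparison {
    // Zero-width split: after K or R, unless the next char is P.
    static String[] splitWithLookarounds(String s) {
        return s.split("(?<=[KR])(?!P)");
    }

    // Insert a newline after K or R (when not followed by P), then split on it.
    static String[] splitViaReplace(String s) {
        return s.replaceAll("([KR])([^P]|$)", "$1\n$2").split("\n");
    }

    public static void main(String[] args) {
        String sample = "AKKB"; // made-up input with adjacent split letters
        System.out.println(Arrays.toString(splitWithLookarounds(sample))); // [AK, K, B]
        System.out.println(Arrays.toString(splitViaReplace(sample)));      // [AK, KB]
    }
}
```

On the string from the question both methods return the same four pieces, so the difference only shows up when a K or R is immediately followed by another K or R.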