
Java split regexp when delimiter is part of the data

Sorry if this question has already been answered or closed; I have searched for a long time without finding an answer.

I have to split lines that I receive from an external system, using ~ as the delimiter.

The problem is that some data contains ~~ (two consecutive ~ characters), and in that case the data must not be split.

So if I receive A~B~C~~C~D, I want the split to return: A, B, C~~C, D

I cannot figure out which regular expression to use so that ~~ is not split.

You can split on a ~ that has a word boundary on both sides:

\b~\b

See demo.

https://regex101.com/r/t3D2Jp/1
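
As a rough illustration of how the word-boundary pattern behaves with Java's String.split, here is a minimal sketch. The class name WordBoundarySplit and the second sample input are made up for this example; the sketch assumes that every field starts and ends with a word character (letter, digit or underscore), because \b only matches next to such characters.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class WordBoundarySplit {
    // Splits on a single ~ that sits directly between two word characters.
    static String[] split(String line) {
        return line.split("\\b~\\b");
    }

    public static void main(String[] args) {
        // Input from the question: the doubled ~~ is left intact.
        System.out.println(Arrays.toString(split("A~B~C~~C~D"))); // prints [A, B, C~~C, D]
        // Assumed extra input: the field "-1" starts with a non-word character,
        // so there is no word boundary after the first ~ and it is not split.
        System.out.println(Arrays.toString(split("A~-1~B")));     // prints [A~-1, B]
    }
}
```

The second call shows why this approach is only safe when the fields are known to begin and end with word characters.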

You can use

(?:^|\b)~(?:$|\b)

if you also want a leading or trailing ~ to be treated as a delimiter, so that it is removed from the fields.
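
To make the difference between the two word-boundary patterns concrete, the sketch below runs both on a line that starts and ends with ~. The class name EdgeTildeSplit, the method names and the sample input are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class EdgeTildeSplit {
    // Plain word-boundary version: a ~ at the very start or end is not matched.
    static String[] splitInner(String line) {
        return line.split("\\b~\\b");
    }

    // Variant that also accepts start of input before the ~ and end of input after it.
    static String[] splitWithEdges(String line) {
        return line.split("(?:^|\\b)~(?:$|\\b)");
    }

    public static void main(String[] args) {
        String line = "~A~B~C~~C~D~"; // assumed sample input
        // The outer ~ characters stay attached to the first and last fields.
        System.out.println(Arrays.toString(splitInner(line)));     // prints [~A, B, C~~C, D~]
        // The outer ~ characters are consumed as delimiters.
        System.out.println(Arrays.toString(splitWithEdges(line))); // prints [, A, B, C~~C, D]
    }
}
```

Note that String.split keeps the empty string produced by the leading delimiter but drops trailing empty strings, which is why the second result starts with an empty element.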

You can use (?<!~)~(?!~), which matches a ~ only if a negative look-behind and a negative look-ahead confirm that no other ~ is adjacent to it.

Example

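import java.util.Arrays;

public class SplitExample {
    public static void main(String[] args) {
        String test = "A~B~C~~D~E";
        // Split only on a ~ that is neither preceded nor followed by another ~
        System.out.println(Arrays.toString(test.split("(?<!~)~(?!~)")));
    }
}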

Output

[A, B, C~~D, E]

This also works with runs of more than two consecutive ~ characters, e.g. "A~B~C~~~D~E": no ~ inside such a run is matched.
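
If many lines have to be split, the look-around pattern could be compiled once and reused. The following is only a sketch: the class name LoneTildeSplitter, the constant name and the choice of limit -1 (which keeps trailing empty fields) are assumptions, not something stated in the answer above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class LoneTildeSplitter {
    // A ~ that is neither preceded nor followed by another ~.
    private static final Pattern LONE_TILDE = Pattern.compile("(?<!~)~(?!~)");

    static String[] split(String line) {
        // Limit -1 keeps trailing empty fields; this is an assumed requirement.
        return LONE_TILDE.split(line, -1);
    }

    public static void main(String[] args) {
        // A run of three ~ characters is left untouched.
        System.out.println(Arrays.toString(split("A~B~C~~~D~E"))); // prints [A, B, C~~~D, E]
        // A single trailing ~ yields an empty last field because of limit -1.
        System.out.println(Arrays.toString(split("A~B~")));        // prints [A, B, ]
    }
}
```

Unlike the word-boundary patterns, this version does not depend on which characters the fields contain, only on whether the neighbouring character is another ~.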
