
How to escape delimiter in data?

I have a list of titles that I want to save as a String:
- title1
- title2
- title|3

Now, I want to save this as a single-line String delimited by |, which means it ends up like this: title1|title2|title|3.

But when I split the String:

String input = "title1|title2|title|3";
String[] splittedInput = input.split("\\|");

splittedInput will be the following array: {"title1", "title2", "title", "3"}.

This is not what I want: the third entry of the array should be title|3.

My question: how do I correctly escape the | in the titles so that, when I split the String, I end up with the correct array of three titles instead of four?


@Gábor Bakos

Running this code snippet:

String input = "title1|title2|title\\|3";
String[] split = input.split("(?<!\\\\)\\|");

for (int i = 0; i < split.length; i++) {
    split[i] = split[i].replace("\\\\(?=\\|)", "");
}
System.out.println(Arrays.toString(split));

I get this output: [title1, title2, title\|3]. What am I doing wrong?

You can use any character as the escape, for example a backslash (\):

 "title1|title2|title\\|3".split("(?<!\\\\)\\|").map(_.replaceAll("\\\\(?=\\|)", "")) //Scala syntax

Resulting:

  Array(title1, title2, title|3)

The final mapping is required to remove the escaping character too.

(?<!\\\\) is a negative lookbehind (the | must not be preceded by a backslash), and (?=\\|) is a lookahead that limits the removal to backslashes directly followed by |.
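
For readers working in Java rather than Scala, the same idea might look like the sketch below. The class and method names (LookbehindSplit, splitEscaped) are made up for illustration. Note that it calls replaceAll, which treats its first argument as a regular expression; String.replace searches for the literal text instead, which is why the snippet in the question leaves the backslash in place.

```java
import java.util.Arrays;

// Hypothetical helper class, named for illustration only.
final class LookbehindSplit {

    // Splits on every '|' that is not directly preceded by a backslash,
    // then drops the backslash in front of each escaped '|'.
    static String[] splitEscaped(String input) {
        String[] parts = input.split("(?<!\\\\)\\|");
        for (int i = 0; i < parts.length; i++) {
            // replaceAll interprets its first argument as a regex;
            // String.replace would look for the literal characters instead.
            parts[i] = parts[i].replaceAll("\\\\(?=\\|)", "");
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [title1, title2, title|3]
        System.out.println(Arrays.toString(splitEscaped("title1|title2|title\\|3")));
    }
}
```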

If you use a delimiter-separated format (in the style of TSV), the chosen separator must never appear unescaped in the data.

You could simply escape your data (for example, title1|title2|title\|3) and then split on the regex (?<!\\)\| , which uses a negative lookbehind to match only a | that is not preceded by a backslash.

In Java, it gives:

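import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Class name chosen for illustration.
public class PipeSeparatedParser {

    public static void main(String[] args) {
        // prints out [title1, title2, title|3, title|4]
        System.out.println(parsePipeSeparated("title1|title2|title\\|3|title\\|4"));
    }

    // Splits on every | that is not preceded by a backslash, then unescapes \| to |.
    private static List<String> parsePipeSeparated(String input) {
        return Stream.of(input.split("(?<!\\\\)\\|"))
                     .map(escapedText -> escapedText.replace("\\|", "|"))
                     .collect(Collectors.toList());
    }
}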
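
The parsing side above assumes the data was already escaped. A sketch of the writing side, together with a matching reader, could look like the following. Everything here (PipeCodec, join, split) is a hypothetical name, not part of the answer above. It also escapes the backslash itself, which is an addition: with only | escaped, a title that ends in a backslash would produce \| right before the next delimiter and could not be told apart from an escaped pipe. Because of that, the reader scans character by character instead of using the lookbehind regex.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical codec, named for illustration only.
final class PipeCodec {
    private static final char DELIMITER = '|';
    private static final char ESCAPE = '\\';

    // Joins titles with '|', prefixing every '|' and '\' inside a title with '\'.
    static String join(List<String> titles) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < titles.size(); i++) {
            if (i > 0) {
                out.append(DELIMITER);
            }
            for (char c : titles.get(i).toCharArray()) {
                if (c == DELIMITER || c == ESCAPE) {
                    out.append(ESCAPE);
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses join: scans left to right and treats the character after '\' as literal.
    static List<String> split(String line) {
        List<String> titles = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == ESCAPE && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // take the escaped character as-is
            } else if (c == DELIMITER) {
                titles.add(current.toString());   // unescaped delimiter ends the title
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        titles.add(current.toString());
        return titles;
    }

    public static void main(String[] args) {
        List<String> titles = List.of("title1", "title2", "title|3", "ends with \\");
        String line = join(titles);
        System.out.println(line);
        System.out.println(split(line).equals(titles)); // prints true
    }
}
```

One caveat of this sketch: an empty list and a list holding a single empty title both serialize to the empty string, so that case would need separate handling if it matters.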

Use another separator, for instance "title1,title2,title|3" instead of "title1|title2|title|3", and then call split(","). This only works if the new separator can never occur in a title.
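
If you go the alternative-separator route, it may be worth failing early when a title does contain the separator, rather than writing data that cannot be split back correctly. A minimal sketch, with SeparatorCheck and joinStrict as made-up names:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, named for illustration only.
final class SeparatorCheck {

    // Joins titles with the given separator, refusing input that already contains it.
    static String joinStrict(List<String> titles, String separator) {
        for (String title : titles) {
            if (title.contains(separator)) {
                throw new IllegalArgumentException(
                        "Title contains the separator: " + title);
            }
        }
        return String.join(separator, titles);
    }

    public static void main(String[] args) {
        String line = joinStrict(List.of("title1", "title2", "title|3"), ",");
        System.out.println(line);                           // prints title1,title2,title|3
        System.out.println(Arrays.asList(line.split(","))); // prints [title1, title2, title|3]
    }
}
```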
