Kotlin String.split, ignore when delimiter is inside a quote

I have a string:

Hi there, "Bananas are, by nature, evil.", Hey there.

I want to split the string using commas as the delimiter. How do I get the .split method to ignore the commas inside the quotes, so that it returns 3 strings and not 5?

You can pass a regex to the split method.

According to this answer, the following regex matches only a , that lies outside a pair of " marks:

,(?=(?:[^"]*"[^"]*")*[^"]*$)

so try this code:

str.split(",(?=(?:[^\\\"]*\\\"[^\\\"]*\\\")*[^\\\"]*\$)".toRegex())

You can use the split overload that accepts a regular expression:

val text = """Hi there, "Bananas are, by nature, evil.", Hey there."""
val matchCommaNotInQuotes = Regex("""\,(?=([^"]*"[^"]*")*[^"]*$)""")
println(text.split(matchCommaNotInQuotes))

Would print:

[Hi there,  "Bananas are, by nature, evil.",  Hey there.]

Consider reading this answer on how the regular expression works in this case.
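
To make the lookahead less opaque, here is a minimal sketch that applies the same rule by hand: a comma counts as a delimiter only if an even number of double quotes follows it. The function name splitByEvenQuoteRule is made up for this illustration, and the sketch assumes quotes are balanced and never escaped.

```kotlin
// Hypothetical helper: reproduces the lookahead's test without a regex.
fun splitByEvenQuoteRule(text: String): List<String> {
    val parts = mutableListOf<String>()
    var start = 0
    for (i in text.indices) {
        if (text[i] != ',') continue
        // The lookahead (?=([^"]*"[^"]*")*[^"]*$) succeeds only when the rest
        // of the string contains an even number of double quotes.
        val quotesAfter = text.substring(i + 1).count { it == '"' }
        if (quotesAfter % 2 == 0) {
            parts.add(text.substring(start, i))
            start = i + 1
        }
    }
    parts.add(text.substring(start))
    return parts
}

fun main() {
    val text = """Hi there, "Bananas are, by nature, evil.", Hey there."""
    val viaRegex = text.split(Regex(""",(?=([^"]*"[^"]*")*[^"]*$)"""))
    println(splitByEvenQuoteRule(text) == viaRegex) // prints: true
}
```

Rescanning the remainder at every comma is also why this pattern can get slow on long inputs: the work grows roughly with the square of the string length.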

You have to use a regular expression capable of handling quoted values. See "Java: splitting a comma-separated string but ignoring commas in quotes" and "C#, regular expressions : how to parse comma-separated values, where some values might be quoted strings themselves containing commas".

The following code shows a very simple version of such a regular expression.

fun main(args: Array<String>) {

    "Hi there, \"Bananas are, by nature, evil.\", Hey there."
            .split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)".toRegex())
            .forEach { println("> $it") }

}

This outputs:

> Hi there
>  "Bananas are, by nature, evil."
>  Hey there.

Be aware of the regex backtracking problem (https://www.regular-expressions.info/catastrophic.html); you might be better off writing a parser.
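
As a rough idea of what such a parser could look like, the sketch below walks the string once and tracks whether it is currently inside quotes. The name splitOutsideQuotes and its parameters are invented for this example. It assumes quotes are not escaped (no \" and no special handling of doubled quotes), and an unbalanced quote makes the rest of the string count as quoted.

```kotlin
// Hypothetical helper: one left-to-right pass, so the running time is linear.
fun splitOutsideQuotes(text: String, delimiter: Char = ',', quote: Char = '"'): List<String> {
    val fields = mutableListOf<String>()
    val current = StringBuilder()
    var inQuotes = false
    for (ch in text) {
        when {
            ch == quote -> {
                inQuotes = !inQuotes
                current.append(ch) // keep the quote characters, as the regex split does
            }
            ch == delimiter && !inQuotes -> {
                // A delimiter outside quotes ends the current field.
                fields.add(current.toString())
                current.clear()
            }
            else -> current.append(ch)
        }
    }
    fields.add(current.toString()) // the last field has no trailing delimiter
    return fields
}

fun main() {
    val text = "Hi there, \"Bananas are, by nature, evil.\", Hey there."
    splitOutsideQuotes(text).forEach { println("> $it") }
}
```

For the sample string this prints the same three lines as the regex version above. Add .map { it.trim() } if the leading spaces are unwanted.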

If you don't want regular expressions:

val s = "Hi there, \"Bananas are, by nature, evil.\", Hey there."
val hold = s.substringAfter("\"").substringBefore("\"")
val temp = s.split("\"")
val splitted: MutableList<String> = (temp[0] + "\"" + temp[2]).split(",").toMutableList()
splitted[1] = "\"" + hold + "\""

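splitted is the List you want. Note that this only works when the string contains exactly one quoted section and that section is the second field.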
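
The same split-on-quotes idea can be stretched to any number of quoted sections. This is only a sketch under the assumption that quotes are balanced and never escaped, and splitWithoutRegex is a made-up name. It relies on the fact that splitting on " yields alternating segments: even indexes lie outside quotes, odd indexes inside.

```kotlin
// Hypothetical helper: generalizes the quote-splitting approach.
fun splitWithoutRegex(s: String): List<String> {
    val result = mutableListOf<String>()
    val current = StringBuilder()
    s.split('"').forEachIndexed { index, segment ->
        if (index % 2 == 1) {
            // Inside quotes: keep the text whole and restore the quote marks.
            current.append('"').append(segment).append('"')
        } else {
            // Outside quotes: every comma here ends a field.
            val pieces = segment.split(',')
            current.append(pieces.first())
            for (piece in pieces.drop(1)) {
                result.add(current.toString())
                current.clear()
                current.append(piece)
            }
        }
    }
    result.add(current.toString())
    return result
}

fun main() {
    val s = "Hi there, \"Bananas are, by nature, evil.\", Hey there."
    println(splitWithoutRegex(s).size) // prints: 3
}
```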
