
Split text using regex (Java / Kotlin) with multiple delimiters

I have a string row made of a code (2 chars) and a name separated by " >", e.g. "CP >RENATO DE SA", "CP >FRAIS".

I want to split this row into pairs of code and name.

I have this text:

CT >RUSSO CT >JOSE AQUINO CP >RENATO DE SA CP >FRAIS CF >TAMARA STUCCHI CF >VANESSA JULKOWS CM >CRISTINA LOUSTA CM >HANS KROESCHEL CM >CONCEICAO MACIE CM >AIMEE FRARI CM >JONNY MOREIRA

Desired result:

CT, RUSSO 
CT, JOSE AQUINO 
CP, RENATO DE SA 
CP, FRAIS 
CF, TAMARA STUCCHI 
CF, VANESSA JULKOWS 
CM, CRISTINA LOUSTA 
CM, HANS KROESCHEL 
CM, CONCEICAO MACIE 
CM, AIMEE FRARI 
CM, JONNY MOREIRA

You can split with this regex: ( (?=[A-Z]{2} >)| >)

fun main(args: Array<String>) {
    val input = "CT >RUSSO CT >JOSE AQUINO CP >RENATO DE SA CP >FRAIS ..."
    // Split on the space before each "XX >" code, or on the " >" after a code.
    val split = input.split("( (?=[A-Z]{2} >)| >)".toRegex())
    // The tokens alternate: code, name, code, name, ...
    for (i in split.indices step 2)
        println(split[i] + ", " + split[i + 1])
}

Outputs

CT, RUSSO
CT, JOSE AQUINO
CP, RENATO DE SA
CP, FRAIS
CF, TAMARA STUCCHI
CF, VANESSA JULKOWS
CM, CRISTINA LOUSTA
CM, HANS KROESCHEL
CM, CONCEICAO MACIE
CM, AIMEE FRARI
CM, JONNY MOREIRA

You can check the ideone demo

Regex details:

The regex ( (?=[A-Z]{2} >)| >) matches one of two things:

  • a space followed by (?=[A-Z]{2} >): the space must come right before two uppercase letters, a space and a > sign. The two letters have to stay in the result, so they sit inside a positive lookahead (?=...), which checks for them without consuming them
  • | or
  • a space followed by a > sign

You can check the regex demo here
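
For Java, the same pattern can be passed to String.split. The following is only a minimal sketch: the class name SplitPairs and the helper toPairs are made up for illustration, and the loop bound i + 1 < tokens.length is an added guard that silently drops a trailing code with no name rather than throwing.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitPairs {
    // Same pattern as in the answer: a space that precedes "XX >", or " >".
    static final String DELIMITER = "( (?=[A-Z]{2} >)| >)";

    // Hypothetical helper: returns "CODE, NAME" strings for a well-formed row.
    static List<String> toPairs(String row) {
        // Tokens alternate: code, name, code, name, ...
        String[] tokens = row.split(DELIMITER);
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            pairs.add(tokens[i] + ", " + tokens[i + 1]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String pair : toPairs("CT >RUSSO CP >RENATO DE SA CP >FRAIS")) {
            System.out.println(pair);
        }
    }
}
```

The spaces inside RENATO DE SA are not split on, because none of them is followed by two uppercase letters, a space and a > sign.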

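You can also do it without split, using plain replace:

replace(" >", ", ").replace(" ", "\n");

or (using regex)

replaceAll("\\s>", ", ").replaceAll("\\s", "\n");

Note that both variants turn every remaining space into a newline, including the one after each comma and those inside names such as RENATO DE SA, so as written the output does not match the desired result.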
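
A two-step replacement can keep names with spaces intact if the line break is inserted first, and only at a whitespace character that comes right before a code. This is a sketch rather than the answerer's method: the class name ReplacePairs and the method format are hypothetical, and the lookahead is borrowed from the split regex in the other answer.

```java
final class ReplacePairs {
    // Hypothetical helper: one "CODE, NAME" pair per line.
    static String format(String row) {
        return row
            // 1. Break the row at whitespace that precedes "XX >" (lookahead keeps the code).
            .replaceAll("\\s(?=[A-Z]{2}\\s>)", "\n")
            // 2. Turn the " >" after each code into ", ".
            .replaceAll("\\s>", ", ");
    }

    public static void main(String[] args) {
        System.out.println(format("CT >RUSSO CP >RENATO DE SA CP >FRAIS"));
    }
}
```

The order matters here: once " >" has been replaced, the lookahead in the first step would no longer find anything to match.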
