How to handle a comma in a cell in a CSV file in Spark/Scala
How do you handle commas inside the address cell when reading a CSV?
"node_id","name","address","country_codes","countries","sourceID","valid_until","note"
"14000008","","""Les Tattes""; Bursinel; Vaud; Switzerland","CHE","Switzerland","Panama Papers","Through 2015",""
"14000014","",""""Whingate"" Tower Hill Dummer, Nr Basingstoke; Hants RG25 2AL","GBR","United Kingdom","Panama Papers","Through 2015",""
"14000015","","#02-01; 14 MOHAMED SULTAN ROAD; SINGAPORE 238963","SGP","Singapore","Panama Papers","Through 2015",""
You could use something fancy like a regular expression:
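For example, a regex with a lookahead can split on commas only when they fall outside quoted fields. This is a sketch, not a full CSV parser: it keeps the surrounding quotes on each field and does not unescape doubled quotes, but it will not break a quoted address apart on its internal commas.

```scala
// Split on commas that are NOT inside double-quoted fields.
// The lookahead asserts an even number of quotes remain ahead,
// i.e. the comma is outside any quoted field.
object RegexSplit {
  val outsideQuotes = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)"

  def main(args: Array[String]): Unit = {
    val line = "\"a\",\"b,c\",\"d\""
    val fields = line.split(outsideQuotes)
    // The comma inside "b,c" is preserved; fields keep their quotes.
    fields.foreach(println)
  }
}
```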
Or you could try the split() function with a delimiter:
scala> val s = "eggs, milk, butter, Coco Puffs"
s: java.lang.String = eggs, milk, butter, Coco Puffs
scala> s.split(",") // split on every comma
res0: Array[java.lang.String] = Array(eggs, " milk", " butter", " Coco Puffs")
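Note that a plain split(",") would also break apart any quoted field that contains a comma, which is exactly the problem in the address column. A small hand-rolled parser that tracks whether it is inside quotes (and treats a doubled quote `""` as an escaped quote, as in the sample rows above) is a more robust sketch; it is not a complete RFC 4180 implementation, but it handles this data's quoting style:

```scala
// Minimal CSV line parser: respects quoted fields and the
// doubled-quote escape ("" -> "). A sketch, not a full CSV library.
object CsvLineParser {
  def parseCsvLine(line: String): Vector[String] = {
    val out = Vector.newBuilder[String]
    val cur = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c == '"') {
          // A doubled quote inside a quoted field is a literal quote.
          if (i + 1 < line.length && line.charAt(i + 1) == '"') { cur += '"'; i += 1 }
          else inQuotes = false
        } else cur += c
      } else c match {
        case '"'   => inQuotes = true
        case ','   => out += cur.result(); cur.clear()
        case other => cur += other
      }
      i += 1
    }
    out += cur.result()
    out.result()
  }

  def main(args: Array[String]): Unit = {
    parseCsvLine("\"a\",\"b,c\",\"d\"\"e\"").foreach(println)
  }
}
```

In Spark itself you would normally not parse lines by hand: the built-in CSV reader already understands quoted fields, e.g. `spark.read.option("header", "true").option("quote", "\"").option("escape", "\"").csv(path)`. Setting `escape` to `"` matters for files like this one that escape quotes by doubling them, since (if I recall the default correctly) Spark's CSV reader otherwise assumes backslash escaping.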