I'm trying to read a multi-delimiter (| and ||) CSV file using PySpark SQL, but I can't read any data: the DataFrame comes back with 0 records.
Sample data from the CSV file:
Newyork|234567|company Ltd||PIN
df = spark.read.option("sep","|").option("header","true").load(csv)
I need to read the data, is there any other way to handle this?
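Try this (Scala). Read the file as plain text first, collapse each "||" into "|", then parse the cleaned lines as CSV:
import org.apache.spark.sql.Encoders
import spark.implicits._  // supplies the String encoder that map needs
spark.read
  .option("sep", "|")
  .option("header", "true")
  .csv(spark.read.text("<path>").as(Encoders.STRING).map(_.replaceAll("\\|\\|", "|")))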
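Since the question uses PySpark, here is a rough Python sketch of the same idea. The names normalize_line and read_pipe_csv are made up for illustration, and it assumes a classic (non-Connect) SparkSession, because it relies on DataFrameReader.csv accepting an RDD of strings. One caveat: a genuinely empty field also shows up as "||", so this collapses it as well.

```python
def normalize_line(line: str) -> str:
    """Collapse every double pipe into a single pipe (hypothetical helper)."""
    return line.replace("||", "|")


def read_pipe_csv(spark, path, header=True):
    """Read a file that mixes '|' and '||' as separators (hypothetical helper).

    `spark` is an existing SparkSession and `path` is the input file.
    """
    # Load raw lines; spark.read.text yields one column named "value".
    lines = spark.read.text(path).rdd.map(lambda row: normalize_line(row.value))
    # DataFrameReader.csv also accepts an RDD of strings holding CSV rows.
    return (
        spark.read
        .option("sep", "|")
        .option("header", str(header).lower())
        .csv(lines)
    )
```

Usage would look like df = read_pipe_csv(spark, "<path>"). If the file has no header row, pass header=False, otherwise the first data line is consumed as column names.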