SparkSQL not reading multi-delimiter CSV file
I'm trying to read a multi-delimiter (|, ||) CSV file using PySpark SQL, but I'm not able to read any data; the dataframe returns 0 records.
Sample data from the CSV file:
Newyork|234567|company Ltd||PIN
df = spark.read.option("sep", "|").option("header", "true").format("csv").load(csv)
I need to read this data. Is there another way to handle it?
Try this -
spark.read
.option("sep", "|")
.option("header", "true")
.csv(spark.read.text("<path>").as(Encoders.STRING).map(_.replaceAll("\\|\\|", "|")))