
Pyspark: how to read a .csv file?

I am trying to read a .csv file that has a strange format.

This is what I am doing:

df = spark.read.format('csv').option("header", "true").option("delimiter", ',').load("muyFile.csv")
df.show(5)

[image: output of df.show(5)]

I do not understand why the lonlat entry of the third id is transposed. It seems that the file has two different delimiters. Your help would be much appreciated!

Your tag field probably contains a comma as part of its value, and that comma is being treated as a delimiter. Enclose the field values in double quotes, or in any other quote character (in that case, remember to pass that character with .option('quote', ...)), and read the data again. It should work.
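To illustrate why quoting helps, here is a minimal sketch. The original file is not shown, so the column names and values below are made up, and Python's standard csv module stands in for Spark's reader (which, by default, also uses `"` as its quote character). It only demonstrates the general effect of an unquoted comma inside a field.

```python
import csv
import io

# Hypothetical sample data: the "tag" value of the third row contains a comma.
header = ["id", "tag", "lonlat"]
rows = [
    ["1", "park", "12.5 41.9"],
    ["2", "school", "12.6 41.8"],
    ["3", "bar, restaurant", "12.7 41.7"],
]


def parse(text):
    """Parse CSV text with ',' as the delimiter and '"' as the quote character."""
    return list(csv.reader(io.StringIO(text), delimiter=",", quotechar='"'))


# Unquoted: the embedded comma splits row 3 into four fields, so every value
# after "tag" shifts one column to the right.
unquoted = "\n".join(",".join(r) for r in [header] + rows) + "\n"
unquoted_rows = parse(unquoted)

# Quoted: csv.writer wraps any field containing the delimiter in quotes,
# so the same row parses back into three fields.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows([header] + rows)
quoted = buf.getvalue()
quoted_rows = parse(quoted)

print(len(unquoted_rows[3]), unquoted_rows[3])
print(len(quoted_rows[3]), quoted_rows[3])
```

If the file is regenerated with the fields quoted like this, the read call from the question should no longer shift the columns, since the comma inside the quotes is not treated as a delimiter.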
