How to avoid printing quote characters when creating a CSV file in Python

Question

I have a CSV file that I'm creating in Azure Databricks using Python. The code takes a data frame and generates a CSV file from it. The problem is that when there is an empty value in the data frame, the output contains two double quotes, i.e. "".

Example Output

L1Code  L1 Desc1    L1 Desc2    L1 Desc3    L2Code
Beverage    Beverage    ""  ""  Drink Blends

This is the code that I'm using to generate the file, where df is a pandas DataFrame that has already been created.

from pyspark.sql import SQLContext

def createCsvFile(data, rootPath, filePath):
  # Write the data as a single tab-separated part file into a temporary folder
  (data.coalesce(1)
       .write.mode("overwrite")
       .format("com.databricks.spark.csv")
       .option("header", "true")
       .option("delimiter", "\t")
       .option("quoteMode", "NONE")
       .csv(rootPath + filePath + ".tmp"))

  fileList = dbutils.fs.ls(rootPath + filePath + ".tmp/")

  for file in fileList:
    if file.name.endswith("csv"):
      filename = file.path
      dbutils.fs.cp(filename, rootPath + filePath + ".txt")

  dbutils.fs.rm(rootPath + filePath + ".tmp", recurse=True)


sqlCtx = SQLContext(sc)
data = sqlCtx.createDataFrame(df)
createCsvFile(data, '/mnt/adlsdata/Raw/Astute/', 'products')

Answer 1

I ended up needing to use the emptyValue option to get it working:

  (data.coalesce(1)
       .write.mode("overwrite")
       .format("com.databricks.spark.csv")
       .option("header", "true")
       .option("delimiter", "\t")
       .option("quoteMode", "NONE")
       .option("quote", u'\u0000')
       .option("nullValue", "")
       .option("emptyValue", "")
       .csv(rootPath + filePath + ".tmp"))
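For readers who want to see what each option contributes, here is a minimal sketch that collects the same settings in a dictionary and passes them through DataFrameWriter.options. The names TSV_OPTIONS and write_tsv_without_quotes are hypothetical and not part of the original answer. The sketch leaves out "quoteMode" on the assumption that only the older external spark-csv package reads it. As far as I know, Spark 2.4 and later write empty strings as "" by default, and setting emptyValue to an empty string restores the earlier behavior.

```python
# Hypothetical helper names; the option keys and values come from the answer above.
# "quoteMode" is omitted on the assumption that the built-in CSV writer ignores it.
TSV_OPTIONS = {
    "header": "true",    # write the column names as the first line
    "delimiter": "\t",   # separate fields with tabs
    "quote": "\u0000",   # use the NUL character as the quote character
    "nullValue": "",     # write null values as nothing
    "emptyValue": "",    # write empty strings as nothing instead of ""
}


def write_tsv_without_quotes(data, path):
    """Write a Spark DataFrame to `path` as a single tab-separated part file."""
    (data.coalesce(1)
         .write.mode("overwrite")
         .options(**TSV_OPTIONS)
         .csv(path))
```

Inside createCsvFile from the question, this would be called as write_tsv_without_quotes(data, rootPath + filePath + ".tmp"), with the copy and cleanup steps left unchanged.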

Answer 1
0 2019-07-11 13:31:58
