
Save only the required CSV file using PySpark

I am quite new to PySpark. I am trying to read and then save a CSV file using Azure Databricks.

After saving the file I see many other files such as "_committed", "_started" and "_SUCCESS", and finally the CSV file itself with a totally different name.

I have already tried DataFrame repartition(1) and coalesce(1), but that only handles the case where Spark splits the CSV into multiple partitions; the extra files and the generated file name remain. Is there anything that can be done using PySpark?
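For reference, a write along these lines reproduces that layout (the paths are placeholders, and spark is the SparkSession Databricks provides):

# Read a CSV and write it back out; the "dbfs:/..." paths are placeholders.
df = spark.read.option("header", "true").csv("dbfs:/input/data.csv")
df.write.mode("overwrite").option("header", "true").csv("dbfs:/output/data_out")
# The output is a directory containing part-0000*.csv plus _SUCCESS,
# _committed_* and _started_* marker files.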

You can do the following:

df.toPandas().to_csv("path/to/file.csv")

This will create a single CSV file, as you expect.
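Note that toPandas() collects the entire DataFrame onto the driver, so this approach only works when the data fits in driver memory.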

Those are default log files created when PySpark saves the output; the writer itself cannot eliminate them. Using coalesce(1) you can at least save the data as a single part file instead of multiple partitions (see the sketch below).
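A minimal sketch of that approach on Databricks, assuming dbutils is available and using placeholder paths: write with coalesce(1) into a temporary directory, then copy the single part file out under the name you want and drop the directory with its marker files.

# A minimal sketch, assuming a Databricks notebook where spark and dbutils exist.
tmp_dir = "dbfs:/tmp/my_output"          # directory Spark writes into (placeholder)
final_path = "dbfs:/output/result.csv"   # the single file we actually want (placeholder)

# coalesce(1) forces one part file, but Spark still writes a directory
# containing that part file plus the _SUCCESS/_committed/_started markers.
df.coalesce(1).write.mode("overwrite").option("header", "true").csv(tmp_dir)

# Copy the single part file out under a clean name, then remove the
# temporary directory together with its marker files.
part_file = [f.path for f in dbutils.fs.ls(tmp_dir) if f.name.startswith("part-")][0]
dbutils.fs.cp(part_file, final_path)
dbutils.fs.rm(tmp_dir, recurse=True)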
