I have a dataframe with data as follows.
+---------------+-------+
|category       |marks  |
+---------------+-------+
|cricket        |1.0    |
|tennis         |1.0    |
|football       |2.0    |
+---------------+-------+
I want to write this dataframe to a CSV file whose name is generated from the current timestamp.
generatedDataFrame.write.mode ("append")
.format("com.databricks.spark.csv").option("delimiter", ";").save("./src/main/resources-"+LocalDateTime.now()+".csv")
But this code does not work. It fails with the following error:
java.io.IOException: Mkdirs failed to create file
Is there a better way to achieve this using Scala and Spark? Also, even though I am trying to create a file named with the timestamp, the code creates a directory with that name, and inside that directory a CSV file with the data is created under a random name. How can I give the CSV file itself the timestamp name instead of creating a directory?
DF.write.csv always creates a folder with the name you specify and places the output CSV files inside that folder.
If you want a single CSV file as output, named with the timestamp, you can use the code below:
import java.text.SimpleDateFormat
import java.util.Date
import org.apache.spark.sql._
import org.apache.hadoop.fs.{FileSystem, Path}
val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val outDir = "./src/main/resources/outputcsv"
// coalesce(1) puts all rows in one partition, so Spark writes a single part file
generatedDataFrame.coalesce(1).write.mode("append").csv(outDir)
// Look up the part file Spark just wrote (Spark generates its name)
val outFileName = fs.globStatus(new Path(s"$outDir/part*"))(0).getPath.getName
// Minute-resolution timestamp made of digits only, e.g. 202103011339
val timestamp = new SimpleDateFormat("yyyyMMddHHmm").format(new Date())
// Rename the part file to <timestamp>.csv inside the same directory
fs.rename(new Path(s"$outDir/$outFileName"), new Path(s"$outDir/$timestamp.csv"))
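If the goal is a file at an exact path rather than a renamed file inside Spark's output folder, the same steps can be wrapped in a small helper. This is only a sketch under stated assumptions: SingleCsvWriter, writeSingleCsv and the ".tmp" suffix are hypothetical names chosen for illustration, spark-sql is assumed to be on the classpath, and the ";" delimiter is taken from the question.

```scala
import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.{DataFrame, SparkSession}

object SingleCsvWriter {
  // Hypothetical helper: writes `df` as one CSV file located exactly at `target`.
  def writeSingleCsv(spark: SparkSession, df: DataFrame, target: String): Unit = {
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    // Spark always writes a directory, so use a temporary one next to the target
    val tmpDir = new Path(target + ".tmp")
    df.coalesce(1).write.mode("overwrite").option("delimiter", ";").csv(tmpDir.toString)

    // With coalesce(1) there should be exactly one part file in the directory
    val parts = fs.globStatus(new Path(tmpDir, "part-*"))
    require(parts != null && parts.nonEmpty, s"no part file found under $tmpDir")

    // Make sure the parent directory exists, then move the part file into place
    val targetPath = new Path(target)
    Option(targetPath.getParent).foreach(p => fs.mkdirs(p))
    if (!fs.rename(parts(0).getPath, targetPath))
      throw new IOException(s"could not rename ${parts(0).getPath} to $targetPath")

    // Remove the temporary directory together with Spark's marker files
    fs.delete(tmpDir, true)
  }
}
```

A call might look like SingleCsvWriter.writeSingleCsv(spark, generatedDataFrame, s"src/main/resources/$timestamp.csv"). Note that coalesce(1) funnels all rows through a single task, so this approach only suits small outputs.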
You should be using src/main/resources and not ./src/main/resources. You can check from the command line whether you have permission to create directories there. Also, concatenating LocalDateTime.now() directly into the path produces something like "2021-03-01T13:39:09.646". I am not sure whether that is what you want, or whether it is even valid for HDFS paths (because of characters such as ':'), so I would suggest formatting the date explicitly as well.
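As a minimal sketch of that date-formatting suggestion, java.time's DateTimeFormatter can turn a LocalDateTime into a name made only of digits and an underscore. TimestampName, csvName and the pattern yyyyMMdd_HHmmss are illustrative choices, not something from the question.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object TimestampName {
  // Pattern without ':' or '.', so the result is safe to embed in a file name
  private val fmt = DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss")

  // Hypothetical helper: builds "<timestamp>.csv" from the given date-time
  def csvName(now: LocalDateTime): String = s"${now.format(fmt)}.csv"
}

object TimestampNameDemo extends App {
  // Prints 20210301_133909.csv
  println(TimestampName.csvName(LocalDateTime.of(2021, 3, 1, 13, 39, 9)))
}
```

In the write call, TimestampName.csvName(LocalDateTime.now()) would take the place of the raw LocalDateTime.now() concatenation.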