I have created a Parquet file using Spark.
I need Parquet metadata such as the file size and the number of lines in the file. Is there a way to get this information using the Spark library or Java?
You can use the Java File API from Scala to get the size:
import java.io.File
val file = new File("some.parquet")
val fileSize = file.length  // size in bytes
This returns the size in bytes, which you can convert to whatever unit you need.
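One caveat: Spark usually writes a Parquet dataset as a directory of part files, and `File.length` is unspecified for a directory. Below is a minimal sketch that sums the sizes of everything under a path instead. It assumes the data is on the local filesystem (not HDFS or object storage); `ParquetSize` and `totalBytes` are hypothetical names chosen for illustration.

```scala
import java.io.File

object ParquetSize {
  // Returns the size in bytes of a single file, or the summed size of all
  // files below a directory. listFiles can return null (for example on an
  // I/O error), so it is wrapped in Option and treated as empty.
  def totalBytes(f: File): Long =
    if (f.isDirectory)
      Option(f.listFiles).map(_.map(totalBytes).sum).getOrElse(0L)
    else
      f.length
}
```

For example, `ParquetSize.totalBytes(new File("some.parquet"))` works whether `some.parquet` is a single file or a directory. Note that the total also includes any marker or checksum files Spark wrote alongside the part files.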
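If you want to count the records, you need to load the file into Spark and call count on the resulting DataFrame. If you instead want the number of newline-delimited lines in the raw file (which, for a binary Parquet file, is not the record count), then
// Decode as ISO-8859-1 so arbitrary binary bytes do not raise a MalformedInputException
val lineCount = io.Source.fromFile("some.parquet", "ISO-8859-1").getLines.size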
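Loading the file into Spark to count the records could look roughly like the sketch below. It assumes Spark SQL is on the classpath and uses a local session purely for illustration; `ParquetRowCount` and `countRows` are hypothetical names, and `some.parquet` is the placeholder path used above.

```scala
import org.apache.spark.sql.SparkSession

object ParquetRowCount {
  // Reads the Parquet data at `path` and returns its number of records.
  def countRows(spark: SparkSession, path: String): Long =
    spark.read.parquet(path).count()

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in an existing job, reuse its session.
    val spark = SparkSession.builder()
      .appName("parquet-row-count")
      .master("local[*]")
      .getOrCreate()

    println(s"rows: ${countRows(spark, "some.parquet")}")
    spark.stop()
  }
}
```

If you are already in a Spark shell or job, the single call `spark.read.parquet("some.parquet").count()` is enough.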