
Not able to read delta parquet files inside a container from storage account in azure databricks

There is a Spark command that writes an output DataFrame in Delta format to a container named omega, run from a Python notebook.

When I try to read a Delta file from this omega container using Spark, it throws the error below:

omega_2022_06_06_path = 'dbfs:/mnt/omega/' + 'part-00000-234567-c000.snappy.parquet'
omega_2022_06_07_path = 'dbfs:/mnt/omega/' + 'part-00000-987898-c000.snappy.parquet'

omega_06_06_DF = spark.read.format("delta").load(omega_2022_06_06_path)
omega_06_07_DF = spark.read.format("delta").load(omega_2022_06_07_path)



 AnalysisException: A partition path fragment should be the form like `part1=foo/part2=bar`. The partition path:part-00000-234567-c000.snappy.parquet

I am not sure what a partition path fragment is here. The omega container simply contains some Delta data files; there are no directories inside it.

Can someone help me resolve this issue?

If you need to read only specific files, then you need to read them using the parquet format, not delta. The delta format represents a table as a whole (all data files plus metadata), not individual files. If you need to extract specific data from a Delta table, you usually do spark.read.load on the table's root path and then use .filter to limit the scope to the necessary data.
