I am using Auto Loader in Databricks. However, when I save the stream as a Delta table, the generated table is NOT Delta.
(df.writeStream
    .format("delta")  # <-----------
    .option("checkpointLocation", checkpoint_path)
    .option("path", output_path)
    .trigger(availableNow=True)
    .toTable(table_name))
delta.DeltaTable.isDeltaTable(spark, table_name)
> false
Why is the generated table not in Delta format? If I read the table with spark.read.table(table_name) it works, but when I try to use Redash or the built-in Databricks Data tab it produces an error and the schema is not parsed correctly:
An error occurred while fetching table: table_name
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException:
org.apache.spark.sql.AnalysisException: Incompatible format detected
A transaction log for Databricks Delta was found at s3://delta/_delta_log,
but you are trying to read from s3://delta using format("parquet").
You must use 'format("delta")' when reading and writing to a delta table.
Could you try this:
(
  df  # your Auto Loader streaming DataFrame
    .writeStream
    .option("checkpointLocation", <checkpointLocation_path>)
    .trigger(availableNow=True)
    .table("<table_name>")
)
That is, use table instead of toTable. Also note that DeltaTable.isDeltaTable(spark, ...) expects a filesystem path, not a table name, so passing a table name returns false even when the underlying table is Delta.
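For context, here is a minimal end-to-end sketch of the suggested pattern. The source file format, paths, and names are placeholders (assumptions, not from the original post); in Databricks, .table() writes a managed Delta table by default:

```python
# Auto Loader source (cloudFiles); "json" and all paths are placeholder assumptions.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", schema_path)
      .load(input_path))

# Write as a managed Delta table registered in the metastore.
(df.writeStream
   .option("checkpointLocation", checkpoint_path)
   .trigger(availableNow=True)
   .table(table_name))
```

Because no explicit .option("path", ...) is given, the table is managed by the metastore, which is what Redash and the Databricks Data tab expect to find.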