
How to run a SQL query on a Delta table

I'm having trouble with the Delta Lake docs. I know that I can query a Delta table with Presto, Hive, Spark SQL, and other tools, but the Delta documentation says: "You can load a Delta table as a DataFrame by specifying a table name or a path"

From Delta Lake

But it isn't clear how to do that. How can I run a SQL query against a table like that?

Use the spark.sql() function and reference the table by its path with the delta.`<path>` syntax:

spark.sql("select * from delta.`hdfs://192.168.2.131:9000/Delta_Table/test001`").show()
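If you would rather query by name, as the quoted documentation sentence suggests, one option is to load the path as a DataFrame and register it as a temporary view. The following is only a sketch: it assumes you already have a SparkSession with Delta Lake configured, and the helper names (delta_path_query, query_by_path, query_via_temp_view), the view name, and the id column are made up for illustration.

```python
def delta_path_query(path, columns="*", where=None):
    """Build a Spark SQL string that reads a Delta table directly by path."""
    if "`" in path:
        # The path is wrapped in backticks, so a backtick inside it would break the query.
        raise ValueError("path must not contain backticks")
    sql = f"SELECT {columns} FROM delta.`{path}`"
    if where:
        sql += f" WHERE {where}"
    return sql


def query_by_path(spark, path, columns="*", where=None):
    """Run the path-based query. `spark` is assumed to be an existing SparkSession."""
    return spark.sql(delta_path_query(path, columns, where))


def query_via_temp_view(spark, path, view_name, sql):
    """Load the Delta table as a DataFrame, register it as a view, then query by name."""
    df = spark.read.format("delta").load(path)
    df.createOrReplaceTempView(view_name)
    return spark.sql(sql)


if __name__ == "__main__":
    # Only prints the generated SQL; no Spark session is needed for this part.
    # The `id` column is hypothetical.
    print(delta_path_query("hdfs://192.168.2.131:9000/Delta_Table/test001", where="id > 10"))
```

With a session available, a call could look like query_via_temp_view(spark, "hdfs://192.168.2.131:9000/Delta_Table/test001", "test001", "select * from test001").show().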

You can also read data from Delta Lake tables without Apache Spark, using the Java API or Python. See details at: https://databricks.com/blog/2020/12/22/natively-query-your-delta-lake-with-scala-java-and-python.html

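Here is how to use it with Pandas:

pip3 install deltalake   # run in a shell
python3                  # start a Python session, then enter the lines below
from deltalake import DeltaTable
table_path = "/opt/data/delta/my-table"  # local path or object-store location of your table
# Load the current version of the table into a Pandas DataFrame
df = DeltaTable(table_path).to_pandas()
df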
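
Once the table is in a Pandas DataFrame, you can still run SQL on it without Spark. One possible approach, sketched below, copies the DataFrame into an in-memory SQLite database from the Python standard library. This is an assumption on my part rather than something the deltalake package provides: the helper names and the delta_snapshot table name are invented, the SQL dialect is SQLite's rather than Spark's, and the whole table is copied into memory, so it only suits small tables.

```python
import sqlite3

import pandas as pd


def query_dataframe(df, sql, table_name="delta_snapshot"):
    """Run a SQL query against a DataFrame by copying it into in-memory SQLite."""
    con = sqlite3.connect(":memory:")
    try:
        # Write the DataFrame as a SQLite table, then read the query result back.
        df.to_sql(table_name, con, index=False)
        return pd.read_sql_query(sql, con)
    finally:
        con.close()


def query_delta_table(table_path, sql, table_name="delta_snapshot"):
    """Load a Delta table with deltalake and query it with SQLite SQL."""
    # Imported here so the helper above works even without deltalake installed.
    from deltalake import DeltaTable

    df = DeltaTable(table_path).to_pandas()
    return query_dataframe(df, sql, table_name)
```

For example, query_delta_table("/opt/data/delta/my-table", "select * from delta_snapshot limit 10") would return the first ten rows as a DataFrame.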

