How to get the latest insertion time for a Delta table?

In my Spark Structured Streaming application, I have code like this:

df = (
  spark.readStream.format("delta")
  .option("startingTimestamp", starting_time_stamp)
  .table(t)
)

If the given starting timestamp is later than the timestamp of the last insertion, I get an error. So my question is: how can I check the latest timestamp at which an insertion was committed?

We can find the latest timestamp with the following code:

ts = spark.sql("SELECT max(timestamp) FROM (DESCRIBE HISTORY <table>)").first()[0]
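
If the SQL subquery form is not accepted in your environment, the same lookup can probably be done through the Delta Lake Python API. The sketch below assumes the `delta-spark` package is installed and that `spark` and the table name come from the question's code. `latest_commit_timestamp` and `safe_starting_timestamp` are hypothetical helper names, not part of any library. Two caveats: the table history lists every commit (not only inserts), and both datetimes are assumed to be naive values in the same time zone.

```python
from datetime import datetime
from typing import Optional


def latest_commit_timestamp(spark, table_name: str) -> Optional[datetime]:
    """Return the timestamp of the most recent commit on a Delta table."""
    # Imported inside the function so this module still loads
    # on machines where delta-spark is not installed.
    from delta.tables import DeltaTable

    # history(1) limits the result to the single most recent commit.
    row = (
        DeltaTable.forName(spark, table_name)
        .history(1)
        .select("timestamp")
        .first()
    )
    return row[0] if row is not None else None


def safe_starting_timestamp(
    requested: datetime, latest: Optional[datetime]
) -> datetime:
    """Hypothetical helper: cap the requested start at the latest commit time."""
    if latest is None:
        # No commit information available; leave the request unchanged.
        return requested
    # Never start later than the last commit, which is what triggers the error.
    return min(requested, latest)
```

One way to use it is to call `safe_starting_timestamp(requested, latest_commit_timestamp(spark, t))` and pass the result, formatted as a timestamp string, to the `startingTimestamp` option. Since that option is documented as inclusive, capping at the last commit should cause that commit to be read again, so whether capping or skipping the stream is the right behavior depends on the application.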
