I have a DataFrame with two date columns. I need the difference between them, and the result should be in seconds.
UNIX_TIMESTAMP(SUBSTR(date1, 1, 19)) - UNIX_TIMESTAMP(SUBSTR(date2, 1, 19)) AS delta
I am trying to convert that Hive query into a DataFrame query in Scala:
df.select(col("date").substr(1,19)-col("poll_date").substr(1,19))
From here I cannot work out how to convert the result into seconds. Can anybody help with this? Thanks in advance.
Using the DataFrame API, you can calculate the difference in seconds by converting each column with unix_timestamp and subtracting one result from the other:
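import org.apache.spark.sql.functions.unix_timestamp
import spark.implicits._  // for toDF and $"..."; already in scope in spark-shell
val df = Seq(
  ("2018-03-05 09:00:00", "2018-03-05 09:01:30"),
  ("2018-03-06 08:30:00", "2018-03-08 15:00:15")
).toDF("date1", "date2")
// unix_timestamp parses "yyyy-MM-dd HH:mm:ss" strings into epoch seconds
df.withColumn("tsdiff", unix_timestamp($"date2") - unix_timestamp($"date1")).show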
// +-------------------+-------------------+------+
// |              date1|              date2|tsdiff|
// +-------------------+-------------------+------+
// |2018-03-05 09:00:00|2018-03-05 09:01:30|    90|
// |2018-03-06 08:30:00|2018-03-08 15:00:15|196215|
// +-------------------+-------------------+------+
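The Hive query in the question also trims each value to its first 19 characters with SUBSTR before converting it. A minimal sketch of the same idea with the DataFrame API follows, assuming the columns are strings named date and poll_date (as in the question's attempt) that may carry fractional seconds. The sample row and the local SparkSession are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, unix_timestamp}

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("tsdiff").getOrCreate()
import spark.implicits._

// Hypothetical sample row: string timestamps with fractional seconds.
val polls = Seq(
  ("2018-03-05 09:01:30.123", "2018-03-05 09:00:00.456")
).toDF("date", "poll_date")

// substr(1, 19) keeps "yyyy-MM-dd HH:mm:ss", the default format that
// unix_timestamp parses; subtracting the two results gives whole seconds.
val withDelta = polls.withColumn(
  "delta",
  unix_timestamp(col("date").substr(1, 19)) - unix_timestamp(col("poll_date").substr(1, 19))
)

withDelta.show(false)
```

Because both values are truncated before parsing, the fractional part is ignored rather than rounded, which matches what the SUBSTR calls do in the Hive version.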
You could perform the calculation in Spark SQL as well, if necessary:
df.createOrReplaceTempView("dfview")
spark.sql("""
  select date1, date2, (unix_timestamp(date2) - unix_timestamp(date1)) as tsdiff
  from dfview
""").show