
Spark Scala: split a timestamp column into a date column and a time column

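I am having trouble splitting a timestamp column into a date column and a time column. First, the time does not use the 24-hour format. Second, the date is wrong, and I don't understand why.

Here is my output: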

+----------+----------+-------------------+---------+
|      Date| Timestamp|               Time|EventTime|
+----------+----------+-------------------+---------+
|2018-00-30|1540857600|2018-10-30 00:00:00| 12:00:00|
|2018-00-30|1540857610|2018-10-30 00:00:10| 12:00:10|
|2018-00-30|1540857620|2018-10-30 00:00:20| 12:00:20|
|2018-00-30|1540857630|2018-10-30 00:00:30| 12:00:30|
|2018-00-30|1540857640|2018-10-30 00:00:40| 12:00:40|
|2018-00-30|1540857650|2018-10-30 00:00:50| 12:00:50|
|2018-01-30|1540857660|2018-10-30 00:01:00| 12:01:00|
|2018-01-30|1540857670|2018-10-30 00:01:10| 12:01:10|
|2018-01-30|1540857680|2018-10-30 00:01:20| 12:01:20|
|2018-01-30|1540857690|2018-10-30 00:01:30| 12:01:30|
|2018-01-30|1540857700|2018-10-30 00:01:40| 12:01:40|

And my code:

  val df = data_input
    .withColumn("Time", to_timestamp(from_unixtime(col("Timestamp"))))
    .withColumn("Date", date_format(col("Time"), "yyyy-mm-dd"))
    .withColumn("EventTime", date_format(col("Time"), "hh:mm:ss"))

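First I convert the Unix "Timestamp" column into a "Time" column, and then I want to split that time into a date and a time of day.

Thanks in advance.

You are using the wrong format codes. Specifically, "mm" in the date pattern stands for minutes, and "hh" is the hour on a 12-hour clock. You need "MM" (month) and "HH" (hour on a 24-hour clock) instead. Like this: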

val df = data_input
    .withColumn("Time", to_timestamp(from_unixtime(col("Timestamp"))))
    .withColumn("Date", date_format(col("Time"), "yyyy-MM-dd"))
    .withColumn("EventTime", date_format(col("Time"), "HH:mm:ss"))

For reference, the date format codes you can use are listed here: SimpleDateFormat
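The effect of each pattern letter can be checked without Spark. The sketch below uses plain `java.time`; the object name `PatternLetters` and the helper `formatEpoch` are made up for illustration, and UTC is an assumption chosen because it matches the output in the question. For the letters involved here (`M`, `m`, `H`, `h`), `DateTimeFormatter` and `SimpleDateFormat` agree on the meaning.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter
import java.util.Locale

// Hypothetical helper, not part of Spark: formats epoch seconds in UTC
// with the given pattern so the pattern letters can be compared side by side.
object PatternLetters {
  def formatEpoch(epochSeconds: Long, pattern: String): String = {
    val utcTime = Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC)
    DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(utcTime)
  }
}

// 1540857600 is the first Timestamp value from the question.
// "mm" is minute-of-hour, so it prints 00 where the month was expected.
println(PatternLetters.formatEpoch(1540857600L, "yyyy-mm-dd")) // 2018-00-30
println(PatternLetters.formatEpoch(1540857600L, "yyyy-MM-dd")) // 2018-10-30
// "hh" is the 12-hour clock, so midnight prints as 12.
println(PatternLetters.formatEpoch(1540857600L, "hh:mm:ss"))   // 12:00:00
println(PatternLetters.formatEpoch(1540857600L, "HH:mm:ss"))   // 00:00:00
```

The lowercase patterns reproduce the `2018-00-30` and `12:00:00` values from the question, and the "month" changing to `01` one minute later (Timestamp 1540857660) is the minute field ticking over.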

You can avoid the confusion by simply casting:

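import org.apache.spark.sql.functions._
import spark.implicits._  // enables the $"..." column syntax; assumes a SparkSession named `spark`

val df = data_input
    // a numeric column cast to timestamp is read as seconds since the Unix epoch
    .withColumn("Time", $"Timestamp".cast("timestamp"))
    .withColumn("Date", $"Time".cast("date"))
    // "H:m:s" prints unpadded values; use "HH:mm:ss" for zero-padded output
    .withColumn("EventTime", date_format($"Time", "H:m:s"))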

+----------+-------------------+----------+---------+
|Timestamp |               Time|      Date|EventTime|
+----------+-------------------+----------+---------+
|1540857600|2018-10-30 00:00:00|2018-10-30|    0:0:0|
|1540857610|2018-10-30 00:00:10|2018-10-30|   0:0:10|
|1540857620|2018-10-30 00:00:20|2018-10-30|   0:0:20|
|1540857630|2018-10-30 00:00:30|2018-10-30|   0:0:30|
|1540857640|2018-10-30 00:00:40|2018-10-30|   0:0:40|
|1540857650|2018-10-30 00:00:50|2018-10-30|   0:0:50|
|1540857660|2018-10-30 00:01:00|2018-10-30|    0:1:0|
|1540857670|2018-10-30 00:01:10|2018-10-30|   0:1:10|
|1540857680|2018-10-30 00:01:20|2018-10-30|   0:1:20|
|1540857690|2018-10-30 00:01:30|2018-10-30|   0:1:30|
|1540857700|2018-10-30 00:01:40|2018-10-30|   0:1:40|
+----------+-------------------+----------+---------+

