Convert String to DataFrame using Spark/scala
I want to convert a string column to a timestamp column, but it always returns null.
val t = unix_timestamp(col("tracking_time"), "MM/dd/yyyy").cast("timestamp")
val df = df2.withColumn("ts", t)
Any ideas?
Thank you.
Make sure your String column actually matches the specified format MM/dd/yyyy; if it does not match, Spark returns null. Example:
import org.apache.spark.sql.functions._ // unix_timestamp, to_timestamp, col
import spark.implicits._                // spark = the active SparkSession; enables .toDF on local collections

val df2 = Seq(("12/12/2020")).toDF("tracking_time")
val t = unix_timestamp(col("tracking_time"), "MM/dd/yyyy").cast("timestamp")
df2.withColumn("ts", t).show()
//+-------------+-------------------+
//|tracking_time| ts|
//+-------------+-------------------+
//| 12/12/2020|2020-12-12 00:00:00|
//+-------------+-------------------+
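For comparison, a value whose layout does not satisfy the pattern parses to null, which is the behaviour described above. A minimal sketch with a deliberately mismatched value ("2020-12-12" uses dashes and a leading year, so it cannot satisfy MM/dd/yyyy):

import org.apache.spark.sql.functions._
import spark.implicits._

// The separators and field order do not match the pattern, so
// unix_timestamp fails to parse and the resulting column is null.
Seq(("2020-12-12")).toDF("tracking_time")
  .withColumn("ts", unix_timestamp(col("tracking_time"), "MM/dd/yyyy").cast("timestamp"))
  .show()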
df2.withColumn("ts",unix_timestamp(col("tracking_time"),"MM/dd/yyyy").cast("timestamp")).show()
//+-------------+-------------------+
//|tracking_time| ts|
//+-------------+-------------------+
//| 12/12/2020|2020-12-12 00:00:00|
//+-------------+-------------------+
//(or) by using to_timestamp function.
df2.withColumn("ts",to_timestamp(col("tracking_time"),"MM/dd/yyyy")).show()
//+-------------+-------------------+
//|tracking_time| ts|
//+-------------+-------------------+
//| 12/12/2020|2020-12-12 00:00:00|
//+-------------+-------------------+
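If the conversion still yields nulls on your real data, the rows whose values do not match the format can be located by filtering on the parsed column. A minimal sketch, assuming your DataFrame is df2 with a tracking_time column:

import org.apache.spark.sql.functions._

// Parse with the expected pattern, then keep only the rows that
// failed to parse so the offending input values can be inspected.
val parsed = df2.withColumn("ts", to_timestamp(col("tracking_time"), "MM/dd/yyyy"))
parsed.filter(col("ts").isNull).select("tracking_time").show(false)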
As @Shu mentioned, the cause may be an invalid format in the tracking_time column. It is also worth noting that Spark only requires the pattern to match a prefix of the column value. Study these examples for better intuition:
import org.apache.spark.sql.functions._
import spark.implicits._

Seq(
"03/29/2020 00:00",
"03/29/2020",
"00:00 03/29/2020",
"03/29/2020somethingsomething"
).toDF("tracking_time")
.withColumn("ts", unix_timestamp(col("tracking_time"), "MM/dd/yyyy").cast("timestamp"))
.show()
//+--------------------+-------------------+
//| tracking_time| ts|
//+--------------------+-------------------+
//| 03/29/2020 00:00|2020-03-29 00:00:00|
//| 03/29/2020|2020-03-29 00:00:00|
//| 00:00 03/29/2020| null|
//|03/29/2020somethi...|2020-03-29 00:00:00|
//+--------------------+-------------------+
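If the date is not a prefix of the value, as in the "00:00 03/29/2020" row above, one option is to give Spark a pattern that describes the whole string. A minimal sketch, assuming the values follow an "HH:mm MM/dd/yyyy" layout:

import org.apache.spark.sql.functions._
import spark.implicits._

// With a pattern that also covers the leading time portion, the date
// can be parsed even though it is not at the start of the string.
Seq("00:00 03/29/2020").toDF("tracking_time")
  .withColumn("ts", unix_timestamp(col("tracking_time"), "HH:mm MM/dd/yyyy").cast("timestamp"))
  .show()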