pyspark convert dataframe column from timestamp to string of "YYYY-MM-DD" format
In pyspark, is there a way to convert a dataframe column of timestamp datatype to a string in 'YYYY-MM-DD' format?
You can use the date_format function as below
from pyspark.sql.functions import col, date_format

df.withColumn("dateColumn", date_format(col("vacationdate"), "yyyy-MM-dd"))
Hope this helps!
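Note that "yyyy-MM-dd" follows Spark's (Java-style) datetime pattern letters, not Python's strftime codes. A minimal plain-Python sketch of the same formatting, using an assumed example timestamp:

```python
from datetime import datetime

# Spark's "yyyy-MM-dd" pattern corresponds to "%Y-%m-%d" in Python's strftime.
ts = datetime(2015, 7, 24, 10, 30, 0)  # hypothetical example timestamp
formatted = ts.strftime("%Y-%m-%d")
print(formatted)  # 2015-07-24
```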
If you have a column with schema as

root
 |-- date: timestamp (nullable = true)

then you can use the from_unixtime function to convert the timestamp to a string, after converting the timestamp to bigint using the unix_timestamp function, as
from pyspark.sql import functions as f
df.withColumn("date", f.from_unixtime(f.unix_timestamp(df.date), "yyyy-MM-dd"))
and you should have
root
|-- date: string (nullable = true)
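Conceptually, unix_timestamp parses the value into epoch seconds (a bigint) and from_unixtime formats those seconds back into a string. A rough stdlib sketch of that round trip, working in UTC with an assumed example value (Spark itself uses the session time zone):

```python
import calendar
import time

# unix_timestamp step: timestamp string -> epoch seconds (bigint in Spark)
epoch = calendar.timegm(time.strptime("2015-07-24 10:30:00", "%Y-%m-%d %H:%M:%S"))

# from_unixtime step with "yyyy-MM-dd": epoch seconds -> formatted string
date_str = time.strftime("%Y-%m-%d", time.gmtime(epoch))
print(date_str)  # 2015-07-24
```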
One other option to try out would be
from pyspark.sql import functions as F

df = df.withColumn('new_time_column', F.to_timestamp(df['Time_column'], 'yyyy-MM-dd'))

(Note that to_timestamp parses a string into a timestamp type; to get a string back out you would still apply date_format.)
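to_timestamp goes in the other direction: it parses a string column into a timestamp type. In plain-Python terms, with an assumed example string:

```python
from datetime import datetime

# to_timestamp('...', 'yyyy-MM-dd') parses a string into a timestamp;
# the Python analogue of that pattern is strptime with "%Y-%m-%d".
parsed = datetime.strptime("2015-07-24", "%Y-%m-%d")
print(parsed)  # 2015-07-24 00:00:00
```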
from pyspark.sql.functions import date_format
df.withColumn("DateOnly", date_format('DateTime', "yyyy-MM-dd")).show()