
Get weekday name from date in PySpark (Python)

I use this code to return the day name from a date given as a string:

import pandas as pd
ts = pd.Timestamp("2019-04-10")
print(ts.weekday_name)  # pandas < 1.0 only; on newer versions use ts.day_name()

So for "2019-04-10" the code returns "Wednesday".

I would like to apply it to a column in a PySpark DataFrame to get the day name as text, but it doesn't seem to work.

+-------------+
|Reported Date|
+-------------+
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
+-------------+

I tried to do this:

sparkDF.withColumn("day",weekday_name(pd.Timestamp('Reported Date')))

But I get an error message: NameError: name 'weekday_name' is not defined

Can anyone help me with this? Thanks.

The PySpark documentation is a bit unclear on this topic, but date_format internally uses Java date format patterns.

For example, given a DataFrame with a date column like this:

df.show()
+----------+
|      date|
+----------+
|2010-01-07|
+----------+

df.printSchema()
root
 |-- date: date (nullable = true)

To get the short weekday name, use the pattern E, EE, or EEE. For the full name, use four or more Es, such as EEEE.

Short form:

import pyspark.sql.functions as f

df.withColumn('Day', f.date_format('date', 'E')).show()
+----------+---+
|      date|Day|
+----------+---+
|2010-01-07|Thu|
+----------+---+

Full form:

df.withColumn('Day', f.date_format('date', 'EEEE')).show()
+----------+--------+
|      date|     Day|
+----------+--------+
|2010-01-07|Thursday|
+----------+--------+
