
pyspark getting weeknumber of month

I am stuck on getting the week number of the month in PySpark from a dataframe column. For example, consider my dataframe as:

WeekID,DateField,WeekNUM
1,01/JAN/2017
2,15/Feb/2017

My Output should be as below

WeekID,DateField,MOF
1,01/JAN/2017,1
2,15/FEB/2017,2

I tried strftime and other date functions but was unable to get it to work.

Please help me in resolving the issue.

You can combine to_date and date_format:

from pyspark.sql.functions import to_date, date_format

df = spark.createDataFrame(
    [(1, "01/JAN/2017"), (2, "15/FEB/2017")], ("id", "date"))

df.withColumn("week", date_format(to_date("date", "dd/MMM/yyyy"), "W")).show()
+---+-----------+----+
| id|       date|week|
+---+-----------+----+
|  1|01/JAN/2017|   1|
|  2|15/FEB/2017|   3|
+---+-----------+----+
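The "W" pattern is the JVM's calendar week of the month: the partial week containing the 1st counts as week 1, and week boundaries are Sunday-aligned (assuming the default locale settings). A minimal pure-Python sketch of the equivalent computation, useful for spot-checking the values above:

```python
from datetime import date

def week_of_month(d: date) -> int:
    # Sunday-based weekday index of the first day of the month
    # (Python's weekday() is Monday=0, so shift by one)
    first_offset = (date(d.year, d.month, 1).weekday() + 1) % 7
    # Calendar week within the month; the partial first week is week 1
    return (first_offset + d.day - 1) // 7 + 1

print(week_of_month(date(2017, 1, 1)))   # 1, matches the Spark output
print(week_of_month(date(2017, 2, 15)))  # 3
```

Note this sketch assumes Sunday as the first day of the week; in a JVM locale where weeks start on Monday, "W" can give different values.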

If you want week-of-year instead, replace the format with w:

date_format(to_date("date", "dd/MMM/yyyy"), "w")

Since Spark 3.0 the w flag has been deprecated, so you can simply use the built-in PySpark function weekofyear instead:

import pyspark.sql.functions as funcs

# df here has a 'date_announced' timestamp column
(df
 .withColumn('week_of_year', funcs.weekofyear(funcs.col('date_announced')))
 .select('date_announced', 'week_of_year')
 .show(5))

+-------------------+------------+
|     date_announced|week_of_year|
+-------------------+------------+
|2020-01-30 00:00:00|           5|
|2020-02-02 00:00:00|           5|
|2020-02-03 00:00:00|           6|
|2020-03-02 00:00:00|          10|
|2020-03-02 00:00:00|          10|
+-------------------+------------+
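Spark's weekofyear follows ISO-8601 week numbering, which Python's datetime.isocalendar() also implements, so the sample output above can be checked without a Spark session:

```python
from datetime import date

# ISO-8601 week numbers for the sample dates in the table above
for d in [date(2020, 1, 30), date(2020, 2, 2),
          date(2020, 2, 3), date(2020, 3, 2)]:
    print(d, d.isocalendar()[1])  # 5, 5, 6, 10
```

Under ISO-8601, week 1 is the week containing the year's first Thursday and weeks run Monday to Sunday, which is why 2020-02-02 (a Sunday) is still week 5 while 2020-02-03 (a Monday) starts week 6.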
