
How to change a string value in a Spark DataFrame

I am running the following query in Spark SQL:

campaign_df = spark.sql('''select CAMPAIGN_ID,CAMPAIGN_NAME,TAGS,
                                CAMPAIGN_CREATED_DATE,
                                UPDATED_DATE,
                                FIRST_SENT,LAST_SENT,
                                SCHEDULE_TYPE,CHANNELS,ARCHIVED 
                                from pipeline.campaign_details_raw''')

The date values in columns such as CAMPAIGN_CREATED_DATE and UPDATED_DATE come back in the format '2015-01-03T17:00:07+00:00', while FIRST_SENT uses the format '2014-10-26T16:00:00Z'. I want a single format across the DataFrame, '2014-10-18T17:00:00.000+0000', for the columns mentioned above.

campaign_df.head(2)
[Row(CAMPAIGN_ID='e4b32e76-8707-4406-8c16-c31410239660', CAMPAIGN_NAME='10/18 Push: $10 off $10', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-17T15:11:59+00:00', UPDATED_DATE='2014-10-18T17:00:12+00:00', FIRST_SENT='2014-10-18T17:00:00Z', LAST_SENT='2014-10-18T17:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False'),
 Row(CAMPAIGN_ID='ed06f75e-6e3b-422d-8226-6d279f2be3bf', CAMPAIGN_NAME='10/26 - 40% off Everything - EARLY40', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-24T15:53:06+00:00', UPDATED_DATE='2014-10-26T16:30:00+00:00', FIRST_SENT='2014-10-26T16:00:00Z', LAST_SENT='2014-10-26T16:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False')]
campaign_df
campaign_df:pyspark.sql.dataframe.DataFrame
CAMPAIGN_ID:string
CAMPAIGN_NAME:string
TAGS:string
CAMPAIGN_CREATED_DATE:string
UPDATED_DATE:string
FIRST_SENT:string
LAST_SENT:string
SCHEDULE_TYPE:string
CHANNELS:string
ARCHIVED:string

Thanks in Advance!

Cast the date columns to timestamp, so that both input formats end up as the same timestamp type:

campaign_df = spark.sql('''select CAMPAIGN_ID,CAMPAIGN_NAME,TAGS,
                                cast(CAMPAIGN_CREATED_DATE as timestamp),
                                cast(UPDATED_DATE as timestamp),
                                cast(FIRST_SENT as timestamp),
                                cast(LAST_SENT as timestamp),
                                SCHEDULE_TYPE,CHANNELS,ARCHIVED 
                                from pipeline.campaign_details_raw''')
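
Note that the cast above yields timestamp columns rather than strings. If the literal string layout '2014-10-18T17:00:00.000+0000' is what is needed, one possible follow-up is to wrap each cast in Spark SQL's date_format. The sketch below only builds the query text; the helper names unified_ts_expr and build_query are made up for illustration, and it assumes the session time zone is set to UTC (otherwise the rendered offset follows the session time zone) and that double-quoted string literals are accepted, which is the Spark SQL default.

```python
# Spark datetime pattern matching the target '2014-10-18T17:00:00.000+0000'.
UNIFIED_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSSZ"

# Column names taken from the question's query.
LEADING_COLUMNS = ["CAMPAIGN_ID", "CAMPAIGN_NAME", "TAGS"]
DATE_COLUMNS = ["CAMPAIGN_CREATED_DATE", "UPDATED_DATE", "FIRST_SENT", "LAST_SENT"]
TRAILING_COLUMNS = ["SCHEDULE_TYPE", "CHANNELS", "ARCHIVED"]


def unified_ts_expr(column):
    """Hypothetical helper: build a SQL expression that casts a string column
    to timestamp and formats it back to a string using UNIFIED_PATTERN."""
    # The pattern is wrapped in double quotes because it contains single quotes.
    return 'date_format(cast({c} as timestamp), "{p}") as {c}'.format(
        c=column, p=UNIFIED_PATTERN
    )


def build_query(table="pipeline.campaign_details_raw"):
    """Hypothetical helper: assemble the full select statement as a string."""
    select_list = (
        LEADING_COLUMNS
        + [unified_ts_expr(c) for c in DATE_COLUMNS]
        + TRAILING_COLUMNS
    )
    return "select " + ",\n       ".join(select_list) + "\nfrom " + table


# Usage in a Spark session (not executed here; assumes an existing `spark`):
#   spark.conf.set("spark.sql.session.timeZone", "UTC")  # so the offset renders as +0000
#   campaign_df = spark.sql(build_query())
```

With this variant the four date columns stay strings, but all in the same layout; keep the plain cast instead if later steps need real timestamp values.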
