Merge separate columns of MM, DD, YYYY into a single column of YYYY-MM-DD using Python 3.7

I have separate columns of DD, MM, YYYY. The data is in a DataFrame df with separate Day, Month, and Year columns in int64 format.

How do I merge them to create a YYYY-MM-DD format column in Python?

Given a test_df like the one below, you can pass the values of each column as arguments to dt.datetime or dt.date, depending on the data type you want:

import pandas as pd
import datetime as dt

test_df = pd.DataFrame(data={'years': [2019, 2018, 2018],
                             'months': [10, 9, 10],
                             'day': [20, 20, 20]})
# Build one datetime per row from the year, month, and day columns.
test_df['full_date'] = [dt.datetime(year, month, day)
                        for year, month, day
                        in zip(test_df['years'], test_df['months'], test_df['day'])]
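Since the question asks for a YYYY-MM-DD column, here is a minimal sketch of formatting the resulting datetime column as strings with the `.dt.strftime` accessor. The column name `date_str` is made up for illustration and is not part of the original answer.

```python
import datetime as dt

import pandas as pd

test_df = pd.DataFrame({'years': [2019, 2018, 2018],
                        'months': [10, 9, 10],
                        'day': [20, 20, 20]})

# Same list comprehension as in the answer: one datetime per row.
test_df['full_date'] = [dt.datetime(y, m, d)
                        for y, m, d
                        in zip(test_df['years'], test_df['months'], test_df['day'])]

# 'date_str' is a hypothetical column name; strftime yields plain strings.
test_df['date_str'] = test_df['full_date'].dt.strftime('%Y-%m-%d')
print(test_df['date_str'].tolist())
```

If a real datetime column is what you need, `full_date` can be used as is; the string column only matters when the exact YYYY-MM-DD text is required.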

Using pure string manipulation, if you want the final result to be a string:

# Sample data.
df = pd.DataFrame({'Year': [2018, 2019], 'Month': [12, 1], 'Day': [25, 10]})

# Solution.
>>> df.assign(
        date=df.Year.astype(str) 
             + '-' + df.Month.astype(str).str.zfill(2) 
             + '-' + df.Day.astype(str).str.zfill(2)
    )
   Year  Month  Day        date
0  2018     12   25  2018-12-25
1  2019      1   10  2019-01-10

If you prefer Timestamps instead of strings, then you can easily convert them via:

df['date'] = pd.to_datetime(df['date'])
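One thing string concatenation does not do is check that the date exists. As a rough sketch, assuming you would rather get a missing value than an exception for impossible dates, `pd.to_datetime` can be called with an explicit `format` and `errors='coerce'`. The sample data here (including the invalid 30 February) is invented for illustration.

```python
import pandas as pd

# Hypothetical data: the second row (2019-02-30) is not a real date.
df = pd.DataFrame({'Year': [2018, 2019], 'Month': [12, 2], 'Day': [25, 30]})

date_str = (df['Year'].astype(str)
            + '-' + df['Month'].astype(str).str.zfill(2)
            + '-' + df['Day'].astype(str).str.zfill(2))

# errors='coerce' turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(date_str, format='%Y-%m-%d', errors='coerce')
print(parsed)
```

Rows that come back as NaT can then be inspected with `parsed.isna()`.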

You can use the to_datetime method:

date_data_set = [{"day":1, "month":1, "year":2020}, {"day":2, "month":3, "year":2019}]

date_data_set
Out[40]: [{'day': 1, 'month': 1, 'year': 2020}, {'day': 2, 'month': 3, 'year': 2019}]

df = pd.DataFrame(date_data_set)

df
Out[42]: 
   day  month  year
0    1      1  2020
1    2      3  2019

df['date_data'] = pd.to_datetime(df['day'].astype("str")+"/"+df['month'].astype("str")+"/"+df["year"].astype("str"), format = "%d/%m/%Y")

df
Out[44]: 
   day  month  year  date_data
0    1      1  2020 2020-01-01
1    2      3  2019 2019-03-02

df.dtypes
Out[52]: 
day                   int64
month                 int64
year                  int64
date_data    datetime64[ns]
dtype: object
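As an alternative sketch that skips the string step, `pd.to_datetime` can also assemble dates directly from a DataFrame whose columns are named `year`, `month`, and `day`. The rename below assumes the question's `Year`/`Month`/`Day` column names; `parts` and `date_str` are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({'Year': [2018, 2019], 'Month': [12, 1], 'Day': [25, 10]})

# Rename to the lowercase names that to_datetime documents for assembly.
parts = df[['Year', 'Month', 'Day']].rename(
    columns={'Year': 'year', 'Month': 'month', 'Day': 'day'})

df['date'] = pd.to_datetime(parts)                    # datetime column
df['date_str'] = df['date'].dt.strftime('%Y-%m-%d')   # YYYY-MM-DD strings
print(df)
```

This avoids building and re-parsing strings, which is usually easier to read when the three parts are already integer columns.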

Use to_datetime with the format parameter.

Using @emiljoj's setup:

test_df = pd.DataFrame(data={'years':[2019, 2018, 2018],
                                'months':[10, 9, 10],
                                'day': [20, 20, 20]})

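# Zero-pad month and day so the concatenated string is always 8 digits
# and matches '%Y%m%d' unambiguously.
test_df['date'] = pd.to_datetime(test_df['years'].astype('str') +
                                 test_df['months'].astype('str').str.zfill(2) +
                                 test_df['day'].astype('str').str.zfill(2),
                                 format='%Y%m%d')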

Output:

   years  months  day       date
0   2019      10   20 2019-10-20
1   2018       9   20 2018-09-20
2   2018      10   20 2018-10-20
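A caution when concatenating without separators: single-digit months and days can make two different dates collapse into the same string. The sketch below uses invented sample rows (11 January and 1 November 2018) to show the collision and how zero-padding with `str.zfill(2)` avoids it.

```python
import pandas as pd

# Hypothetical rows: 2018-01-11 and 2018-11-01.
df = pd.DataFrame({'years': [2018, 2018], 'months': [1, 11], 'day': [11, 1]})

# Without padding, both rows produce the same 7-character string.
unpadded = (df['years'].astype(str)
            + df['months'].astype(str)
            + df['day'].astype(str))

# With padding, each row gets a distinct 8-character string.
padded = (df['years'].astype(str)
          + df['months'].astype(str).str.zfill(2)
          + df['day'].astype(str).str.zfill(2))

dates = pd.to_datetime(padded, format='%Y%m%d')
print(unpadded.tolist(), padded.tolist())
```

Using a separator such as '-' between the parts, together with a matching format, is another way to sidestep the ambiguity.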

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
