Convert string to date format in python pandas dataframe

I have a data set with the following date format in a pandas DataFrame:

warnings = pd.read_csv('output.csv', sep=',')
warnings['from']

7      Di, 15. Aug, 21:52 Uhr
8      Di, 15. Aug, 22:46 Uhr
9      Di, 15. Aug, 22:46 Uhr
10     Di, 15. Aug, 21:52 Uhr
11     Di, 15. Aug, 22:46 Uhr
12     Di, 15. Aug, 21:52 Uhr
13     Di, 15. Aug, 22:46 Uhr
14     Di, 15. Aug, 21:52 Uhr
15     Di, 15. Aug, 22:46 Uhr

My question: how can I convert this to a readable date format in pandas? I want to compare today's date against the dates in my data set.

I would like to have, e.g.,

15.08.2017, 22:46:00

or a more convenient format. Then I want to compare the current date against the dates in my data set.

How can I do this within a pandas DataFrame?

Thanks for any help.

I think you need to_datetime. First strip the first 4 and last 4 characters by indexing with str, then prepend the year 2017 with radd:

df['new'] = pd.to_datetime(df['from'].str[4:-4].radd('2017-'), format='%Y-%d. %b, %H:%M')
print (df)
                     from                 new
0  Di, 15. Aug, 21:52 Uhr 2017-08-15 21:52:00
1  Di, 15. Aug, 22:46 Uhr 2017-08-15 22:46:00
2  Di, 15. Aug, 22:46 Uhr 2017-08-15 22:46:00
3  Di, 15. Aug, 21:52 Uhr 2017-08-15 21:52:00
4  Di, 15. Aug, 22:46 Uhr 2017-08-15 22:46:00
5  Di, 15. Aug, 21:52 Uhr 2017-08-15 21:52:00
6  Di, 15. Aug, 22:46 Uhr 2017-08-15 22:46:00
7  Di, 15. Aug, 21:52 Uhr 2017-08-15 21:52:00
8  Di, 15. Aug, 22:46 Uhr 2017-08-15 22:46:00
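
One caveat: %b matches month abbreviations in the current locale, so German abbreviations such as "Okt" or "Dez" may fail to parse on an English-locale system. Below is a minimal sketch that sidesteps the locale by mapping month names by hand. The GERMAN_MONTHS table, the parse_warning_dates helper and its year default are hypothetical names, and the abbreviations themselves are an assumption, since the data shown only contains "Aug".

```python
import pandas as pd

# Hypothetical lookup table. The exact abbreviations used by the data
# source are an assumption; only "Aug" appears in the sample.
GERMAN_MONTHS = {
    "Jan": 1, "Feb": 2, "Mär": 3, "Apr": 4, "Mai": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Okt": 10, "Nov": 11, "Dez": 12,
}


def parse_warning_dates(series, year=2017):
    """Parse strings like 'Di, 15. Aug, 21:52 Uhr' into datetimes."""
    # Pull day, month name, hour and minute out of each string.
    parts = series.str.extract(
        r"(?P<day>\d{1,2})\.\s+(?P<month>\w+),\s+(?P<hour>\d{1,2}):(?P<minute>\d{2})"
    )
    # pd.to_datetime can assemble datetimes from year/month/day/... columns.
    frame = pd.DataFrame({
        "year": year,  # the strings carry no year, so it must be supplied
        "month": parts["month"].map(GERMAN_MONTHS),
        "day": parts["day"].astype(int),
        "hour": parts["hour"].astype(int),
        "minute": parts["minute"].astype(int),
    })
    return pd.to_datetime(frame)
```

To get the display format asked for in the question, the result could then be formatted with .dt.strftime('%d.%m.%Y, %H:%M:%S').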

Finally, to compare with today's date, use boolean indexing; .dt.date converts the pandas datetimes to Python dates:

# pd.datetime was removed in pandas 2.0; pd.Timestamp works in old and new versions
today_date = pd.Timestamp.today().date()

df1 = df[df['new'].dt.date == today_date]
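
As an alternative sketch, the same filter can stay in pandas datetimes by truncating both sides to midnight with normalize. The rows_for_day helper and its day parameter are hypothetical; they only make the comparison day explicit, which can be handy for testing.

```python
import pandas as pd


def rows_for_day(df, column, day=None):
    """Return the rows whose datetime in `column` falls on `day` (default: today)."""
    if day is None:
        day = pd.Timestamp.today()
    # normalize() sets the time part to 00:00:00, leaving only the date.
    day = pd.Timestamp(day).normalize()
    return df[df[column].dt.normalize() == day]
```

For example, rows_for_day(df, 'new') would select today's warnings, and rows_for_day(df, 'new', '2017-08-15') those of a fixed day.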

Here's my attempt; I think it should work, although I'm not sure how you want to check whether it's the current date.

The first part tidies things up slightly and converts the string in every row to a datetime object.

The second part performs the check and produces a column that is True or False for each row, based on your system clock. This was done with Python 3.5.2.

import string
import pandas as pd
import datetime

# Converts each string into a datetime object
def convert_date(row):
    # Drop the weekday prefix ("Di, ") and the " Uhr" suffix
    trim_date = row[4:-4]
    # Strip punctuation: "15. Aug, 21:52" -> "15 Aug 2152"
    remove_punc = trim_date.translate(str.maketrans('', '', string.punctuation))
    return datetime.datetime.strptime('2017 ' + remove_punc, '%Y %d %b %H%M')

df['datetime_convert'] = df['from'].apply(convert_date)

# Creates a column that checks whether each value matches the current time
# on your system, to the minute
def check_is_now(row):
    return str(row) == datetime.datetime.today().strftime('%Y-%m-%d %H:%M:00')


df['is_now'] = df['datetime_convert'].apply(check_is_now)
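
The per-row check can also be sketched without apply, by flooring both sides to the minute and comparing whole columns at once. The is_current_minute helper and its now parameter are hypothetical; now is only there so the reference time can be fixed instead of read from the system clock.

```python
import pandas as pd


def is_current_minute(series, now=None):
    """Return a boolean Series: True where the datetime falls in the same minute as `now`."""
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    # floor("min") drops seconds and smaller units on both sides.
    return series.dt.floor("min") == now.floor("min")
```

With the column from above, this would be used as df['is_now'] = is_current_minute(df['datetime_convert']).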
