简体   繁体   中英

Convert string date to a different format in pandas dataframe

I have been looking for this answer in the community so far, could not have.

I have a dataframe in python 3.5.1 that contains a column with dates in string imported from a CSV file.

The dataframe looks like this

                  TimeStamp  TBD  TBD     Value  TBD
0       2016/06/08 17:19:53  NaN  NaN  0.062942  NaN
1       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN
2       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN

what I need is to change the TimeStamp column format to be %m/%d/%y %H:%M:%D

                  TimeStamp  TBD  TBD     Value  TBD
0       06/08/2016 17:19:53  NaN  NaN  0.062942  NaN

So far I have found some solutions that works but for string and not for series

Any help would be appreciated

Thanks

If you convert the column of strings to a time series, you could use the dt.strftime method :

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields

   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942

Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4')
print(df)

since

In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4')
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

This uses regex to rearrange pieces of the string without first parsing the string as a date . This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.

For most common date and datetime formats, pandas .to_datetime function can parse them without we providing format. For example:

df.TimeStamp.apply(lambda x: pd.to_datetime(x))

And in the example given from the question,

df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')

will give us the same result.

Using .apply will be efficient if you have multiple columns.

Of course, providing the parsing format is necessary for many situations. For a full list of formats, please see https://docs.python.org/3/library/datetime.html .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM