Converting timestamp columns in a custom string format to milliseconds in Python

I have a pandas DataFrame with multiple columns containing timestamp values in string format. The values look like '0D_0H_1M_35S_0MS'. I would like to convert all the values in these columns to milliseconds.


First, use Series.replace to rewrite each string into a format that to_timedelta can parse, then convert to seconds with Series.dt.total_seconds and multiply by 1000 to get milliseconds. Wrap these steps in a custom function and apply it to each column with DataFrame.apply:

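import pandas as pd

df = pd.DataFrame({'date':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS'],
                   'date1':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS']})

cols = ['date','date1']

def parse(x):
    # '0D_0H_1M_35S_0MS' -> '0 Days 0:1:35.0', which to_timedelta can parse
    s = x.replace({'D_':' Days ','H_':':','M_':':','S_':'.','MS':''}, regex=True)
    # seconds -> milliseconds; round before casting to avoid float truncation
    return pd.to_timedelta(s).dt.total_seconds().mul(1000).round().astype(int)

df[cols] = df[cols].apply(parse)
print(df)
     date   date1
0   95000   95000
1  155000  155000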
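
One caveat with the replacement approach: the MS field ends up after the decimal point, so it is read as a fraction of a second. A value such as '5MS' would count as 500 ms unless the field is zero-padded to three digits (a value of 0 is unaffected). Below is a minimal sketch of a workaround, assuming every string ends with the MS field and that field is between 0 and 999. The names parse_padded and sample are hypothetical and not part of the original answer.

```python
import pandas as pd

def parse_padded(x):
    # Hypothetical variant of parse(): zero-pad the trailing MS field to three
    # digits so that '5MS' becomes '.005' (5 ms) rather than '.5' (500 ms).
    s = x.str.replace(r'(\d+)MS$', lambda m: m.group(1).zfill(3), regex=True)
    # Same rewrite as before; the 'MS' suffix was already removed above.
    s = s.replace({'D_': ' Days ', 'H_': ':', 'M_': ':', 'S_': '.'}, regex=True)
    # seconds -> milliseconds, rounded before the integer cast
    return pd.to_timedelta(s).dt.total_seconds().mul(1000).round().astype(int)

sample = pd.Series(['0D_0H_1M_35S_5MS', '0D_0H_2M_35S_250MS'])
result = parse_padded(sample)
print(result.tolist())
```

Padding before the conversion keeps the rest of the to_timedelta pipeline unchanged.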

If the format never changes, you can instead extract the digits with Series.str.extractall, reshape them with Series.unstack, convert them to integers, multiply by a Series of unit sizes in milliseconds, and finally sum across each row.

This solution is inspired by a deleted answer.

import pandas as pd

df = pd.DataFrame({'date':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS'],
                   'date1':['0D_0H_1M_35S_0MS','0D_0H_2M_35S_0MS']})

cols = ['date','date1']

def parse(x):
    # milliseconds per day, hour, minute, second and millisecond
    a = pd.Series([86400000, 3600000, 60000, 1000, 1])
    # one column per extracted number, scaled by its unit and summed per row
    return x.str.extractall(r'(\d+)')[0].unstack().astype(int).mul(a).sum(axis=1)

df[cols] = df[cols].apply(parse)

print(df)
     date   date1
0   95000   95000
1  155000  155000
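
If some values might not follow the expected layout, the same unit arithmetic can be done per value with a stricter pattern that raises an error on malformed input. This is only a sketch in plain Python. PATTERN, UNITS_MS and to_milliseconds are hypothetical names introduced for illustration.

```python
import re

# Hypothetical helper, not part of the original answer.
# One capture group per field: days, hours, minutes, seconds, milliseconds.
PATTERN = re.compile(r'(\d+)D_(\d+)H_(\d+)M_(\d+)S_(\d+)MS')
# Milliseconds per day, hour, minute, second and millisecond.
UNITS_MS = (86_400_000, 3_600_000, 60_000, 1_000, 1)

def to_milliseconds(value):
    """Convert a string like '0D_0H_1M_35S_0MS' to integer milliseconds."""
    match = PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected duration format: {value!r}")
    # Integer arithmetic only, so no float rounding is involved.
    return sum(int(group) * unit for group, unit in zip(match.groups(), UNITS_MS))

print(to_milliseconds('0D_0H_1M_35S_0MS'))
```

It could be applied to the DataFrame with something like df[cols] = df[cols].apply(lambda col: col.map(to_milliseconds)). This will likely be slower than the vectorized version on large frames, but it fails loudly instead of producing a wrong number.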


 