
Converting separate hour/min/sec columns into a single time column with pandas?

I'm trying to create a single time column so that I can make a time-series plot by resampling on the date/time index. However, I'm having trouble combining the columns into a single column and/or setting it as the index. Below is my code and what I've tried. Any suggestions would be appreciated!

colnames=['time_ms','power','chisq','stations','alt','hour','min','sec','time_frac','lat','lon']
df = pd.read_csv('/data/selected_lma_matlab_20210914.txt',delim_whitespace=True, header=None, names=colnames)
#df = pd.read_csv('/data/selected_lma_matlab_20210914.txt',delim_whitespace=True, header=None,names=colnames,parse_dates=[[5, 7]], index_col=0)
#df = pd.read_csv('/data/selected_lma_matlab_20210914.txt',delim_whitespace=True, header=None,names=colnames,infer_datetime_format=True,parse_dates=[[5, 6]], index_col=0)

I also tried this method to add the date as well, which I don't think is necessary but would be nice for consistency. However, I wasn't able to get it to work.

s = df['hour'].mul(10000) + df['min'].mul(100) + df['sec']
df['date'] = pd.to_datetime('2021-09-14 ' + s.astype(int), format='%Y-%m-%d %H%M%S.%f')

This method did create a new column, but I had trouble setting it as the index.

df['time'] = (pd.to_datetime(df['hour'].astype(str) + ':' + df['min'].astype(str), format='%H:%M')
         .dt.time)
df['Datetime'] = pd.to_datetime(df['time'])
df.set_index('Datetime')

I create this column to get counts for the time series:

df['tot'] = 1 

I then use this to resample the data needed for the time series into a new df:

df2 = df[['tot']].resample('5min').sum() 

However, I keep getting datetime/index errors despite everything I've tried above.

Link to data: https://drive.google.com/file/d/16GmXfQNMK81aAbB6C-W_Bjm2mcOVrILP/view?usp=sharing

You should keep the data from the separate columns as strings, concatenate them, and then convert the result to datetime. The updated code below does this:

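import pandas as pd

colnames = ['time_ms','power','chisq','stations','alt','hour','min','sec','time_frac','lat','lon']
# sep=r'\s+' splits on runs of whitespace; it has the same effect as delim_whitespace=True, which recent pandas releases deprecate
df = pd.read_csv('selected_lma_matlab_20210914.txt', sep=r'\s+', header=None, names=colnames)
# Build one 'YYYY-MM-DD H:M:S' string per row, then parse it into a datetime column
df['date'] = '2021-09-14 ' + df['hour'].astype('string') + ":" + df['min'].astype('string') + ":" + df['sec'].astype('string')
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
# Make the timestamps the index so that time-based operations such as resample() work
df.set_index('date', inplace=True)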
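As a side note, the same index can probably be built numerically, without going through strings. The sketch below is only an illustration on a tiny made-up frame, not the real file, and it assumes that time_frac holds the fractional part of the second, which the post does not confirm. It converts each column to a Timedelta and adds the sum to the date.

```python
import pandas as pd

# Made-up rows standing in for the hour/min/sec/time_frac columns of the real file.
df = pd.DataFrame({
    "hour": [0, 0, 1],
    "min": [1, 7, 30],
    "sec": [5, 59, 0],
    "time_frac": [0.25, 0.5, 0.0],  # assumption: fraction of a second
})

# Offset from midnight, computed column by column.
offset = (
    pd.to_timedelta(df["hour"], unit="h")
    + pd.to_timedelta(df["min"], unit="min")
    + pd.to_timedelta(df["sec"] + df["time_frac"], unit="s")
)

# Adding the offsets to a single Timestamp yields a datetime64 column.
df["date"] = pd.Timestamp("2021-09-14") + offset
df = df.set_index("date")
print(df.index)
```

This keeps sub-second precision, if the assumption about time_frac holds, and avoids any dependence on how the strings are padded.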

After this, you can make whatever plots you need. I tried these and they appear to work fine:

df.alt.plot(kind='line')
df.plot('lat', 'lon', kind='scatter')
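The 5-minute counts from the question should also work once the index is a DatetimeIndex. Here is a minimal sketch on made-up timestamps (the values and the small frame are illustrative, not taken from the linked file). Note that set_index returns a new frame unless inplace=True is passed, so the result has to be assigned back; the question's df.set_index('Datetime') line discards it.

```python
import pandas as pd

# Made-up events; in the real data the 'date' column comes from hour/min/sec.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-09-14 00:01:05",
        "2021-09-14 00:03:40",
        "2021-09-14 00:07:59",
        "2021-09-14 00:12:00",
    ]),
    "alt": [5200.0, 6100.0, 7400.0, 6900.0],
})

# Assign the result back (or pass inplace=True); otherwise the index is unchanged.
df = df.set_index("date")

# Helper column of ones, as in the question, summed per 5-minute bin.
df["tot"] = 1
counts = df[["tot"]].resample("5min").sum()

# Same counts without the helper column: size() counts rows per bin.
counts_alt = df.resample("5min").size()

print(counts)
```

counts can then be plotted with counts.plot() in the same way as the other columns.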


 