The following code segment reads sample_data.csv into a DataFrame and sorts it by observation_time. The sort() routine doesn't properly sort the date/time.
df = pd.read_csv('C:/data/sample_data.csv')
df = df.sort(['observation_time'])
df.to_csv('C:/data/outfile.csv')
[using Pandas 0.12 from Anaconda 32-bit windows]
Input sample_data.csv:
latitude longitude vessel_name observation_time
1.031000018 -79.68883514 aqasi 12/28/2012 10:40
1.032833338 -79.71916199 aqasi 12/29/2012 14:06
1.486500025 -80.1906662 aqasi 12/31/2012 4:41
1.466999888 -80.16249847 aqasi 12/31/2012 4:30
2.342833519 -81.46682739 aqasi 12/31/2012 13:40
2.360000134 -81.4936676 aqasi 12/31/2012 13:51
3.816000223 -83.68183899 aqasi 1/1/2013 5:20
3.730499983 -83.55400085 aqasi 1/1/2013 4:24
3.714666843 -83.53016663 aqasi 1/1/2013 4:14
4.986999989 -85.45566559 aqasi 1/1/2013 19:04
6.884333134 -88.21949768 aqasi 1/2/2013 13:11
6.885833263 -88.22200012 aqasi 1/2/2013 13:12
6.886833191 -88.22383881 aqasi 1/2/2013 13:12
6.887333393 -88.22450256 aqasi 1/2/2013 13:13
6.889333248 -88.22800446 aqasi 1/2/2013 13:14
Output outfile.csv:
latitude longitude vessel_name observation_time
9 4.986999989 -85.45566559 aqasi 1/1/2013 19:04
8 3.714666843 -83.53016663 aqasi 1/1/2013 4:14
7 3.730499983 -83.55400085 aqasi 1/1/2013 4:24
6 3.816000223 -83.68183899 aqasi 1/1/2013 5:20
10 6.884333134 -88.21949768 aqasi 1/2/2013 13:11
12 6.886833191 -88.22383881 aqasi 1/2/2013 13:12
11 6.885833263 -88.22200012 aqasi 1/2/2013 13:12
13 6.887333393 -88.22450256 aqasi 1/2/2013 13:13
14 6.889333248 -88.22800446 aqasi 1/2/2013 13:14
0 1.031000018 -79.68883514 aqasi 12/28/2012 10:40
1 1.032833338 -79.71916199 aqasi 12/29/2012 14:06
4 2.342833519 -81.46682739 aqasi 12/31/2012 13:40
5 2.360000134 -81.4936676 aqasi 12/31/2012 13:51
3 1.466999888 -80.16249847 aqasi 12/31/2012 4:30
2 1.486500025 -80.1906662 aqasi 12/31/2012 4:41
The reason for the 'wrong' sorting is that the observation_time column is a string type column, so it is in fact sorted correctly as strings: the values are compared character by character, which puts "1/1/2013" before "12/28/2012". Converting the column to datetime type before sorting will produce the desired result:
df['observation_time'] = pd.to_datetime(df['observation_time'])
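To make the difference concrete, here is a minimal sketch that sorts a few rows of the sample data both ways. It makes some assumptions that are not in the original post: the rows are inlined as comma-separated text instead of being read from C:/data/sample_data.csv, the timestamps are assumed to be month/day/year (hence the explicit `format` argument), and it uses `sort_values`, which replaces `DataFrame.sort` in current pandas releases. The names `csv_text`, `string_sorted` and `time_sorted` are only illustrative.

```python
import io

import pandas as pd

# A few rows from the sample data, inlined so the sketch is self-contained.
# Assumption: the real file is comma-separated.
csv_text = """latitude,longitude,vessel_name,observation_time
1.031000018,-79.68883514,aqasi,12/28/2012 10:40
1.486500025,-80.1906662,aqasi,12/31/2012 4:41
1.466999888,-80.16249847,aqasi,12/31/2012 4:30
3.816000223,-83.68183899,aqasi,1/1/2013 5:20
4.986999989,-85.45566559,aqasi,1/1/2013 19:04
"""

df = pd.read_csv(io.StringIO(csv_text))

# The column is still plain strings here, so the sort is lexicographic:
# "1/1/2013 19:04" comes first and the December 2012 rows come last.
string_sorted = df.sort_values('observation_time')

# Parse the strings into real timestamps (assumed month/day/year order),
# then sort again to get chronological order.
df['observation_time'] = pd.to_datetime(
    df['observation_time'], format='%m/%d/%Y %H:%M'
)
time_sorted = df.sort_values('observation_time')

print(string_sorted['observation_time'].tolist())
print(time_sorted['observation_time'].tolist())
```

If the column should be parsed while the file is being read, passing `parse_dates=['observation_time']` to `pd.read_csv` is an alternative to the separate `pd.to_datetime` call.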
Disclaimer: In the spirit of this meta post, I've explicitly added the solution from this and this comment above as an answer, since people tend not to read comments and answers are more permanent (and have the benefit that they can be upvoted/downvoted to indicate usefulness).