Pandas: Sort dataframe by date string without converting
Take this simple dataframe:
import pandas as pd

df = pd.DataFrame({
    'date': ['1/15/2017', '2/15/2017', '10/15/2016', '3/15/2017'],
    'int': [2, 3, 1, 4]
})
I'd like to sort it by the date, and then save it to a CSV without having to:
1. Convert the dates using pd.to_datetime(df['date'])
2. Sort the dataframe using .sort_values('date')
3. Convert the dates back using .strftime('%-m/%-d/%Y')
And instead do something like this (which, of course, doesn't work):
df.apply(pd.to_dataframe(df['date']).sort_values(by = 'date', inplace = True)
Output:
         date  int
2  10/15/2016    1
0   1/15/2017    2
1   2/15/2017    3
3   3/15/2017    4
Is this possible, or should I just stick with the 3-step process?
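For comparison, the 3-step process mentioned in the question might look roughly like the sketch below. It assumes the strings are always month/day/year, and the name three_step is only illustrative. Because the '%-m' and '%-d' strftime codes are platform-specific (they are not available on Windows), this sketch rebuilds the unpadded strings by hand instead.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['1/15/2017', '2/15/2017', '10/15/2016', '3/15/2017'],
    'int': [2, 3, 1, 4],
})

# Step 1: parse the strings into real datetimes (month/day/year is assumed).
parsed = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Step 2: sort on the parsed values; the original df is left untouched.
three_step = df.assign(date=parsed).sort_values('date')

# Step 3: rebuild the unpadded month/day/year strings by hand, which avoids
# depending on the platform-specific '%-m' and '%-d' strftime codes.
d = three_step['date'].dt
three_step['date'] = (
    d.month.astype(str) + '/' + d.day.astype(str) + '/' + d.year.astype(str)
)

print(three_step)
```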
numpy's argsort returns the permutation necessary for sorting an array. We can take advantage of that using iloc. So by converting the dates using pd.to_datetime, then subsequently grabbing the values and calling argsort, we've done all that we need to sort the original dataframe without changing any of its columns.
df.iloc[pd.to_datetime(df.date).values.argsort()]
         date  int
2  10/15/2016    1
0   1/15/2017    2
1   2/15/2017    3
3   3/15/2017    4
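To cover the "save it to a CSV" part of the question, here is a minimal sketch that combines the argsort approach with to_csv. It assumes the strings are month/day/year. The names order, sorted_df, sorted_by_key and buffer are illustrative, and the in-memory buffer stands in for a real file path. It also shows an alternative that should work on pandas 1.1 or later, where sort_values accepts a key function.

```python
import io

import pandas as pd

df = pd.DataFrame({
    'date': ['1/15/2017', '2/15/2017', '10/15/2016', '3/15/2017'],
    'int': [2, 3, 1, 4],
})

# Positions that would sort the parsed dates; kind='stable' keeps rows with
# equal dates in their original relative order.
order = pd.to_datetime(df['date'], format='%m/%d/%Y').values.argsort(kind='stable')
sorted_df = df.iloc[order]

# Alternative (assumes pandas >= 1.1): parse the column only for comparison.
sorted_by_key = df.sort_values(
    'date', key=lambda s: pd.to_datetime(s, format='%m/%d/%Y')
)

# The date column still holds the original strings, so it can be written as-is.
buffer = io.StringIO()  # stand-in for a file path such as 'sorted.csv'
sorted_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Either way the date column is never overwritten, so there is no need to format it back before writing.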