Sorting a string DataFrame column with Python pandas
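I have a column in my output that I want sorted in ascending order, starting from General-0. I tried the following, but it does not work. How can I do this? The dtype is shown as object.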
dt.sort_values('run')
Output:
run
717 General-25-20180121-15:27:27-3807
824 General-26-20180121-15:27:28-3812
931 General-27-20180121-15:27:29-3818
1038 General-28-20180121-15:27:30-3823
1145 General-29-20180121-15:27:30-3828
1252 General-30-20180121-15:27:31-3833
1359 General-31-20180121-15:27:31-3838
1466 General-32-20180121-15:27:32-3843
1573 General-33-20180121-15:27:33-3848
1680 General-34-20180121-15:27:33-3855
1787 General-0-20180121-15:27:08-3680
1894 General-1-20180121-15:27:09-3685
2001 General-2-20180121-15:27:10-3690
2108 General-3-20180121-15:27:11-3695
2215 General-4-20180121-15:27:11-3700
2322 General-5-20180121-15:27:12-3706
The simplest approach, if the index values are not important, is to use sorted with a custom key function:
df['run'] = sorted(df['run'], key=lambda x: int(x.split('-')[1]))
print (df)
run
717 General-0-20180121-15:27:08-3680
824 General-1-20180121-15:27:09-3685
931 General-2-20180121-15:27:10-3690
1038 General-3-20180121-15:27:11-3695
1145 General-4-20180121-15:27:11-3700
1252 General-5-20180121-15:27:12-3706
1359 General-25-20180121-15:27:27-3807
1466 General-26-20180121-15:27:28-3812
1573 General-27-20180121-15:27:29-3818
1680 General-28-20180121-15:27:30-3823
1787 General-29-20180121-15:27:30-3828
1894 General-30-20180121-15:27:31-3833
2001 General-31-20180121-15:27:31-3838
2108 General-32-20180121-15:27:32-3843
2215 General-33-20180121-15:27:33-3848
2322 General-34-20180121-15:27:33-3855
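As a self-contained illustration of the sorted approach, here is a minimal sketch on a three-row sample. The frame below is an assumption built from a few of the question's values, not the asker's full data. It shows that only the values move while the index labels stay where they were, which is why this variant suits the case where the index does not matter.

```python
import pandas as pd

# Small sample frame (assumed for illustration; a subset of the question's rows).
df = pd.DataFrame(
    {"run": ["General-25-20180121-15:27:27-3807",
             "General-0-20180121-15:27:08-3680",
             "General-3-20180121-15:27:11-3695"]},
    index=[717, 1787, 2108],
)

# Sort the strings by the integer after the first hyphen.
# Assigning a plain list back leaves the index labels in their original positions.
df["run"] = sorted(df["run"], key=lambda x: int(x.split("-")[1]))

print(df)
```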
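If the index values are important, first split each string, select the second value with str[1], convert it to an integer, and then reorder the rows by passing the argsort result to iloc: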
df = df.iloc[df['run'].str.split('-').str[1].astype(int).argsort()]
print (df)
run
1787 General-0-20180121-15:27:08-3680
1894 General-1-20180121-15:27:09-3685
2001 General-2-20180121-15:27:10-3690
2108 General-3-20180121-15:27:11-3695
2215 General-4-20180121-15:27:11-3700
2322 General-5-20180121-15:27:12-3706
717 General-25-20180121-15:27:27-3807
824 General-26-20180121-15:27:28-3812
931 General-27-20180121-15:27:29-3818
1038 General-28-20180121-15:27:30-3823
1145 General-29-20180121-15:27:30-3828
1252 General-30-20180121-15:27:31-3833
1359 General-31-20180121-15:27:31-3838
1466 General-32-20180121-15:27:32-3843
1573 General-33-20180121-15:27:33-3848
1680 General-34-20180121-15:27:33-3855
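One detail worth knowing about argsort: its default algorithm is not guaranteed to keep rows with equal keys in their original order. The question's data has no duplicate run numbers, so this does not matter there, but the sketch below shows how kind="stable" could be passed if ties were possible. The sample strings are hypothetical and only serve to create a tie.

```python
import pandas as pd

# Hypothetical sample with two rows sharing the run number 2.
df = pd.DataFrame(
    {"run": ["General-2-b", "General-10-a", "General-2-a"]},
    index=[10, 20, 30],
)

# Integer key taken from the second hyphen-separated field.
key = df["run"].str.split("-").str[1].astype(int)

# kind="stable" keeps rows with equal keys in their original relative order.
order = key.argsort(kind="stable").to_numpy()
res = df.iloc[order]

print(res)
```

The original index labels travel with their rows, as in the answer above.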
You can use split to build a helper key for sorting, and then drop it once you are done:
df.assign(helpkey=df.run.str.split('-',expand=True)[1].astype(int)).sort_values('helpkey').drop(columns='helpkey')
Out[750]:
run
1787 General-0-20180121-15:27:08-3680
1894 General-1-20180121-15:27:09-3685
2001 General-2-20180121-15:27:10-3690
2108 General-3-20180121-15:27:11-3695
2215 General-4-20180121-15:27:11-3700
2322 General-5-20180121-15:27:12-3706
717 General-25-20180121-15:27:27-3807
824 General-26-20180121-15:27:28-3812
931 General-27-20180121-15:27:29-3818
1038 General-28-20180121-15:27:30-3823
1145 General-29-20180121-15:27:30-3828
1252 General-30-20180121-15:27:31-3833
1359 General-31-20180121-15:27:31-3838
1466 General-32-20180121-15:27:32-3843
1573 General-33-20180121-15:27:33-3848
1680 General-34-20180121-15:27:33-3855
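As an alternative not mentioned in the answers here, pandas 1.1 and later accept a key argument in sort_values, which avoids creating and dropping a helper column. The following is a minimal sketch under that version assumption, again using a small sample frame assumed for illustration. The key function receives the whole column as a Series and must return a Series of the same length.

```python
import pandas as pd

# Small sample frame (assumed for illustration; a subset of the question's rows).
df = pd.DataFrame(
    {"run": ["General-25-20180121-15:27:27-3807",
             "General-0-20180121-15:27:08-3680",
             "General-3-20180121-15:27:11-3695"]},
    index=[717, 1787, 2108],
)

# Sort by the integer after the first hyphen without adding a helper column.
# Requires pandas >= 1.1, where sort_values gained the key parameter.
res = df.sort_values("run", key=lambda s: s.str.split("-").str[1].astype(int))

print(res)
```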
You can use numpy.argsort together with pd.DataFrame.iloc.
This method preserves the original DataFrame's index.
import numpy as np
res = df.iloc[np.argsort([int(i.split('-')[1]) for i in df['run']])]
print(res)
# run
# 1787 General-0-20180121-15:27:08-3680
# 1894 General-1-20180121-15:27:09-3685
# 2001 General-2-20180121-15:27:10-3690
# 2108 General-3-20180121-15:27:11-3695
# 2215 General-4-20180121-15:27:11-3700
# 2322 General-5-20180121-15:27:12-3706
# 717 General-25-20180121-15:27:27-3807
# 824 General-26-20180121-15:27:28-3812
# 931 General-27-20180121-15:27:29-3818
# 1038 General-28-20180121-15:27:30-3823
# 1145 General-29-20180121-15:27:30-3828
# 1252 General-30-20180121-15:27:31-3833
# 1359 General-31-20180121-15:27:31-3838
# 1466 General-32-20180121-15:27:32-3843
# 1573 General-33-20180121-15:27:33-3848
# 1680 General-34-20180121-15:27:33-3855