pd.sort_values does nothing

Question

I have a CSV file that I've already imported using df = pd.read_csv("af.csv").

The CSV file looks like this (preview):

"match_id","start_time","win","leaguename","opposing_team","team","min"
2992096687,1486840800,True,"CaptainsDraft",3729377,2642171,1453382256
2992217489,1486845476,true,"Captains Draft",3729377,2642171,1453382256
2994454005,1486926905,false,"Captains Draft",2586976,2642171,1453382256
2659805546,1474478411,false,"BTSSeries",55,2642171,1454281287
2659879628,1474481141,false,"BTSSeries",55,2642171,1454281287
2661783205,1474563571,false,"BTSSeries",2537636,2642171,1454281287
2661875544,1474566865,false,"BTSSeries",2537636,2642171,1454281287
2662027296,1474573160,true,"BTSSeries",59,2642171,1454281287
2758086417,1478352060,true,"ESLManila16",2163,2642171,1454692269
2758241073,1478355547,true,"ESLManila16",2163,2642171,1454692269
2747710178,1477941012,false,"ESLFrankfurt16",2850016,2642171,1459782261
2747808587,1477945318,true,"ESLFrankfurt16",2850016,2642171,1459782261
2747861268,1477947994,true,"ESLFrankfurt16",2850016,2642171,1459782261

What I'm trying to do is keep the first match of each league, together with the number of wins across all matches in that league (True being a win and False being a loss), and then sort the result by start_time.

I have the code below to do this:

df1 = df.groupby(['leaguename', 'team']).sum().reset_index()
df1 = df1[['win','leaguename','team']]

df2 = df.sort_values("start_time").groupby("leaguename", as_index=False).first()
df2 = df2[['leaguename', 'start_time']]

output = pd.merge(df1, df2, 'inner', on = 'leaguename')

The output comes back with start_time jumbled and unordered:

,win,leaguename,team,start_time
0,5.0,ASUSROGSeason6,2642171,1478022101
1,6.0,CaptainsDraft,2642171,1486840800
2,3.0,Dota2Asia17,2642171,1486130597
3,2.0,DotaPitSeason5,2642171,1476903919
4,5.0,ESLFrankfurt16,2642171,1477941012
5,2.0,ESLManila16,2642171,1478352060
6,6.0,GlobalGrandMasters,2642171,1466176095
7,4.0,NanyangChampionshipsSeason2,2642171,1464178206

Desired output:

,win,leaguename,team,start_time
0,4.0,NanyangChampionshipsSeason2,2642171,1464178206
1,6.0,GlobalGrandMasters,2642171,1466176095
2,2.0,DotaPitSeason5,2642171,1476903919
3,5.0,ESLFrankfurt16,2642171,1477941012
4,5.0,ASUSROGSeason6,2642171,1478022101
5,2.0,ESLManila16,2642171,1478352060
6,3.0,Dota2Asia17,2642171,1486130597
7,6.0,CaptainsDraft,2642171,1486840800

How can I achieve the desired output?

Answer 1

I think you need DataFrame.sort_values on the column start_time, followed by DataFrame.reset_index with the parameter drop=True to get a default unique monotonic index:

output = output.sort_values('start_time').reset_index(drop=True)
# result using the output sample from the question
print (output)
   win                   leaguename     team  start_time
0  4.0  NanyangChampionshipsSeason2  2642171  1464178206
1  6.0           GlobalGrandMasters  2642171  1466176095
2  2.0               DotaPitSeason5  2642171  1476903919
3  5.0               ESLFrankfurt16  2642171  1477941012
4  5.0               ASUSROGSeason6  2642171  1478022101
5  2.0                  ESLManila16  2642171  1478352060
6  3.0                  Dota2Asia17  2642171  1486130597
7  6.0                CaptainsDraft  2642171  1486840800
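
To try this in isolation, here is a minimal sketch that rebuilds a small stand-in for output from the first four rows of the question's output sample (the name result is only for illustration). Note that sort_values returns a new DataFrame rather than sorting in place, so the result has to be assigned; the question does not say whether that was the issue, but it is a common reason for sort_values appearing to do nothing.

```python
import pandas as pd

# Stand-in for the merged `output` frame; values copied from the
# first four rows of the question's output sample.
output = pd.DataFrame({
    "win": [5.0, 6.0, 3.0, 2.0],
    "leaguename": ["ASUSROGSeason6", "CaptainsDraft", "Dota2Asia17", "DotaPitSeason5"],
    "team": [2642171] * 4,
    "start_time": [1478022101, 1486840800, 1486130597, 1476903919],
})

# Calling sort_values without assigning leaves `output` unchanged.
output.sort_values("start_time")
unchanged_first = output.loc[0, "leaguename"]

# Assign the sorted frame and rebuild a clean 0..n-1 index.
result = output.sort_values("start_time").reset_index(drop=True)
print(result)
```

Without drop=True, reset_index would keep the old index as an extra column named index.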

Another solution is to add sort=False to both groupby calls:

df1 = df.groupby(['leaguename', 'team'], sort=False).sum().reset_index()
df1 = df1[['win','leaguename','team']]

df2 = df.sort_values("start_time").groupby("leaguename", as_index=False, sort=False).first()
df2 = df2[['leaguename', 'start_time']]

output = pd.merge(df1, df2, on='leaguename')
# result using the input sample from the question
print (output)
   win      leaguename     team  start_time
0  2.0  Captains Draft  2642171  1486840800
1  1.0       BTSSeries  2642171  1474478411
2  2.0     ESLManila16  2642171  1478352060
3  2.0  ESLFrankfurt16  2642171  1477941012
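
The effect of sort=False can be seen on a small hand-made frame (the rows below are a simplified illustration loosely based on the input sample, not the full data). By default groupby sorts the group keys, which is why the merged output in the question is ordered alphabetically by leaguename; with sort=False the groups keep the order in which they first appear.

```python
import pandas as pd

# Simplified illustration data, loosely based on the input sample.
df = pd.DataFrame({
    "leaguename": ["Captains Draft", "Captains Draft", "BTSSeries", "ESLManila16"],
    "win": [True, False, True, True],
    "start_time": [1486840800, 1486845476, 1474478411, 1478352060],
})

# Default: group keys come back sorted alphabetically.
default_order = df.groupby("leaguename")["win"].sum().index.tolist()

# sort=False: group keys keep their order of first appearance.
appearance_order = df.groupby("leaguename", sort=False)["win"].sum().index.tolist()

print(default_order)
print(appearance_order)
```

Since sort=False follows the order of first appearance, the result is chronological only if the rows being grouped are already in start_time order. Sorting the merged frame with sort_values does not depend on that.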

Answer 1 (accepted), 2017-08-07 12:29:08
