
Pandas pivot table and sort by multiple values

I am trying to sort this table first by IN_FID, ascending from top to bottom, and then by Jurisdiction, ascending from left to right. I was able to pivot the table and sort by IN_FID, but how do I add a second sort from left to right?

import pandas as pd

df = pd.read_csv(r'C:\my\path\myfile.csv')

df['Key']=df.groupby('IN_FID').cumcount()+1
s=df.pivot_table(index='IN_FID',columns='Key',values=['Jurisdiction','CURR_VOL'],aggfunc='first')
s=s.sort_index(level=1,axis=1)
s.columns=s.columns.map('{0[0]}_{0[1]}'.format)                   

s.to_csv(r'C:\my\path\mynewfile.csv')

Where myfile.csv looks like this:

ROUTE_NAME  CURR_VOL    IN_FID  NEAR_RANK   Jurisdiction
test1       test1       1       test1       2
test1       test1       1       test1       3
test2       test2       2       test2       1
test3       test3       3       test3       2
test3       test3       3       test3       1

And mynewfile.csv should look like this:

IN_FID  CURR_VOL_1  Jurisdiction_1  CURR_VOL_2  Jurisdiction_2
1       test1       2               test1       3
2       test2       1       
3       test3       1               test3       2

Currently mynewfile.csv looks like this:

IN_FID  CURR_VOL_1  Jurisdiction_1  CURR_VOL_2  Jurisdiction_2
1       test1       2               test1       3
2       test2       1       
3       test3       2               test3       1

Any tips would be greatly appreciated.

You can sort by IN_FID and Jurisdiction first, then use groupby on IN_FID followed by unstack():

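# Sort rows so that, within each IN_FID, Jurisdiction is ascending
# before the rows are spread out into columns.
df_new = df.sort_values(['IN_FID','Jurisdiction']) \
         .groupby('IN_FID')[['CURR_VOL','Jurisdiction']] \
         .apply(lambda x: pd.DataFrame(x.values, columns=['CURR_VOL','Jurisdiction'])) \
         .unstack().sort_index(axis=1, level=1)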

df_new.columns = df_new.columns.droplevel(1)
df_new.reset_index(inplace=True)

The output looks like this:

IN_FID  CURR_VOL    Jurisdiction    CURR_VOL    Jurisdiction
1       test1       2               test1       3
2       test2       1               None        None
3       test3       1               test3       2
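The key point is that the sort has to happen before rows are numbered within each IN_FID group. In the question's code, cumcount numbers the rows in file order, so Key never reflects the Jurisdiction order. A minimal sketch of the same idea applied to the question's pivot_table code, assuming the sample rows from myfile.csv (the inline DataFrame stands in for the CSV and is only for illustration):

```python
import pandas as pd

# Sample rows copied from myfile.csv in the question (illustrative stand-in for read_csv).
df = pd.DataFrame({
    'CURR_VOL': ['test1', 'test1', 'test2', 'test3', 'test3'],
    'IN_FID': [1, 1, 2, 3, 3],
    'Jurisdiction': [2, 3, 1, 2, 1],
})

# Sort first, so cumcount numbers rows in Jurisdiction order within each IN_FID.
df = df.sort_values(['IN_FID', 'Jurisdiction'])
df['Key'] = df.groupby('IN_FID').cumcount() + 1

# The rest is the question's original pipeline, unchanged.
s = df.pivot_table(index='IN_FID', columns='Key',
                   values=['Jurisdiction', 'CURR_VOL'], aggfunc='first')
s = s.sort_index(level=1, axis=1)
s.columns = s.columns.map('{0[0]}_{0[1]}'.format)
print(s)
```

With this ordering, the row for IN_FID 3 should carry Jurisdiction 1 in the first pair of columns and Jurisdiction 2 in the second.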

Now you can use df_new, renaming the columns as you like.
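One possible way to do the renaming, sketched here under the assumption that you want the suffixed names from the question (CURR_VOL_1, Jurisdiction_1, and so on): skip the droplevel step and join the two column levels instead. The inline DataFrame and the name wide are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from myfile.csv in the question (illustrative stand-in for read_csv).
df = pd.DataFrame({
    'CURR_VOL': ['test1', 'test1', 'test2', 'test3', 'test3'],
    'IN_FID': [1, 1, 2, 3, 3],
    'Jurisdiction': [2, 3, 1, 2, 1],
})

# Same pipeline as the answer, stopping before the column level is dropped.
wide = (
    df.sort_values(['IN_FID', 'Jurisdiction'])
      .groupby('IN_FID')[['CURR_VOL', 'Jurisdiction']]
      .apply(lambda x: pd.DataFrame(x.values, columns=['CURR_VOL', 'Jurisdiction']))
      .unstack()
      .sort_index(axis=1, level=1)
)

# Columns are (name, position) pairs with 0-based positions;
# join them into "name_1", "name_2", ... so every column name is unique.
wide.columns = [f'{name}_{pos + 1}' for name, pos in wide.columns]
wide = wide.reset_index()
print(wide)
```

Keeping the position in the name avoids the duplicate CURR_VOL and Jurisdiction labels that droplevel leaves behind, which makes later column selection unambiguous.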
