
Pandas Groupby - No objects to concatenate

I have a DataFrame with the following values:

    df = 
route    stop_code      stop_name
900      92072          Eastbound @ 257 Kingston Road East   
900      1590           Kingston Westbound @ Wicks   
900      2218           Kingston Eastbound @ Wicks   
900      93152          Salem Northbound @ Kingston   
92       728            Kingston Rd. @ Salem Rd.   
224      92071          Salem Southbound @ Twilley   
215      92071          Salem Southbound @ Twilley   
215      92054          Northbound @ 133 Salem   
224      92054          Northbound @ 133 Salem   
215      93152          Salem Northbound @ Kingston   
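
For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: it assumes route and stop_code are stored as integers, which the question does not state.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: route and stop_code are integer columns.
df = pd.DataFrame(
    {
        "route": [900, 900, 900, 900, 92, 224, 215, 215, 224, 215],
        "stop_code": [92072, 1590, 2218, 93152, 728,
                      92071, 92071, 92054, 92054, 93152],
        "stop_name": [
            "Eastbound @ 257 Kingston Road East",
            "Kingston Westbound @ Wicks",
            "Kingston Eastbound @ Wicks",
            "Salem Northbound @ Kingston",
            "Kingston Rd. @ Salem Rd.",
            "Salem Southbound @ Twilley",
            "Salem Southbound @ Twilley",
            "Northbound @ 133 Salem",
            "Northbound @ 133 Salem",
            "Salem Northbound @ Kingston",
        ],
    }
)

print(df)
```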

What I want is to group routes by stop_code or stop_name, something like:

df2 = 
    route         stop_code      stop_name
    900           92072          Eastbound @ 257 Kingston Road East   
    900           1590           Kingston Westbound @ Wicks   
    900           2218           Kingston Eastbound @ Wicks   
    92            728            Kingston Rd. @ Salem Rd.   
    224, 215      92071          Salem Southbound @ Twilley   
    215, 224      92054          Northbound @ 133 Salem   
    215, 900      93152          Salem Northbound @ Kingston 

I tried the following:

df2 = df.groupby(['stop_code']).agg(set).reset_index()

While it worked fine in my test environment, when I deployed it in Django (PythonAnywhere), I got the following error (maybe due to different versions of Pandas / Python / Django):

ValueError: No objects to concatenate

Can anyone please guide me in sorting it out? TIA
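
One way to check the version guess is to print the interpreter and pandas versions in both environments and compare them. The helper name environment_report below is made up for illustration. This only shows whether the versions differ; it does not confirm that a version difference causes the error.

```python
import sys

import pandas as pd


def environment_report() -> dict:
    """Return the Python and pandas versions of the current environment."""
    return {
        "python": sys.version.split()[0],  # e.g. "3.x.y"
        "pandas": pd.__version__,
    }


# Run this locally and on the deployment host, then compare the two results.
print(environment_report())
```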

You can do it this way:

df.groupby(['stop_code','stop_name'],sort=False)['route'].agg(list).reset_index()

Output

  stop_code stop_name                           route
0   92072   Eastbound @ 257 Kingston Road East  [900]
1   1590    Kingston Westbound @ Wicks          [900]
2   2218    Kingston Eastbound @ Wicks          [900]
3   93152   Salem Northbound @ Kingston         [900, 215]
4   728     Kingston Rd. @ Salem Rd.            [92]
5   92071   Salem Southbound @ Twilley          [224, 215]
6   92054   Northbound @ 133 Salem              [215, 224]
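
A self-contained version of the one-liner above, for anyone who wants to run it directly. The DataFrame construction is an assumption (integer route and stop_code) based on the sample data in the question. Selecting the 'route' column before aggregating means the aggregation function only ever sees that one column.

```python
import pandas as pd

# Sample data from the question (assumed integer route / stop_code).
df = pd.DataFrame(
    {
        "route": [900, 900, 900, 900, 92, 224, 215, 215, 224, 215],
        "stop_code": [92072, 1590, 2218, 93152, 728,
                      92071, 92071, 92054, 92054, 93152],
        "stop_name": [
            "Eastbound @ 257 Kingston Road East",
            "Kingston Westbound @ Wicks",
            "Kingston Eastbound @ Wicks",
            "Salem Northbound @ Kingston",
            "Kingston Rd. @ Salem Rd.",
            "Salem Southbound @ Twilley",
            "Salem Southbound @ Twilley",
            "Northbound @ 133 Salem",
            "Northbound @ 133 Salem",
            "Salem Northbound @ Kingston",
        ],
    }
)

# Group on both key columns, keep first-appearance order (sort=False),
# and collect the routes of each stop into a Python list.
result = (
    df.groupby(["stop_code", "stop_name"], sort=False)["route"]
    .agg(list)
    .reset_index()
)

print(result)
```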

If you want it exactly as in your expected output, do as below. However, this will probably be slower.

df.groupby(['stop_code','stop_name'],sort=False)['route'].apply(lambda x: ','.join([str(a) for a in x])).reset_index()

Output

 stop_code  stop_name                            route
0   92072   Eastbound @ 257 Kingston Road East   900
1   1590    Kingston Westbound @ Wicks           900
2   2218    Kingston Eastbound @ Wicks           900
3   93152   Salem Northbound @ Kingston          900,215
4   728     Kingston Rd. @ Salem Rd.             92
5   92071   Salem Southbound @ Twilley           224,215
6   92054   Northbound @ 133 Salem               215,224
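
As a possible alternative to the lambda-based apply, the routes can be converted to strings first and then joined per group. This is a sketch, not the method from the answer above: the small three-stop frame and the ', ' separator are choices made here for illustration.

```python
import pandas as pd

# A reduced sample (assumption: three stops are enough to show the idea).
df = pd.DataFrame(
    {
        "route": [900, 224, 215, 215, 900],
        "stop_code": [92072, 92071, 92071, 93152, 93152],
        "stop_name": [
            "Eastbound @ 257 Kingston Road East",
            "Salem Southbound @ Twilley",
            "Salem Southbound @ Twilley",
            "Salem Northbound @ Kingston",
            "Salem Northbound @ Kingston",
        ],
    }
)

# Cast route to str once, then let str.join concatenate each group's values.
joined = (
    df.assign(route=df["route"].astype(str))
    .groupby(["stop_code", "stop_name"], sort=False)["route"]
    .agg(", ".join)
    .reset_index()
)

print(joined)
```

The routes within each stop keep the order in which they appear in the input rows.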


 