Concat() alternate groups in Python 3.0

Question

My goal is to concat() alternating groups from two DataFrames.

Desired result:

group  ordercode  quantity
0      A          1
       B          1
       C          1
       D          1
0      A          1
       B          3
1      A          1
       B          2
       C          1
1      A          1
       B          1
       C          2

My DataFrames:

import pandas as pd
df1=pd.DataFrame([[0,"A",1],[0,"B",1],[0,"C",1],[0,"D",1],[1,"A",1],[1,"B",2],[1,"C",1]],columns=["group","ordercode","quantity"])
df2=pd.DataFrame([[0,"A",1],[0,"B",3],[1,"A",1],[1,"B",1],[1,"C",2]],columns=["group","ordercode","quantity"])
print(df1)
print(df2)

I have used dfff=pd.concat([df1,df2]).sort_index(kind="merge")

but I got the result below:

    group   ordercode   quantity
0   0   A   1
0   0   A   1
1       B   1
1       B   3
2       C   1
3       D   1
4   1   A   1
4   1   A   1
5       B   2
5       B   1
6       C   1
6       C   2

As you can see, the rows are interleaved one by one instead of group by group. The output should be ordered like this:

group 0 of df1, group 0 of df2, group 1 of df1, group 1 of df2, and so on.

Note: I created these DataFrames using the groupby() function:

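import numpy as np

# Repeat the first column's value as many times as the second column says
# (.values replaces as_matrix(), which was removed in pandas 1.0)
df = pd.DataFrame(np.concatenate(df.apply(lambda x: [x[0]] * x[1], 1).values),
                  columns=['ordercode'])
df['quantity'] = 1
df['group'] = sorted(list(range(0, len(df)//3, 1)) * 4)[0:len(df)]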


df=df.groupby(['group', 'ordercode']).sum()

Question:

Where did I go wrong? It is sorting by index.

I also tried .set_index("group"), but that didn't work either.

Answer 1

Use cumcount to build a helper column, then sort on it with sort_values:

df1['g'] = df1.groupby('ordercode').cumcount()
df2['g'] = df2.groupby('ordercode').cumcount()

dfff = pd.concat([df1,df2]).sort_values(['group','g']).reset_index(drop=True)
print (dfff)
    group ordercode  quantity  g
0       0         A         1  0
1       0         B         1  0
2       0         C         1  0
3       0         D         1  0
4       0         A         1  0
5       0         B         3  0
6       1         C         2  0
7       1         A         1  1
8       1         B         2  1
9       1         C         1  1
10      1         A         1  1
11      1         B         1  1

Finally, remove the helper column:

dfff = dfff.drop('g', axis=1)
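
Note that in the output above, the df2 row (1, C, 2) lands ahead of the other group 1 rows because its g value is 0, so the order is not quite the one requested. As for where the original attempt went wrong: sort_index orders rows by their original row labels, so rows with equal labels end up next to each other regardless of group. One possible alternative, offered only as a sketch and not as the answerer's method, is a stable sort on the group column alone, which keeps the concat order (df1 first, then df2) inside each group. The name out is just illustrative.

```python
import pandas as pd

cols = ["group", "ordercode", "quantity"]
df1 = pd.DataFrame(
    [[0, "A", 1], [0, "B", 1], [0, "C", 1], [0, "D", 1],
     [1, "A", 1], [1, "B", 2], [1, "C", 1]],
    columns=cols,
)
df2 = pd.DataFrame(
    [[0, "A", 1], [0, "B", 3], [1, "A", 1], [1, "B", 1], [1, "C", 2]],
    columns=cols,
)

# Stack df1 on top of df2, then sort by "group" only.
# kind="mergesort" is a stable sort, so rows with the same group keep
# their concat order: df1's rows first, then df2's rows.
out = (
    pd.concat([df1, df2], ignore_index=True)
    .sort_values("group", kind="mergesort")
    .reset_index(drop=True)
)
print(out)
```

This relies on both frames already being ordered by group internally, as they are in the question; no helper column is needed.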

Solution 1: 3 accepted, 2018-10-10 08:48:58
