
Adding slices of a dataframe to another dataframe in a new column

I have 2 dataframes. One is empty and the other contains many rows. I want to group the populated dataframe by the values in one column, take the first 3 rows of each group, and add them to the empty dataframe, with each group's 3 rows going into a new column.

I have tried concat, join, and append, but I cannot figure out how to do it.

My code so far:

import pandas as pd

df = pd.DataFrame()
df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                   'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

df_dictionary = df2.groupby("C")

for key, df_values in df_dictionary:
    df_values = df_values.head(3)
    df = pd.concat(df, df_values["D"], axis=1)
    print(df)

The desired result in the empty dataframe would look like this:

index  col 1  col 2  col 3
0          1      5      8
1          2      6      9
2          3      7     10

I want to add the first 3 values of the D column for every group to the empty dataframe, putting each group's values in a new column.

Does anyone have a suggestion?

I am using cumcount before pivot

n=3 
df2.assign(key=df2.groupby('C').cumcount()).pivot(index='key',columns='C',values='D').iloc[:n,:]
Out[730]: 
C     10   20    30
key                
0    5.0  1.0   8.0
1    6.0  2.0   9.0
2    7.0  3.0  10.0
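The one-liner above can be unpacked into separate steps. The following is a minimal sketch, not the answerer's own code: it assumes the same `df2` as in the question, and the intermediate names `key`, `first_n`, and `wide` are introduced here only for illustration. It filters on the within-group counter before pivoting, so the longer group is trimmed first and no NaN padding should be introduced.

```python
import pandas as pd

df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
n = 3

# Step 1: number the rows within each group (0, 1, 2, ...).
key = df2.groupby('C').cumcount()

# Step 2: keep only the first n rows of each group before pivoting.
first_n = df2.assign(key=key)[key < n]

# Step 3: one column per group, one row per within-group position.
wide = first_n.pivot(index='key', columns='C', values='D')
print(wide)
```

Because the extra row of group 20 is dropped before the pivot, every cell of `wide` is filled. The columns come out in sorted key order (10, 20, 30), not in order of first appearance.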

This answer has one requirement: each group must have at least n values.

Using head + reshape


n = 3
u = df2.groupby('C').head(n)['D'].values

# n rows, one column per group; order='F' fills column by column
wide = u.reshape(n, -1, order='F')
pd.DataFrame(wide, columns=[f'col {i+1}' for i in range(wide.shape[1])])

   col 1  col 2  col 3
0      1      5      8
1      2      6      9
2      3      7     10
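The reshape only lines up when every group contributes exactly n values after `head(n)`. It also relies on each group's rows being contiguous in `df2`, because `head` keeps the original row order. A hedged sketch of a guard around that idea follows; the helper name `first_n_as_columns` and its parameters are hypothetical, not part of pandas or of the original answer.

```python
import pandas as pd


def first_n_as_columns(df, group_col, value_col, n):
    """Hypothetical helper: first n values of each group, one column per group.

    Assumes the rows of each group are contiguous in df.
    """
    sizes = df.groupby(group_col).size()
    short = sizes[sizes < n]
    if not short.empty:
        # Reshaping would silently mix groups, so fail loudly instead.
        raise ValueError(f"groups with fewer than {n} rows: {list(short.index)}")

    u = df.groupby(group_col).head(n)[value_col].to_numpy()
    # n rows, one column per group; order='F' fills column by column.
    wide = u.reshape(n, -1, order='F')
    return pd.DataFrame(wide, columns=[f'col {i + 1}' for i in range(wide.shape[1])])


df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
result = first_n_as_columns(df2, 'C', 'D', 3)
print(result)
```

Raising an error is only one possible design choice. If short groups should be padded with NaN instead, a pivot-based approach is likely the better fit.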

My solution uses the dictionary returned by groupby.groups to construct a new dataframe

gb = df2.set_index('D').groupby('C')
pd.DataFrame.from_dict(gb.groups, orient='index').iloc[:,:3].T

Out[2033]:
   10  20  30
0   5   1   8
1   6   2   9
2   7   3  10

Or using head after T

pd.DataFrame.from_dict(gb.groups, orient='index').T.head(3)

Out[2034]:
    10   20    30
0  5.0  1.0   8.0
1  6.0  2.0   9.0
2  7.0  3.0  10.0
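For comparison, the loop from the question can also be made to work. This is a minimal sketch under two assumptions: the columns should follow the order in which the groups first appear (hence `sort=False`), and the names `pieces` and `piece` are illustrative only. The two changes are that `pd.concat` receives a list of objects, and that each slice gets a fresh 0..n-1 index so the pieces align row by row.

```python
import pandas as pd

df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
n = 3

pieces = []
for key, group in df2.groupby('C', sort=False):
    # Drop the original row labels so every piece is indexed 0..n-1;
    # otherwise concat aligns on the old labels and pads with NaN.
    piece = group['D'].head(n).reset_index(drop=True)
    pieces.append(piece.rename(f'col {len(pieces) + 1}'))

# pd.concat takes a list of objects as its first argument.
df = pd.concat(pieces, axis=1)
print(df)
```

Collecting the pieces in a list and concatenating once also avoids rebuilding the dataframe on every iteration.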
