Divide a group into n and add a block number to each group in Python

Question

I have the following table:

ColumnA	ColumnB
A	12
B	32
C	44
D	76
E	99
F	123
G	65
H	87
I	76
J	231
k	80
l	55
m	27
n	67

I would like to divide this table into n groups (n = 4 here) and add another column with the group number. The output should look like the following:

ColumnA	ColumnB	ColumnC
A	12	1
B	32	1
C	44	1
D	76	1
E	99	2
F	123	2
G	65	2
H	87	2
I	76	3
J	231	3
k	80	3
l	55	4
m	27	4
n	67	4

What I have tried so far:

TGn = 4
idx = set(df.index // TGn)

treatment_groups = [i for i in range(1, n+1)]
df['columnC'] = (df.index // TGn).map(dict(zip(idx, treatment_groups)))

This does not split the groups properly, and I am not sure where I went wrong. How do I correct it?

Answer 1

Assuming that your sample size is exactly divisible by n (i.e. sample_size % n is 0):

import numpy as np
groups = range(1, n + 1)  # group labels 1..n

df['columnC'] = np.repeat(groups, len(df) // n)
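As a runnable illustration of the divisible case, here is a minimal sketch. The 12-row frame and its values are made up for the example (they are not the data from the question), chosen so that the row count divides evenly by n = 4.

```python
import numpy as np
import pandas as pd

# Hypothetical 12-row frame; 12 is divisible by n = 4
df = pd.DataFrame({"ColumnA": list("ABCDEFGHIJKL"), "ColumnB": range(12)})
n = 4

groups = np.arange(1, n + 1)  # group labels 1..n
# Repeat each label len(df) // n times, keeping the labels in order
df["columnC"] = np.repeat(groups, len(df) // n)
```

Each group ends up with len(df) // n consecutive rows, so the assignment only has the right length when len(df) % n is 0.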

If your sample size is not exactly divisible by n (i.e. sample_size % n is not 0):

# Assigning the remaining rows to random groups
# (randint's upper bound is exclusive, so use n + 1 to include group n)
df['columnC'] = np.concatenate(
                [np.repeat(groups, len(df) // n),
                 np.random.randint(1, high=n + 1, size=len(df) % n, dtype=int)])

# Assigning the remaining rows to group 'm' (m is a group number you choose)
df['columnC'] = np.concatenate(
                [np.repeat(groups, len(df) // n),
                 np.repeat([m], len(df) % n)])
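Neither variant above reproduces the exact output shown in the question, where the 14 rows are split into consecutive groups of 4, 4, 3 and 3. One possible way to get that layout, offered as a sketch rather than as part of the original answer, is to let np.array_split work out the uneven chunk sizes and then repeat each label by its chunk size. The frame below is rebuilt from the question's table; the column name columnC follows the code above.

```python
import numpy as np
import pandas as pd

# Data from the question (14 rows)
df = pd.DataFrame({
    "ColumnA": list("ABCDEFGHIJklmn"),
    "ColumnB": [12, 32, 44, 76, 99, 123, 65, 87, 76, 231, 80, 55, 27, 67],
})
n = 4

# np.array_split tolerates uneven splits: the first len(df) % n chunks
# receive one extra row each
chunks = np.array_split(np.arange(len(df)), n)
sizes = [len(chunk) for chunk in chunks]

# Repeat label i as many times as chunk i has rows
df["columnC"] = np.repeat(np.arange(1, n + 1), sizes)
```

By contrast, the attempt in the question floor-divides the index by 4, which fixes the group size at 4 rather than the number of groups at 4.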

Answer 1: accepted 2021-03-22 19:35:19
