Numpy and Pandas Repeat Values by Bin
I have a DataFrame or NumPy array with ascending group numbers, and I would like to assign a list of values (whose length equals the number of unique groups), with each value repeated for every row of its group.
ID - Group
0 - 0
1 - 0
2 - 1
3 - 1
4 - 1
5 - 2
6 - 2
7 - 3
Values to assign:
[4, 2, 7, 8] # 4 maps to group 0, 2 maps to group 1 etc
Output:
ID - Group - Val
0 - 0 - 4
1 - 0 - 4
2 - 1 - 2
3 - 1 - 2
4 - 1 - 2
5 - 2 - 7
6 - 2 - 7
7 - 3 - 8
I'd appreciate any suggestions, preferably without looping if there is an elegant way or native function for this (I'm looking for both a NumPy and a Pandas solution).
Setup:
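import numpy as np
import pandas as pd
df = pd.DataFrame({'ID': range(8), 'Group': [0, 0, 1, 1, 1, 2, 2, 3]})  # frame rebuilt from the question's table
a = np.array([4, 2, 7, 8])  # one value per group
v = df.Group.values         # group labels as a NumPy array
dct = {}                    # collects the result of each option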
Option 1
Using numpy indexing. (This solution assumes your groups range from 0-N, so that each group number is a valid position in a):
dct['numpy_indexing'] = a[v]
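To make the indexing step concrete, here is a minimal self-contained sketch. The frame is rebuilt from the question's table, and the names `values` and `groups` are illustrative rather than taken from the answer.

```python
import numpy as np
import pandas as pd

# Example frame rebuilt from the question (column names ID and Group are assumed).
df = pd.DataFrame({"ID": range(8), "Group": [0, 0, 1, 1, 1, 2, 2, 3]})

values = np.array([4, 2, 7, 8])   # values[k] belongs to group k
groups = df["Group"].to_numpy()   # integer labels, used here as positions

# Integer-array indexing: each label picks the element at that position.
df["Val"] = values[groups]
print(df["Val"].tolist())         # [4, 4, 2, 2, 2, 7, 7, 8]
```

Because the labels are used directly as positions, a label outside 0..len(values)-1 would raise an IndexError, which is why the 0-N assumption matters.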
Option 2
Using np.repeat. (This solution assumes your groups are not interlaced, i.e. all rows of a group are adjacent and the groups appear in ascending order):
dct['numpy_repeat'] = np.repeat(a, np.bincount(v))
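The following sketch shows what np.bincount contributes and why the "not interlaced" assumption matters. The array `shuffled` is a made-up counter-example, not data from the question.

```python
import numpy as np

values = np.array([4, 2, 7, 8])
groups = np.array([0, 0, 1, 1, 1, 2, 2, 3])   # sorted, contiguous groups

counts = np.bincount(groups)          # rows per group: [2, 3, 2, 1]
repeated = np.repeat(values, counts)  # [4, 4, 2, 2, 2, 7, 7, 8]

# Hypothetical counter-example: same group sizes, but rows are interlaced.
shuffled = np.array([1, 0, 1, 2, 0, 3, 1, 2])
wrong = np.repeat(values, np.bincount(shuffled))  # still laid out in sorted-group order
right = values[shuffled]                          # follows the actual row order

print(wrong.tolist())   # [4, 4, 2, 2, 2, 7, 7, 8]
print(right.tolist())   # [2, 4, 2, 7, 4, 8, 2, 7]
```

np.repeat only knows how many rows each group has, not where they are, so it is a good fit only when the rows are already ordered by group.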
Option 3
Using map. This approach will be slower than the others, but is a bit more flexible, as it allows for interlaced groups and non-linear groups (labels that are not simply 0-N):
d = dict(zip(np.unique(v), a))
dct['pandas_map'] = df.Group.map(d)
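As a rough illustration of that flexibility, the sketch below uses made-up labels (10, 20, 30) that are neither consecutive nor sorted by row. The names `df2` and `lookup` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical labels: not 0..N and not sorted by row.
df2 = pd.DataFrame({"Group": [10, 30, 10, 20, 30]})
values = np.array([4, 2, 7])

# np.unique returns the sorted distinct labels [10, 20, 30],
# so the values are paired with labels in ascending label order.
lookup = dict(zip(np.unique(df2["Group"].to_numpy()), values))
df2["Val"] = df2["Group"].map(lookup)
print(df2["Val"].tolist())   # [4, 7, 4, 2, 7]
```

Note that the pairing follows the sorted label order, not the order in which labels first appear in the column.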
Output
df.assign(**dct)
ID Group numpy_indexing numpy_repeat pandas_map
0 0 0 4 4 4
1 1 0 4 4 4
2 2 1 2 2 2
3 3 1 2 2 2
4 4 1 2 2 2
5 5 2 7 7 7
6 6 2 7 7 7
7 7 3 8 8 8
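If a pure NumPy route is wanted for arbitrary labels as well, one possibility that is not part of the answer above is np.unique with return_inverse=True. This is only a sketch on made-up data.

```python
import numpy as np

groups = np.array([10, 30, 10, 20, 30])  # hypothetical arbitrary labels
values = np.array([4, 2, 7])             # one value per sorted distinct label

# inverse[i] is the position of groups[i] among the sorted distinct labels,
# which turns arbitrary labels back into 0..N positions.
labels, inverse = np.unique(groups, return_inverse=True)
result = values[inverse]
print(result.tolist())   # [4, 7, 4, 2, 7]
```

This reduces the general case to the indexing approach of Option 1, at the cost of the sort that np.unique performs.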