Numpy and Pandas: Repeat Values by Bin

Question

I have a DataFrame or NumPy array with ascending group numbers, and I would like to assign a list of values (one value per unique group) so that each value is repeated for every row of its group.

ID - Group
0  -  0
1  -  0
2  -  1
3  -  1
4  -  1
5  -  2
6  -  2
7  -  3

Values to assign:

[4, 2, 7, 8] # 4 maps to group 0, 2 maps to group 1 etc

Output:

ID - Group  - Val
0  -  0     -  4
1  -  0     -  4
2  -  1     -  2
3  -  1     -  2
4  -  1     -  2
5  -  2     -  7
6  -  2     -  7
7  -  3     -  8

I'd appreciate any suggestions, preferably without looping if there are elegant ways or native functions to solve this (I'm looking for both a NumPy and a Pandas solution).

Answer 1

Setup:

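import numpy as np
import pandas as pd
df = pd.DataFrame({'ID': range(8), 'Group': [0, 0, 1, 1, 1, 2, 2, 3]})  # data from the question
a = np.array([4, 2, 7, 8])  # one value per group
v = df.Group.values         # group labels as a NumPy array
dct = {}                    # collects the result of each option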

Option 1
Using numpy indexing. (This solution assumes your group labels are consecutive integers starting at 0, so that each label is a valid index into a):

dct['numpy_indexing'] = a[v]
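
To make the indexing step concrete, here is a minimal standalone sketch. The names `values` and `groups` are illustrative stand-ins for `a` and `v` above, not names from the original post. Indexing an array with an integer array picks one element per index, so each row receives the value stored at the position given by its group label.

```python
import numpy as np

values = np.array([4, 2, 7, 8])               # one value per group (same as `a`)
groups = np.array([0, 0, 1, 1, 1, 2, 2, 3])   # group label per row (same as `v`)

# Integer-array ("fancy") indexing: result[i] == values[groups[i]]
result = values[groups]
print(result)  # [4 4 2 2 2 7 7 8]
```

Because each label is used directly as a position, a label outside 0..len(values)-1 would raise an IndexError, and a negative label would silently index from the end.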

Option 2
Using np.repeat. (This solution assumes your groups are not interlaced, i.e. all rows of a group are contiguous and the groups appear in ascending order):

dct['numpy_repeat'] = np.repeat(a, np.bincount(v))
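
The following sketch breaks that one-liner into its two steps and shows what goes wrong when groups are interlaced. The variable names and the interlaced counter-example are illustrative assumptions, not part of the original answer.

```python
import numpy as np

values = np.array([4, 2, 7, 8])
groups = np.array([0, 0, 1, 1, 1, 2, 2, 3])

# Step 1: count how many rows carry each label 0..max(groups)
counts = np.bincount(groups)
print(counts)    # [2 3 2 1]

# Step 2: repeat values[k] exactly counts[k] times, in label order
repeated = np.repeat(values, counts)
print(repeated)  # [4 4 2 2 2 7 7 8]

# Counter-example: with interlaced labels the counts are still right,
# but the output is ordered by label, not by row.
interlaced = np.array([0, 1, 0, 1])
misaligned = np.repeat(np.array([4, 2]), np.bincount(interlaced))
print(misaligned)  # [4 4 2 2], whereas the per-row answer is [4 2 4 2]
```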

Option 3
Using map. This approach will be slower than the others, but is a bit more flexible, as it allows for interlaced groups and non-linear groups (labels that are not consecutive integers starting at 0):

d = dict(zip(np.unique(v), a))

dct['pandas_map'] = df.Group.map(d)
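
As a sketch of that flexibility, the example below uses hypothetical labels (10, 20, 30, 40) that are both non-consecutive and interlaced; the frame `df_demo` and the label values are assumptions made up for illustration. It also shows a pure NumPy variant based on `np.unique(..., return_inverse=True)`, which is not in the original answer but should give the same result under the same pairing rule (values are matched to labels in sorted-label order).

```python
import numpy as np
import pandas as pd

values = np.array([4, 2, 7, 8])
# Hypothetical labels: not 0..N-1 and not contiguous
df_demo = pd.DataFrame({"Group": [10, 30, 10, 20, 40, 30]})
labels = df_demo["Group"].to_numpy()

# np.unique returns the sorted unique labels, so 10->4, 20->2, 30->7, 40->8
mapping = dict(zip(np.unique(labels), values))
df_demo["Val"] = df_demo["Group"].map(mapping)
print(df_demo["Val"].tolist())  # [4, 7, 4, 2, 8, 7]

# NumPy-only variant: `inverse` holds, for each row, the position of its
# label within the sorted unique labels, which can then index `values`.
_, inverse = np.unique(labels, return_inverse=True)
val_np = values[inverse]
print(val_np.tolist())          # [4, 7, 4, 2, 8, 7]
```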

Output

df.assign(**dct)

   ID  Group  numpy_indexing  numpy_repeat  pandas_map
0   0      0               4             4           4
1   1      0               4             4           4
2   2      1               2             2           2
3   3      1               2             2           2
4   4      1               2             2           2
5   5      2               7             7           7
6   6      2               7             7           7
7   7      3               8             8           8

Answer 1: 4, accepted, 2018-10-31 16:06:05
