Numpy - Create a summary df from an array

Question

I have a 2D array of shape (20, 10) with values ranging from 0 to 12 (created from a dataframe).

arr = np.random.choice(np.arange(0, 13), size=(20,10))
array([[0,  9,  9,  7,  6,  2,  6,  4,  4,  3],
       [0,  2,  1,  7,  1,  0,  2,  6,  6,  2],
       [7,  3,  9,  8,  9,  7,  1, 10,  4,  2],
       [0,  7,  0,  1,  4,  5,  8,  4,  2,  2],
       [5,  2, 12,  3, 12,  2,  7, 12,  4, 12],
       [0, 11,  0, 10,  7,  4, 12, 11, 11,  4],
       [0,  9,  9,  8,  5, 11,  7,  6, 10,  7],
       [0,  9,  0, 10, 11,  1,  5, 10,  8, 10],
       [3, 11,  4,  7,  7,  8, 10, 11,  5, 12],
       [0,  5,  0,  8,  1,  5,  1, 11,  9,  1],
       [0,  8,  6, 12, 11,  1,  4, 11,  4,  1],
       [2, 10,  5,  5,  7,  9, 11,  6, 12, 10],
       [9,  8, 11,  4, 10,  1, 10, 12,  0,  3],
       [0,  7, 10,  8,  2, 10,  5,  7,  9,  6],
       [0,  9,  6,  9,  1, 12,  4,  1,  8,  2],
       [8, 12, 10, 12,  8,  2,  3,  0, 11,  4],
       [6,  7, 11, 12,  8,  7,  1,  9,  9,  8],
       [0,  4,  0,  8,  9,  7,  1,  1,  3,  5],
       [0,  8,  1, 11,  2, 12,  6, 11, 12, 10],
       [0,  7,  3,  8,  3,  3,  7,  1,  9,  9]])

The desired output is a dataframe with rows and columns labeled 0 to 12. Each cell should hold the number of times the row value is immediately followed by the column value, counted across all rows of the array.

    0   1   2   3   4   5   6   7   8   9   10  11  12
0   25  20  30                                      
1                                                   
2                                                   
3                                                   
4   2   2   5           4                           
5                                                   
6                                                   
7                                                   
8                                                   
9                                                   
10                                                  
11                                                  
12

(Not the true output)

For example, in this array the change from 0 to 9 occurs 4 times, and the change from 10 to 12 occurs 2 times.
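A single transition count like these can be checked by comparing each value with its right-hand neighbour. This is a minimal sketch on a small made-up array; `count_transition` is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

def count_transition(arr, src, dst):
    """Count how often `src` is immediately followed by `dst` within a row."""
    left = arr[:, :-1]   # every value except the last in each row
    right = arr[:, 1:]   # the value directly to the right of `left`
    return int(((left == src) & (right == dst)).sum())

# Small illustrative array (not the data from the question)
demo = np.array([[0, 9, 9, 7],
                 [0, 9, 10, 12],
                 [10, 12, 0, 3]])

print(count_transition(demo, 0, 9))    # occurrences of 0 followed by 9
print(count_transition(demo, 10, 12))  # occurrences of 10 followed by 12
```

Because the slices are taken per row, the last value of one row is never compared with the first value of the next.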

Answer 1

If you use a Counter from the collections library, you can solve it like this:

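import numpy as np
from collections import Counter

max_number = 12

# Sample data: 20 rows of 10 values drawn from 0..max_number
array = np.random.choice(np.arange(0, max_number + 1), size=(20, 10))

# Pair each value with its right-hand neighbour, row by row, so that the
# last value of one row is never paired with the first value of the next
pairs = zip(array[:, :-1].ravel().tolist(), array[:, 1:].ravel().tolist())

counter = Counter(pairs)

# One row and one column per value 0..max_number
result = np.zeros(shape=(max_number + 1, max_number + 1), dtype=int)

for i in range(max_number + 1):
    for j in range(max_number + 1):
        result[i, j] = counter[(i, j)]

result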
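Since the question asks for a dataframe, the Counter can also be turned into a labeled table. A minimal sketch, assuming pandas is available; the small `demo` array and the name `summary` are illustrative only.

```python
from collections import Counter

import numpy as np
import pandas as pd

max_number = 3
demo = np.array([[0, 1, 1, 3],
                 [2, 0, 1, 3]])

# Count (from, to) pairs of horizontally adjacent values
counter = Counter(zip(demo[:, :-1].ravel().tolist(),
                      demo[:, 1:].ravel().tolist()))

labels = range(max_number + 1)
# Rows are the "from" value, columns the "to" value; a Counter returns 0
# for pairs it has never seen, so absent transitions become 0
summary = pd.DataFrame(
    [[counter[(i, j)] for j in labels] for i in labels],
    index=labels, columns=labels,
)
print(summary)
```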

Answer 2

This is my solution. Can it be improved?

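import pandas as pd

# Row and column labels: every value from the smallest to the largest in arr
shape_ = np.arange(arr.min(), arr.max() + 1)
df = pd.DataFrame(0, index=shape_, columns=shape_)

for row in arr:
    for i in range(len(row) - 1):
        # Rows are the "from" value, columns the "to" value; .loc avoids
        # chained assignment, so no transpose is needed afterwards
        df.loc[row[i], row[i + 1]] += 1

df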
>>
    0   1   2   3   4   5   6   7   8   9   10  11  12
0   0   1   2   1   1   1   0   3   4   4   2   2   0
1   1   1   0   1   2   2   0   1   1   2   2   2   1
2   0   1   1   1   0   0   2   1   0   0   2   0   2
3   1   0   0   1   0   1   0   1   1   1   0   1   1
4   1   2   2   1   1   1   0   1   0   0   1   1   2
5   1   1   1   0   0   1   0   2   1   0   1   1   1
6   0   0   2   0   1   0   1   1   0   1   1   1   2
7   1   5   0   2   1   0   2   1   1   2   1   1   1
8   0   2   3   1   1   1   1   1   0   2   2   1   1
9   1   2   0   0   0   0   2   3   4   4   0   1   0
10  0   1   0   0   1   2   0   2   2   0   0   2   2
11  1   2   1   0   5   1   1   1   0   1   0   1   2
12  1   0   1   1   2   0   1   0   2   0   3   2   0
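On whether it can be improved: one option is to drop the Python loops and let NumPy accumulate the counts. The sketch below assumes the values are non-negative integers and uses `np.add.at`, which accumulates correctly when the same index pair appears more than once. `transition_counts` is a hypothetical helper name, and the random data is only a stand-in for the array in the question.

```python
import numpy as np
import pandas as pd

def transition_counts(arr):
    """Return a DataFrame whose cell [a, b] counts how often a is followed by b."""
    n = int(arr.max()) + 1                 # labels run from 0 to arr.max()
    src = arr[:, :-1].ravel()              # "from" values
    dst = arr[:, 1:].ravel()               # "to" values, one step to the right
    counts = np.zeros((n, n), dtype=int)
    np.add.at(counts, (src, dst), 1)       # unbuffered add, so repeats accumulate
    return pd.DataFrame(counts, index=range(n), columns=range(n))

rng = np.random.default_rng(0)
arr = rng.integers(0, 13, size=(20, 10))   # stand-in data, values 0..12
df = transition_counts(arr)
print(df)
```

Unlike the loop version, the labels here always start at 0 rather than at `arr.min()`. Each row of 10 values contributes 9 transitions, so the table entries should sum to 180 for a 20 by 10 array.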
