Create a new numpy array from the elements of another numpy array

Question

I've been struggling to create a sub-array from specific elements of a first array.

The first array comes from a txt file with two lines:

L1,(B:A:3:1),(A:C:5:2),(C:D:2:3)
L2,(C:E:2:0.5),(E:F:10:1),(F:D:0.5:0.5)

Code

import pandas as pd

toto = pd.read_csv("bd_2_test.txt", delimiter=",", header=None, names=["Line", "1rst", "2nd", "3rd"])
matrix_toto = toto.values
matrix_toto

Result

    Line    1rst    2nd 3rd
0   L1  (B:A:3:1)   (A:C:5:2)   (C:D:2:3)
1   L2  (C:E:2:0.5) (E:F:10:1)  (F:D:0.5:0.5)

How can I transform it into an array like this one?

array([['B', 'A', 3, 1],
       ['A', 'C', 5, 2],
       ['C', 'D', 2, 3],
       ['C', 'E', 2, 0.5],
       ['E', 'F', 10, 1],
       ['F', 'D', 0.5, 0.5]], dtype=object)

I tried vectorizing, but I only get the second character of each element of the array.

np.vectorize(lambda s: s[1])(matrix_toto)

array([['1', 'B', 'A', 'C'],
       ['2', 'C', 'E', 'F']], dtype='<U1')

Answer 1

I am not sure that what you are trying is the best solution to your real problem. But, staying as close as possible to your initial attempt:

import re
import numpy as np

# tr turns a string "(x:y:z:t)" into an array ["x", "y", "z", "t"].
# dtype=object matters: otherwise np.vectorize takes the output dtype from the
# first cell ('<U1' here) and truncates longer values such as "0.5" or "10".
# Version 1, with a regular expression (raw string, so the backslashes reach re unchanged)
tr = lambda s: np.array(re.findall(r'\(([^:]*):([^:]*):([^:]*):([^:]*)\)', s)[0], dtype=object)
# Version 2, without re (and maybe the best one; I've benchmarked it).
# s[1:-1] drops the first and last characters (the parentheses), and
# .split(':') returns the list of substrings separated by colons.
tr = lambda s: np.array(s[1:-1].split(':'), dtype=object)
# trv is the vectorization of tr.
# We need the signature because tr itself returns an array.
trv = np.vectorize(tr, signature='()->(n)')
result = trv(matrix_toto[:, 1:].flatten())

Note that matrix_toto[:,1:] is your matrix without the first column (the line name), and matrix_toto[:,1:].flatten() flattens it, so we get one entry per cell of your initial array (excluding the line name). Each of those cells is a string "(x:y:z:t)", which trv transforms into an array.

Result is

array([['B', 'A', '3', '1'],
       ['A', 'C', '5', '2'],
       ['C', 'D', '2', '3'],
       ['C', 'E', '2', '0.5'],
       ['E', 'F', '10', '1'],
       ['F', 'D', '0.5', '0.5']], dtype=object)
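
The parsed array holds every field as a string, while the array asked for in the question has numbers in its last two columns. A minimal sketch of that extra step, assuming the third and fourth fields are always numeric and that whole numbers should stay integers. The helper to_number and the small stand-in array are hypothetical, used only to keep the example self-contained.

```python
import numpy as np

def to_number(text):
    """Hypothetical helper: '3' -> 3 (int), '0.5' -> 0.5 (float)."""
    try:
        return int(text)
    except ValueError:
        return float(text)

# Stand-in for a few rows of the parsed string array
result = np.array([['B', 'A', '3', '1'],
                   ['C', 'E', '2', '0.5'],
                   ['E', 'F', '10', '1']], dtype=object)

# Rebuild each row, converting only the last two fields.
# dtype=object lets strings, ints and floats live in the same array.
converted = np.array(
    [[a, b, to_number(c), to_number(d)] for a, b, c, d in result],
    dtype=object,
)
```

Building the rows in a list comprehension, rather than assigning whole columns, avoids numpy silently promoting a mixed int/float column to floats.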

Obviously you need only one of the two tr=... lines. I left both in the code because I don't know the exact specification of those (x:y:z:t) patterns, so you may need to adapt one of the two versions.
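
If np.vectorize is not a requirement, the same parsing can be written as a plain list comprehension; the numpy documentation itself describes np.vectorize as essentially a for loop provided for convenience. This is a sketch of that alternative, not the method used above, and the hard-coded matrix_toto below is a stand-in for the array read from the file.

```python
import numpy as np

# Stand-in for toto.values from the question
matrix_toto = np.array([
    ['L1', '(B:A:3:1)', '(A:C:5:2)', '(C:D:2:3)'],
    ['L2', '(C:E:2:0.5)', '(E:F:10:1)', '(F:D:0.5:0.5)'],
], dtype=object)

# Drop the line-name column and flatten to one string per cell
cells = matrix_toto[:, 1:].ravel()

# Strip the parentheses and split each cell on ':'
rows = [cell[1:-1].split(':') for cell in cells]

# numpy picks a string width wide enough for the longest field
parsed = np.array(rows)
```

Because the array is built from the complete list of rows, no field is cut to the width of the first cell.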

Solution 1: 1 accepted, 2022-10-02 16:26:27
