Create a new numpy array from elements of another numpy array
I've been struggling to create a sub-array from specific elements of a first array.
The first array comes from a txt file with two lines:
L1,(B:A:3:1),(A:C:5:2),(C:D:2:3)
L2,(C:E:2:0.5),(E:F:10:1),(F:D:0.5:0.5)
Code:
import pandas as pd
toto = pd.read_csv("bd_2_test.txt",delimiter=",",header=None,names=["Line","1rst","2nd","3rd"])
matrix_toto = toto.values
matrix_toto
Result:
Line 1rst 2nd 3rd
0 L1 (B:A:3:1) (A:C:5:2) (C:D:2:3)
1 L2 (C:E:2:0.5) (E:F:10:1) (F:D:0.5:0.5)
How can I transform it into an array like this one?
array([['B', 'A', 3, 1],
['A', 'C', 5, 2],
['C', 'D', 2, 3],
['C', 'E', 2, 0.5],
['E', 'F', 10, 1],
['F', 'D', 0.5, 0.5]], dtype=object)
I tried vectorizing, but I only get the second character of each element of the array.
np.vectorize(lambda s: s[1])(matrix_toto)
array([['1', 'B', 'A', 'C'],
['2', 'C', 'E', 'F']], dtype='<U1')
I am not sure that what you are trying is the optimal solution to your real problem. But, staying as close as possible to your initial attempt:
# We need a regular expression to transform a string "(x:y:z:t)" into an array ["x","y","z","t"]
import re
import numpy as np
# tr does that transformation (raw string, so that \( is not read as a string escape)
tr=lambda s: np.array(re.findall(r'\(([^:]*):([^:]*):([^:]*):([^:]*)\)', s)[0])
# Alternative version, without re (and maybe the best one; I've benchmarked it)
# s[1:-1] removes the first and last characters (the parentheses),
# and .split(':') returns the list of substrings separated by colons
tr=lambda s: s[1:-1].split(':')
# trv is the vectorization of tr
# We need the signature, because the return value is itself an array
trv=np.vectorize(tr, signature='()->(n)')
result=trv(matrix_toto[:,1:].flatten())
Note that matrix_toto[:,1:] is your matrix without the 1st column (the line name). And matrix_toto[:,1:].flatten() flattens it, so we have 1 entry per cell of your initial array (excluding the line name). Each of those cells is a string "(x:y:z:t)", which trv transforms into an array.
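To see what each of these steps does on its own, here is a small sketch. It hard-codes the two rows from the question instead of reading the file, so matrix_toto below is only an assumed stand-in for toto.values:

```python
import numpy as np

# Assumed stand-in for toto.values, hard-coded from the question's two-line file
matrix_toto = np.array([
    ["L1", "(B:A:3:1)", "(A:C:5:2)", "(C:D:2:3)"],
    ["L2", "(C:E:2:0.5)", "(E:F:10:1)", "(F:D:0.5:0.5)"],
], dtype=object)

# Drop the line-name column, then flatten the remaining 2x3 block to 1-D
cells = matrix_toto[:, 1:].flatten()
print(cells.shape)   # (6,)
print(cells[0])      # (B:A:3:1)

# Strip the parentheses and split on colons, as the second tr does
fields = cells[0][1:-1].split(':')
print(fields)        # ['B', 'A', '3', '1']
```

The fields are still strings at this point; nothing has converted '3' or '1' to numbers.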
The result is:
array([['B', 'A', '3', '1'],
['A', 'C', '5', '2'],
['C', 'D', '2', '3'],
['C', 'E', '2', '0'],
['E', 'F', '1', '1'],
['F', 'D', '0', '0']], dtype='<U1')
Obviously you need only one of the 2 tr=... lines. I left both in the code because I don't know the exact specification of those (x:y:z:t) patterns, so you may need to adapt one of the 2 versions.
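One thing to watch in the result shown above: its dtype is '<U1', so every value longer than one character was cut to its first character ('0.5' became '0' and '10' became '1'). As far as I can tell, np.vectorize with a signature and no otypes takes the output dtype from the first call, which here returns only one-character strings. A minimal sketch that avoids this builds the rows with a plain list comprehension and converts the last two fields to numbers. The helper names to_number and parse_cell, and the hard-coded matrix_toto, are assumptions for illustration and not part of the original code:

```python
import numpy as np

# Assumed stand-in for toto.values, hard-coded from the question's two-line file
matrix_toto = np.array([
    ["L1", "(B:A:3:1)", "(A:C:5:2)", "(C:D:2:3)"],
    ["L2", "(C:E:2:0.5)", "(E:F:10:1)", "(F:D:0.5:0.5)"],
], dtype=object)

def to_number(text):
    """Hypothetical helper: return an int when the value is whole, else a float."""
    value = float(text)
    return int(value) if value.is_integer() else value

def parse_cell(s):
    """Hypothetical helper: turn "(x:y:z:t)" into [x, y, z, t] with z and t numeric."""
    a, b, c, d = s[1:-1].split(':')
    return [a, b, to_number(c), to_number(d)]

# dtype=object keeps strings and numbers side by side without truncation
result = np.array([parse_cell(s) for s in matrix_toto[:, 1:].flatten()], dtype=object)
print(result)
```

This assumes every cell has exactly four colon-separated fields and that the last two are always numeric; a cell that breaks either assumption raises a ValueError instead of being silently truncated.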