
numpy - how to add a value to every element in the first column of an array?

I have an array like this:

array([('6506', 4.6725971801473496e-25, 0.99999999995088695),
       ('6601', 2.2452745388799898e-27, 0.99999999995270605),
       ('21801', 1.9849650921836601e-31, 0.99999999997999001), ...,
       ('45164194', 1.0413482803123399e-24, 0.99999999997453404),
       ('45164198', 1.09470356446595e-24, 0.99999999997635303),
       ('45164519', 3.7521365799080699e-24, 0.99999999997453404)], 
      dtype=[('pos', '|S100'), ('par1', '<f8'), ('par2', '<f8')])

I want to turn it into this, adding the prefix '2R:' to each value in the first column:

array([('2R:6506', 4.6725971801473496e-25, 0.99999999995088695),
       ('2R:6601', 2.2452745388799898e-27, 0.99999999995270605),
       ('2R:21801', 1.9849650921836601e-31, 0.99999999997999001), ...,
       ('2R:45164194', 1.0413482803123399e-24, 0.99999999997453404),
       ('2R:45164198', 1.09470356446595e-24, 0.99999999997635303),
       ('2R:45164519', 3.7521365799080699e-24, 0.99999999997453404)], 
      dtype=[('pos', '|S100'), ('par1', '<f8'), ('par2', '<f8')])

I looked into nditer, but I want to support earlier versions of numpy. I have also read that one should avoid iteration.

Using numpy.core.defchararray.add (also available as numpy.char.add):

>>> from numpy import array
>>> from numpy.core.defchararray import add
>>>
>>> xs = array([('6506', 4.6725971801473496e-25, 0.99999999995088695),
...             ('6601', 2.2452745388799898e-27, 0.99999999995270605),
...             ('21801', 1.9849650921836601e-31, 0.99999999997999001),
...             ('45164194', 1.0413482803123399e-24, 0.99999999997453404),
...             ('45164198', 1.09470356446595e-24, 0.99999999997635303),
...             ('45164519', 3.7521365799080699e-24, 0.99999999997453404)],
...            dtype=[('pos', '|S100'), ('par1', '<f8'), ('par2', '<f8')])
>>> xs['pos'] = add('2R:', xs['pos'])
>>> xs
array([('2R:6506', 4.67259718014735e-25, 0.999999999950887),
       ('2R:6601', 2.24527453887999e-27, 0.999999999952706),
       ('2R:21801', 1.98496509218366e-31, 0.99999999997999),
       ('2R:45164194', 1.04134828031234e-24, 0.999999999974534),
       ('2R:45164198', 1.09470356446595e-24, 0.999999999976353),
       ('2R:45164519', 3.75213657990807e-24, 0.999999999974534)],
      dtype=[('pos', 'S100'), ('par1', '<f8'), ('par2', '<f8')])
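The session above appears to come from Python 2, where the elements of an 'S100' field are plain strings. On Python 3 they are bytes, so the prefix most likely has to be bytes as well. A minimal sketch under that assumption, using numpy.char.add and a shortened copy of the sample array:

```python
import numpy as np

# Shortened sample of the array from the question (byte-string 'pos' field).
xs = np.array([(b'6506', 4.6725971801473496e-25, 0.99999999995088695),
               (b'6601', 2.2452745388799898e-27, 0.99999999995270605),
               (b'21801', 1.9849650921836601e-31, 0.99999999997999001)],
              dtype=[('pos', 'S100'), ('par1', '<f8'), ('par2', '<f8')])

# The prefix is a bytes literal so that it matches the 'S' dtype of the field.
xs['pos'] = np.char.add(b'2R:', xs['pos'])

print(xs['pos'].tolist())
```

Only the 'pos' field is rewritten; the two float fields are left untouched.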

A simple (albeit perhaps not optimal) solution is just:

import numpy as np

a = np.array([('6506', 4.6725971801473496e-25, 0.99999999995088695),
       ('6601', 2.2452745388799898e-27, 0.99999999995270605),
       ('21801', 1.9849650921836601e-31, 0.99999999997999001),
       ('45164194', 1.0413482803123399e-24, 0.99999999997453404),
       ('45164198', 1.09470356446595e-24, 0.99999999997635303),
       ('45164519', 3.7521365799080699e-24, 0.99999999997453404)],
      dtype=[('pos', '|S100'), ('par1', '<f8'), ('par2', '<f8')])


a['pos'] = [''.join(('2R:',x)) for x in a['pos']]

In [11]: a
Out[11]:
array([('2R:6506', 4.67259718014735e-25, 0.999999999950887),
       ('2R:6601', 2.24527453887999e-27, 0.999999999952706),
       ('2R:21801', 1.98496509218366e-31, 0.99999999997999),
       ('2R:45164194', 1.04134828031234e-24, 0.999999999974534),
       ('2R:45164198', 1.09470356446595e-24, 0.999999999976353),
       ('2R:45164519', 3.75213657990807e-24, 0.999999999974534)],
      dtype=[('pos', 'S100'), ('par1', '<f8'), ('par2', '<f8')])

While I like @falsetru's answer for using core numpy routines, surprisingly, a list comprehension seems a bit faster:

In [19]: a = np.empty(20000, dtype=[('pos', 'S100'), ('par1', '<f8'), ('par2', '<f8')])

In [20]: %timeit a['pos'] = [''.join(('2R:',x)) for x in a['pos']]
100 loops, best of 3: 11.1 ms per loop

In [21]: %timeit a['pos'] = add('2R:', a['pos'])
100 loops, best of 3: 15.7 ms per loop

Definitely benchmark your own use case and hardware to see which makes more sense for your actual application, though. One of the things I've learned is that in certain situations, basic Python constructs can outperform numpy built-ins, depending on the task at hand.
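To repeat such a comparison outside IPython, something like the following sketch could be used. It assumes Python 3 (hence the bytes prefix for the 'S100' field); the helper names with_join, with_plus and with_char_add and the placeholder value b'6506' are made up for illustration. The input is read from a separate copy so that repeated runs do not keep lengthening the strings.

```python
import timeit

import numpy as np

# Assumption: Python 3, so the 'S100' field holds bytes and the prefix is bytes.
a = np.zeros(20000, dtype=[('pos', 'S100'), ('par1', '<f8'), ('par2', '<f8')])
a['pos'] = b'6506'       # placeholder value instead of uninitialised memory
src = a['pos'].copy()    # fixed input, so every run does the same work


def with_join():
    a['pos'] = [b''.join((b'2R:', x)) for x in src]


def with_plus():
    a['pos'] = [b'2R:' + x for x in src]


def with_char_add():
    a['pos'] = np.char.add(b'2R:', src)


timings = {}
for func in (with_join, with_plus, with_char_add):
    # Best of 3 repeats, 5 calls per repeat, reported per call.
    timings[func.__name__] = min(timeit.repeat(func, number=5, repeat=3)) / 5

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```

The absolute numbers, and possibly even the ranking, will differ between machines and between numpy and Python versions.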

Another slightly faster solution is to use a list comprehension with the + operator. I do not understand why it is faster, but it is certainly simple and elegant.

a['pos'] = ["2R:" + x for x in a['pos']]

Timings:

%timeit a['pos'] = ["2R:" + x for x in a['pos']]
8.07 ms ± 64.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit a['pos'] = [''.join(('2R:',x)) for x in a['pos']]
9.53 ms ± 391 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit a['pos'] = add('2R:', a['pos'])
14.2 ms ± 337 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

PS: I created the array a using a slightly different definition:

a = np.empty(20000, dtype=[('pos', 'U5'), ('par1', '<f8'), ('par2', '<f8')])

because if I use a byte-string type Sxxx for pos, the concatenation raises a TypeError for me (presumably because on Python 3 the elements of such a field are bytes, while the prefix "2R:" is a str).
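The type error, and a caveat about narrow field widths such as 'U5', can be reproduced with a small sketch. It assumes Python 3 and uses plain one-dimensional arrays with made-up names (s, narrow, wide) rather than the structured array, to keep the example short:

```python
import numpy as np

# Byte-string array: on Python 3 its elements are bytes, so a str prefix fails.
s = np.array([b'6506', b'45164194'], dtype='S100')
try:
    ["2R:" + x for x in s]
    raised = False
except TypeError:
    raised = True
print("str prefix on an 'S' array raises TypeError:", raised)

# A bytes prefix matches the 'S' dtype.
s_prefixed = np.array([b"2R:" + x for x in s], dtype='S100')

# A unicode array works with a str prefix, but the width must leave room for
# the prefix: values longer than the field are cut silently on assignment.
narrow = np.array(['6506', '21801'], dtype='U5')
narrow[:] = ["2R:" + x for x in narrow]

wide = np.array(['6506', '21801'], dtype='U100')
wide[:] = ["2R:" + x for x in wide]

print(narrow.tolist())
print(wide.tolist())
```

With np.empty the 'U5' field mostly contains empty strings, so the timings are unaffected, but real position values would need a wider field.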

