
Subtract two np.ndarray with different shapes

I have two ndarrays of different shapes. The bigger one has a shape of (22470, 2) and looks like this:

df1

array([[-0.39911392,  0.46759156],
       [ 0.28343494,  0.88479157],
       [-0.0114085 , -1.23768313],
       ...,
       [-0.35930586,  0.54784439],
       [-0.37994004,  0.51332771],
       [-0.36309593,  0.49318486]])

The smaller one, which holds the outliers of the df1 array, has a shape of (675, 2) and looks like this:

df2
array([[-0.04450032,  0.31053589],
       [-0.4320086 ,  0.14815988],
       [-0.07948631, -1.32638555],
       ...,
       [-0.32619787,  0.34910699],
       [-0.50870225, -0.230849  ],
       [-0.43532727,  0.49763502]])

So I tried to subtract one from the other to get a new array that contains everything in df1 except the rows in df2, but it gives me this error:

ValueError: operands could not be broadcast together with shapes (22470,2) (675,2)

How can I do it in Python?

"Subtracting" two arrays does not perform a set operation on them; it simply subtracts each value of one from the corresponding value of the other (i.e. 4 - 3 => 1).
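To make that concrete, here is a minimal sketch with small made-up arrays (the names a, b and c are only for illustration). It shows that the minus operator works element by element, and that the same kind of ValueError appears as soon as the row counts differ.

```python
import numpy as np

a = np.array([[4.0, 5.0], [6.0, 7.0], [8.0, 9.0]])  # shape (3, 2)
b = np.array([[3.0, 1.0], [1.0, 1.0], [2.0, 2.0]])  # shape (3, 2)

# Same shapes: each element of b is subtracted from the matching element of a.
diff = a - b

c = np.array([[1.0, 1.0], [2.0, 2.0]])  # shape (2, 2), fewer rows than a

# Different row counts: NumPy cannot line the rows up, so it raises ValueError.
try:
    a - c
    broadcast_failed = False
    message = ""
except ValueError as exc:
    broadcast_failed = True
    message = str(exc)

print(diff)
print(broadcast_failed, message)
```

Nothing here removes rows; arithmetic on arrays never changes how many rows you have.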

What you want to do is basically a set operation. There is no simple, straightforward way to do that with the data as you have presented it (but that doesn't mean it can't be done). Comparing floating-point numbers for exact equality is a bad idea. You will find it much more useful to collect an array of the indices of the outliers rather than their values. Then you can index your array as in this question.

So this would then be something like:

import numpy as np

df1 = np.array([[1.234, 2.345], [3.3452, 2.456], [5.234, 7.453]])

# This is an array of row indices into df1, not float values.
df2 = np.array([1])

# Start with every row marked as kept, then unmark the outlier rows.
keep = np.ones(len(df1), dtype=bool)
keep[df2] = False
newdf = df1[keep]

# newdf: [[1.234, 2.345], [5.234, 7.453]]
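If you only have the outlier values and cannot go back to collect indices, one possible fallback is to recover the indices by matching rows within a tolerance. The sketch below is an assumption on my part, not something from the question: it assumes each outlier row is a (nearly) exact copy of a row in df1, and outlier_rows is a made-up name. It uses np.isclose with its default tolerances instead of ==.

```python
import numpy as np

df1 = np.array([[1.234, 2.345], [3.3452, 2.456], [5.234, 7.453]])

# Hypothetical input: outlier values (not indices), copied from df1.
outlier_rows = np.array([[3.3452, 2.456]])

# Compare every row of df1 with every outlier row.
# Shapes (3, 1, 2) and (1, 1, 2) broadcast to (3, 1, 2).
close = np.isclose(df1[:, None, :], outlier_rows[None, :, :])

# A row of df1 is an outlier if all of its columns match some outlier row.
is_outlier = close.all(axis=2).any(axis=1)

outlier_idx = np.flatnonzero(is_outlier)  # indices of the matched rows
newdf = df1[~is_outlier]                  # rows of df1 that are not outliers

# np.delete with the recovered indices gives the same rows.
same = np.delete(df1, outlier_idx, axis=0)

print(outlier_idx)
print(newdf)
```

For the shapes in the question the intermediate boolean array has 22470 x 675 x 2 elements, which should fit in memory, but it grows with the product of the two row counts. Rows of df1 that merely happen to be close to an outlier are removed too, which is one more reason to prefer keeping indices from the start.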


 