Function that checks if there is a column in a 2D array in which all values are equal?
I have a 2D array that holds 14 days of temperature data, with one reading for every hour of each day (the matrix is 14x24 = 336 data points). I would like to know if there is a function/command that checks whether the 2D array contains a column in which all values are equal?
Thanks!
An alternative could be to use the reduce method of the np.logical_and ufunc, using the example array from Mark Setchell's answer.
import numpy as np
arr = np.array([[10.92206418,  9.00678018,  5.        ,  6.83022007, 16.18869687],
                [14.98451533,  2.04903653,  5.        , 12.49089931,  7.93300109],
                [ 0.63397121,  5.27492337,  5.        , 10.70274734, 18.68862265],
                [ 7.31692528, 17.98960002,  5.        , 13.94986875,  3.83450356],
                [ 3.20441573, 11.31828108,  5.        , 12.7831887 ,  6.69083798],
                [10.52480423, 14.99047775,  5.        , 12.18751519, 19.43634789],
                [15.95100606, 17.74638291,  5.        ,  8.06684746,  8.06391555],
                [14.91391738, 12.78786562,  5.        ,  7.57760045, 19.73240734],
                [ 2.90594641, 15.00832554,  5.        ,  2.25471882,  2.3352564 ],
                [ 7.05680473, 10.68381728,  5.        ,  8.9835386 ,  5.2305576 ],
                [ 1.32183032,  3.5445554 ,  5.        , 15.68051617, 13.08684098],
                [16.78607292, 12.07334951,  5.        , 16.97163501, 11.05617307],
                [18.75894622, 13.1007517 ,  5.        ,  5.91909606,  1.02953968],
                [14.00847642, 13.69674151,  5.        , 13.49089591,  9.30763748]])
np.logical_and.reduce( arr[1:,:] == arr[:-1,:], axis = 0)
# array([False, False, True, False, False])
Breaking down the steps:
temp = arr[0] == arr[1:,:]
# array([[False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False],
# [False, False, True, False, False]])
np.logical_and.reduce( temp, axis = 0 ) # ANDs the values down each column.
# array([False, False, True, False, False])
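For what it's worth, the same reduction can also be written with the `all` method of the boolean array. The small array below is made up for illustration (it is not the example data above), and the names `via_reduce` and `via_all` are just placeholders.

```python
import numpy as np

# Tiny made-up array: only the middle column is constant.
small = np.array([[1.5, 5.0, 3.0],
                  [2.5, 5.0, 3.0],
                  [0.5, 5.0, 4.0]])

# Compare every later row against the first row, then AND down the columns.
via_reduce = np.logical_and.reduce(small[0] == small[1:, :], axis=0)

# Equivalent spelling: compare all rows against the first and use .all().
via_all = (small == small[0]).all(axis=0)

print(via_reduce)  # [False  True False]
print(via_all)     # [False  True False]
```

Both spellings rely on broadcasting the first row against the rest, so they should agree for any 2D array.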
Or, with floats, use isclose instead of == to also catch values that are very nearly equal.
np.logical_and.reduce( np.isclose(arr[1:,:],arr[:-1,:]), axis = 0)
# array([False, False, True, False, False])
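If you want the column numbers rather than a boolean mask, the idea could be wrapped in a small helper. This is only a sketch: the function name `constant_columns`, the `exact` flag, and the seeded demo data are my own assumptions, not part of the answer above.

```python
import numpy as np

def constant_columns(a, exact=False):
    """Return the indices of columns whose values are all (nearly) equal.

    Hypothetical helper: compares each row with the row below it, using
    == when exact is True and np.isclose (default tolerances) otherwise.
    """
    a = np.asarray(a)
    if exact:
        same = a[1:] == a[:-1]
    else:
        same = np.isclose(a[1:], a[:-1])
    # AND down each column, then convert the boolean mask to indices.
    return np.flatnonzero(np.logical_and.reduce(same, axis=0))

# Demo data: 14 rows, 5 columns, with column 2 forced to a constant.
rng = np.random.default_rng(0)
demo = rng.random((14, 5)) * 20
demo[:, 2] = 5

print(constant_columns(demo))
```

Note that comparing neighbouring rows with a tolerance lets small differences accumulate down a column. If that matters for your data, compare every row against the first row instead.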
You can try this, which compares the rows (days) pairwise and reports any two that are identical:
temp_date = [
[13, 19, 10, 18, 14, 12, 20, 12, 19, 17, 11, 12, 20, 11, 15, 19, 15, 11, 13, 19, 15, 12, 13, 13],
[14, 18, 11, 16, 11, 17, 10, 16, 18, 10, 14, 10, 17, 11, 20, 18, 18, 14, 14, 10, 17, 11, 15, 12],
[11, 20, 15, 19, 12, 18, 12, 19, 18, 15, 20, 20, 18, 10, 11, 13, 14, 12, 14, 12, 15, 13, 19, 14],
[19, 11, 12, 19, 20, 14, 13, 16, 20, 20, 11, 18, 12, 19, 13, 14, 13, 11, 17, 20, 18, 14, 11, 18],
[11, 14, 17, 14, 15, 18, 18, 13, 12, 16, 18, 11, 19, 20, 13, 16, 12, 20, 19, 15, 12, 15, 11, 15],
[14, 18, 11, 11, 16, 17, 10, 13, 15, 18, 14, 19, 10, 12, 19, 16, 18, 18, 12, 12, 12, 14, 18, 11], # this
[10, 17, 15, 15, 18, 20, 16, 15, 19, 12, 19, 10, 16, 18, 12, 14, 14, 17, 12, 13, 13, 18, 11, 10],
[15, 19, 11, 16, 15, 10, 11, 19, 20, 11, 10, 16, 11, 16, 18, 12, 20, 10, 20, 13, 14, 20, 19, 10],
[15, 18, 19, 15, 20, 20, 17, 10, 18, 17, 17, 14, 13, 12, 20, 20, 10, 17, 16, 17, 20, 15, 20, 11],
[17, 10, 19, 11, 19, 17, 19, 16, 13, 13, 10, 17, 12, 14, 19, 10, 13, 20, 19, 11, 12, 16, 16, 11],
[11, 11, 11, 17, 20, 20, 11, 10, 19, 18, 16, 15, 19, 16, 19, 19, 12, 15, 19, 19, 20, 11, 19, 17],
[19, 13, 11, 16, 16, 18, 12, 16, 20, 20, 13, 16, 19, 10, 11, 16, 14, 10, 17, 13, 14, 19, 19, 19],
[14, 18, 11, 11, 16, 17, 10, 13, 15, 18, 14, 19, 10, 12, 19, 16, 18, 18, 12, 12, 12, 14, 18, 11], # this
[11, 13, 10, 17, 18, 19, 17, 16, 16, 13, 12, 18, 15, 16, 13, 13, 14, 10, 14, 12, 13, 13, 14, 18],
]
# Compare every pair of rows once and report the pairs that match exactly.
for i in range(14):
    for j in range(i + 1, 14):
        line_a = temp_date[i]
        line_b = temp_date[j]
        if line_a == line_b:
            print('EQUAL!!! {} and {}'.format(i, j))
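The loop above compares rows. Since the question asks about columns, a plain-Python sketch for a list of lists might look like the following. The name `constant_column_indices` and the small `sample` data are made up for illustration.

```python
def constant_column_indices(rows):
    """Return indices of columns in which every row holds the same value.

    Hypothetical helper: zip(*rows) walks the data column by column, and a
    column is constant when its set of values has exactly one element.
    """
    return [j for j, col in enumerate(zip(*rows)) if len(set(col)) == 1]

# Made-up data: only column 1 holds the same value (18) in every row.
sample = [
    [14, 18, 7],
    [11, 18, 9],
    [19, 18, 7],
]

print(constant_column_indices(sample))  # [1]
```

This uses exact equality, so it suits integer readings like the ones above. For float data a tolerance-based comparison is usually safer.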
Generate some sample temperatures (a bit narrower than yours so we can see them), with one suspicious column:
import numpy as np

# Generate sample temperatures and 1 constant column
t = np.random.ranf((14,5)) * 20
t[:, 2] = 5
Looks like this:
array([[10.92206418, 9.00678018, 5. , 6.83022007, 16.18869687],
[14.98451533, 2.04903653, 5. , 12.49089931, 7.93300109],
[ 0.63397121, 5.27492337, 5. , 10.70274734, 18.68862265],
[ 7.31692528, 17.98960002, 5. , 13.94986875, 3.83450356],
[ 3.20441573, 11.31828108, 5. , 12.7831887 , 6.69083798],
[10.52480423, 14.99047775, 5. , 12.18751519, 19.43634789],
[15.95100606, 17.74638291, 5. , 8.06684746, 8.06391555],
[14.91391738, 12.78786562, 5. , 7.57760045, 19.73240734],
[ 2.90594641, 15.00832554, 5. , 2.25471882, 2.3352564 ],
[ 7.05680473, 10.68381728, 5. , 8.9835386 , 5.2305576 ],
[ 1.32183032, 3.5445554 , 5. , 15.68051617, 13.08684098],
[16.78607292, 12.07334951, 5. , 16.97163501, 11.05617307],
[18.75894622, 13.1007517 , 5. , 5.91909606, 1.02953968],
[14.00847642, 13.69674151, 5. , 13.49089591, 9.30763748]])
Now look at the standard deviation down the columns. It will be zero where there is no variation:
np.std(t, axis=0)
That looks like this:
array([5.97455208, 4.72880646, 0. , 3.97108072, 6.13620197])
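To turn those standard deviations into actual column numbers, one option is to test them against zero with a tolerance. Treat this as a sketch: the seeded generator and the names `spread`, `flat_cols` and `value_range` are assumptions for the example, and the default tolerance of np.isclose may need adjusting for your data.

```python
import numpy as np

# Reproducible stand-in for the sample data: column 2 is constant.
rng = np.random.default_rng(1)
t = rng.random((14, 5)) * 20
t[:, 2] = 5

# Standard deviation down each column; (near) zero means no variation.
spread = np.std(t, axis=0)
flat_cols = np.flatnonzero(np.isclose(spread, 0.0))
print(flat_cols)

# An exact alternative: a column is constant when max equals min.
value_range = t.max(axis=0) - t.min(axis=0)
print(np.flatnonzero(value_range == 0))
```

The max-minus-min check avoids any rounding inside the standard deviation, at the cost of only catching columns that are exactly constant.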
Or, calculate the forward differences between each row and the one below:
d = t[1:,...] - t[:-1,...]
and take their absolute values:
np.abs(d)
That looks like this:
array([[ 4.06245114, 6.95774365, 0. , 5.66067925, 8.25569578],
[14.35054412, 3.22588684, 0. , 1.78815198, 10.75562156],
[ 6.68295407, 12.71467666, 0. , 3.24712141, 14.85411909],
[ 4.11250954, 6.67131894, 0. , 1.16668005, 2.85633441],
[ 7.3203885 , 3.67219667, 0. , 0.59567351, 12.74550991],
[ 5.42620183, 2.75590516, 0. , 4.12066773, 11.37243234],
[ 1.03708868, 4.9585173 , 0. , 0.489247 , 11.6684918 ],
[12.00797096, 2.22045993, 0. , 5.32288163, 17.39715094],
[ 4.15085831, 4.32450827, 0. , 6.72881978, 2.8953012 ],
[ 5.7349744 , 7.13926187, 0. , 6.69697756, 7.85628339],
[15.4642426 , 8.52879411, 0. , 1.29111884, 2.03066791],
[ 1.9728733 , 1.02740219, 0. , 11.05253894, 10.02663339],
[ 4.75046981, 0.59598981, 0. , 7.57179985, 8.27809779]])
Now sum down the columns:
np.sum(np.abs(d), axis=0)
That looks like this:
array([ 87.07352726, 64.79266138, 0. , 55.73235753,
120.99233952])
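The difference steps can also be condensed with np.diff, which computes the same row-to-row differences. This is a sketch on seeded stand-in data rather than the exact array printed above, and the names `total` and `constant_cols` are made up.

```python
import numpy as np

# Reproducible stand-in for the sample data: column 2 is constant.
rng = np.random.default_rng(2)
t = rng.random((14, 5)) * 20
t[:, 2] = 5

# np.diff along axis 0 gives t[1:] - t[:-1], i.e. 13 rows of differences.
d = np.diff(t, axis=0)

# Sum of absolute differences per column; zero means the column never changes.
total = np.abs(d).sum(axis=0)
constant_cols = np.flatnonzero(total == 0)
print(constant_cols)
```

Summing absolute values (rather than the raw differences) matters here, because positive and negative changes could otherwise cancel out.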