matplotlib: How to conditionally plot a histogram from a 2D array

I have a 2D array, and I am trying to plot a histogram of all the rows in one column, given a condition on another column. I want to select the subdata inside the plt.hist() call itself, to avoid creating numerous subarrays (which I already know how to do). For example, if

a_long_named_array = [[1, 5],
                      [2, 6],
                      [3, 7]]

I could create a subset of my array, keeping only the rows where column 1 (the second column, counting from zero) is greater than 5, by writing

a_long_named_subarray = a_long_named_array[a_long_named_array[:,1] > 5]

How do I plot this subdata without making the aforementioned subarray? Please see below.

import numpy as np
import matplotlib.pyplot as plt

#Generate 2D array
arr = np.array([np.random.randint(0, 11, 10), np.arange(0,10)])

#Transpose it
arr = arr.T

#----------------------------------------------------------------------------
#Plotting a Histogram: This works
#----------------------------------------------------------------------------

#Plot all the rows of the 0'th column
plt.hist(arr[:,0])
plt.show()

#----------------------------------------------------------------------------
#Plotting a conditional Histogram: This is what I am trying to do. This Doesn't work.
#----------------------------------------------------------------------------

#Plot all the rows of the 0th column where the 1st column is some condition (here > 5)
plt.hist(arr[:,0, where 1 > 5])
plt.show()

quit()

You just need to apply the boolean index (whatever > 5 returns a boolean array) to the first dimension.

You're currently trying to index the array along a third dimension with the boolean mask. The array is only 2D, so you're probably getting an IndexError (most likely "IndexError: too many indices").

For example:

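import numpy as np

# Your example data (randint's upper bound is exclusive, so 11 gives 0..10)
arr = np.array([np.random.randint(0, 11, 10), np.arange(0, 10)])
arr = arr.T

# What you want: column 0, restricted to the rows where column 1 > 5
print(arr[arr[:,1] > 5, 0])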

Basically, in place of the : in the first index position, you just put in the boolean mask (something > 5). You might find it clearer to write:

mask = arr[:,1] > 5
result = arr[mask, 0]
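
To make the mask concrete, here is a small sketch that applies the same two lines to the three-row array from the question. The array values come from the question; treating it as a NumPy array is an assumption, since the question only shows it informally.

```python
import numpy as np

# The small array from the question, written as a NumPy array (assumed).
a_long_named_array = np.array([[1, 5],
                               [2, 6],
                               [3, 7]])

# One boolean per row: True where column 1 is greater than 5.
mask = a_long_named_array[:, 1] > 5      # False, True, True

# Use the mask to pick rows and the integer 0 to pick the column.
result = a_long_named_array[mask, 0]     # values 2 and 3

print(mask)
print(result)
```

The mask has one entry per row, which is why it belongs in the first index position rather than after the column index.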

Another way of thinking of this is:

second_column = arr[:,1]
first_column = arr[:,0]
print(first_column[second_column > 5])
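
Putting this back into the original plotting question, the indexing expression can be passed straight to plt.hist(). The following is a minimal sketch, not the only way to do it; the seeded generator and the axis labels are assumptions added for repeatability and readability.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same layout as the question: column 0 holds random integers in 0..10,
# column 1 holds 0..9. The fixed seed is an assumption for repeatability.
rng = np.random.default_rng(0)
arr = np.array([rng.integers(0, 11, 10), np.arange(0, 10)]).T

# Histogram of column 0, using only the rows where column 1 > 5.
# The selection happens inline, so no named subarray is created.
counts, bin_edges, _ = plt.hist(arr[arr[:, 1] > 5, 0])

plt.xlabel("value in column 0")
plt.ylabel("count")
# plt.show()  # uncomment to display the figure interactively
```

Column 1 runs from 0 to 9, so exactly four rows (6, 7, 8, 9) satisfy the condition and the histogram counts sum to 4.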
