bin counts in stacked histogram (weighted) with x-coordinate greater than certain value

Question

Let's say I have two datasets, and I plot a stacked histogram of both with some weights. Can I find the total bin count for data elements greater than a certain number (i.e. for x-coordinates greater than a certain value)? To illustrate my question, I have done the following:

import matplotlib.pyplot as plt
import numpy as np

data1 = np.random.normal(0,0.6,1000)
data2 = np.random.normal(0,1.4,1000)

weight1 = np.array([0.5]*len(data1))
weight2 = np.array([0.9]*len(data2))

hist = plt.hist((data1,data2),weights=(weight1,weight2),stacked=True,range=(-5,5))

plt.show()

[Image: the resulting stacked histogram]

Now, how would I get the bin counts for, say, x greater than -2?

So far, to get that answer, I have been doing the following:

n1,_,_ = plt.hist((data1,data2),weights=(weight1,weight2),stacked=False,range=(-2,10000))
bin_counts=sum(sum(n1))
print(bin_counts)

Here, I choose the max value in range to be some crazily large number, so that I get all the bin counts for x=-2 and greater.

Is there any more efficient way than this?

Also, how would I obtain bin_counts as a function of x, where x varies from the minimum to the maximum x-coordinate in some steps?

Any help will be greatly appreciated!

Thanks much!

Answer 1

You could do the following:

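# in your case n is going to be a list of arrays (one per dataset), because you have 2 histograms
# note: with stacked=True the rows of n are cumulative, so use stacked=False before summing them
n, bins, _ = plt.hist(...)
# x is the threshold, e.g. x = -2
# get a list of lists of counts for the bins whose left edge is at or above x
n_over_x = [[val for val, left_edge in zip(selected_cnt, bins) if left_edge >= x] for selected_cnt in n]
# sum up the list of lists
result = sum(sum(part_list) for part_list in n_over_x)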
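
If only the total is needed, the plot can be skipped entirely. A minimal sketch, assuming the same data1, data2 and constant weights as in the question: the weighted count for x greater than a threshold is just the sum of the weights of the samples above it. The helper name weighted_count_above and the fixed random seed are illustrative assumptions, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible
data1 = rng.normal(0, 0.6, 1000)
data2 = rng.normal(0, 1.4, 1000)
weight1 = np.full(len(data1), 0.5)
weight2 = np.full(len(data2), 0.9)


def weighted_count_above(x, datasets, weights):
    """Sum the weights of all samples strictly greater than x (hypothetical helper)."""
    return sum(w[d > x].sum() for d, w in zip(datasets, weights))


total = weighted_count_above(-2, (data1, data2), (weight1, weight2))
print(total)
```

This avoids both the arbitrary upper limit in range and any dependence on where the bin edges fall.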

Answer 2

Here's what I came up with:

def my_range(start, end, step):
    # like range(), but the end point is included
    while start <= end:
        yield start
        start += step

bin_min = -5
bin_max = 10
bin_step = 1

value = []     # lower x thresholds
b_counts = []  # weighted number of events with x at or above each threshold

for x_min in my_range(bin_min, bin_max, bin_step):
    n1, _, _ = plt.hist((data1, data2), weights=(weight1, weight2), stacked=False, range=(x_min, 10000))
    b_counts.append(sum(sum(n1)))
    value.append(x_min)
    print(b_counts[-1], value[-1])

I believe this gives me the weighted number of events (in the histogram) in the range (value, 10000), where value is the varying lower bound.
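
The same sweep can also be done with a single histogram per dataset instead of one plt.hist call per threshold. A rough sketch, assuming the thresholds coincide with the bin edges (-5 to 10 in steps of 1, mirroring bin_min, bin_max and bin_step) and that no sample lies above the last edge; the names edges, per_bin and tail_counts and the fixed seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible
data1 = rng.normal(0, 0.6, 1000)
data2 = rng.normal(0, 1.4, 1000)
weight1 = np.full(len(data1), 0.5)
weight2 = np.full(len(data2), 0.9)

# Bin edges double as the thresholds.
edges = np.arange(-5, 10 + 1, 1)

# Weighted counts per bin, summed over both datasets.
per_bin = sum(
    np.histogram(d, bins=edges, weights=w)[0]
    for d, w in ((data1, weight1), (data2, weight2))
)

# Reverse cumulative sum: tail_counts[k] is the weighted count
# of samples with edges[k] <= x <= edges[-1].
tail_counts = per_bin[::-1].cumsum()[::-1]

for threshold, count in zip(edges[:-1], tail_counts):
    print(threshold, count)
```

Samples beyond the last edge are dropped by np.histogram, so the upper edge should be chosen at or above the data maximum.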

2 answers

solution1
0 2018-07-12 07:20:11

solution2
0 2018-07-12 08:56:17