I am using np.random.choice to construct a histogram of the sum of throwing two fair dice. However, when I run the code, the bar for 7, which should be the tallest, is missing.
import numpy as np
import matplotlib.pyplot as plt
values = [1, 2, 3, 4, 5, 6]
z = 1/6
x = np.random.choice(values, 1000000, p=[z, z, z, z, z, z])
y = np.random.choice(values, 1000000, p=[z, z, z, z, z, z])
plt.hist(x + y, 12, color="green", edgecolor='black', linewidth=1.2, label="Uniform Dist", rwidth=.75)
plt.show()
Any suggestions on what is going wrong?
The problem is that plt.hist's binning algorithm is designed for real (continuous) values, not for integer (discrete) values.
Let's look at the bins matplotlib proposes:
n, bins, _ = plt.hist(x + y, 12, color="green", edgecolor='black', linewidth=1.2, label="Uniform Dist", rwidth=.75)
The bins array is:
array([ 2. , 2.83333333, 3.66666667, 4.5 , 5.33333333,
6.16666667, 7. , 7.83333333, 8.66666667, 9.5 ,
10.33333333, 11.16666667, 12. ])
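The sixth bar covers the range [bins[5], bins[6]), which is [6.17, 7.00). Notice that this interval is half-open, so no integer belongs to it: 6 lands in the fifth bin and 7 lands in the seventh. The sixth bar is therefore empty, leaving a gap where the bar for 7 is expected.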
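The empty bin can be checked without plotting. The following is a minimal sketch, assuming that np.histogram with an integer bin count produces the same equal-width edges as plt.hist; the seed, the sample size, and the names rng, faces and sums are choices made for this illustration and are not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
faces = np.arange(1, 7)         # die faces 1..6, uniform by default
sums = rng.choice(faces, 100_000) + rng.choice(faces, 100_000)

# 12 equal-width bins between min(sums) = 2 and max(sums) = 12
counts, edges = np.histogram(sums, bins=12)

# Print each bin's range next to its count; the last bin is closed on the right
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left:5.2f}, {right:5.2f})  {count}")
```

The sixth line of output, for the bin starting at 6.17, should show a count of zero, while the bin starting at 7.00 holds all the sevens.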
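The solution is to set the bins manually, with one bin centered on each integer:
sums = x + y
# np.arange excludes its stop value, so go to max + 1.5 to include the edge at max + 0.5
bins = np.arange(np.min(sums) - 0.5, np.max(sums) + 1.5, 1)
plt.hist(sums, bins, color="green", edgecolor='black', linewidth=1.2, label="Uniform Dist", rwidth=.75)
Here bins is equal to array([1.5, 2.5, 3.5, 4.5, 5.5, ..., 10.5, 11.5, 12.5]). With a stop value of np.max(sums) + 0.5 the last edge would be 11.5, and the sum 12 would be left out of the histogram.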
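Any choice of edges can be sanity-checked by counting with np.histogram before plotting. This is a sketch under the same assumptions as the dice setup in the question; the names rng, faces, sums, edges and totals, the seed, and the sample size are introduced here for illustration. Because np.arange excludes its stop value, the stop is set to max + 1.5 so that the final edge at max + 0.5 is kept.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
faces = np.arange(1, 7)
sums = rng.choice(faces, 100_000) + rng.choice(faces, 100_000)

# Half-integer edges 1.5, 2.5, ..., 12.5: one unit-width bin centered on each total
edges = np.arange(sums.min() - 0.5, sums.max() + 1.5, 1)
counts, _ = np.histogram(sums, bins=edges)

# Pair each possible total (2..12) with its count
totals = np.arange(sums.min(), sums.max() + 1)
for total, count in zip(totals, counts):
    print(total, count)
```

Every sample should be counted exactly once, and 7 should come out as the most frequent total.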