I want a normal curve to fit the histogram I already have. navf2 is a list of normalized random numbers and the histogram is based on those, and I want a curve to show the general trend of the histogram.
while len(navf2)<252:
number=np.random.normal(0,1,None)
navf2.append(number)
bin_edges=np.arange(70,130,1)
plt.style.use(["dark_background",'ggplot'])
plt.hist(navf2, bins=bin_edges, alpha=1)
plt.ylabel("Frequency of final NAV")
plt.xlabel("Ranges")
ymin=0
ymax=100
plt.ylim([ymin,ymax])
plt.show()
Here You go:
=^..^=
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
# create raw data
data = np.random.uniform(size=252)
# distribution fitting
mu, sigma = norm.fit(data)
# fitting distribution
x = np.linspace(-0.5,1.5,100)
y = norm.pdf(x, loc=mu, scale=sigma)
# plot data
plt.plot(x, y,'r-')
plt.hist(data, density=1, alpha=1)
plt.show()
Output:
Here is a another solution using your code as mentioned in the question. We can achieve the expected result without the use of the scipy library. we will have to do three things, compute the mean of the data set, compute the standard deviation of the set, and create a function that generates the normal or Gaussian curve.
To compute the mean we can use the function within numpy library, ie mu = np.mean(your_data_set_here)
The standard deviation of the set is the square root of the sum of the differences of the values and mean squared https://en.wikipedia.org/wiki/Standard_deviation . We can express it in code as follows, using the numpy library again:
data_set = [] # some data set
sigma = np.sqrt(1/(len(data_set))*sum((data_set-mu)**2))
Finally we have to build the function for the normal curve or Gaussian https://en.wikipedia.org/wiki/Gaussian_function , it relies on both the mean ( mu
) and the standard deviation ( sigma
), so we will use those as parameters in our function:
def Gaussian(x,sigma,mu): # sigma is the standard deviation and mu is the mean
return ((1/(np.sqrt(2*np.pi)*sigma))*np.exp(-(x-mu)**2/(2*sigma**2)))
putting it all together looks like this:
import numpy as np
import matplotlib.pyplot as plt
navf2 = []
while len(navf2)<252:
number=np.random.normal(0,1,None) # since all values will be between 0,1 the bin size doesnt work
navf2.append(number)
navf2 = np.asarray(navf2) # convert to array for better results
mu = np.mean(navf2) #the avg of all values in navf2
sigma = np.sqrt(1/(len(navf2))*sum((navf2-mu)**2)) # standard deviation of navf2
x_vals = np.arange(min(navf2),max(navf2),0.001) # create a flat range based off data
# to build the curve
gauss = [] #store values for normal curve here
def Gaussian(x,sigma,mu): # defining the normal curve
return ((1/(np.sqrt(2*np.pi)*sigma))*np.exp(-(x-mu)**2/(2*sigma**2)))
for val in x_vals :
gauss.append(Gaussian(val,sigma,mu))
plt.style.use(["dark_background",'ggplot'])
plt.hist(navf2, density = 1, alpha=1) # add density = 1 to fix the scaling issues
plt.ylabel("Frequency of final NAV")
plt.xlabel("Ranges")
plt.plot(x_vals,gauss)
plt.show()
Here is a picture of an output:
Hope this helps, I tired to keep it as close to your original code as possible !
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.