
Matplotlib is graphing a 'zig zag' line when trying to graph polynomial

My line in matplotlib has the correct overall shape; however, it is made up of zig-zagging segments.

I've tried restarting and graphing the same equation on Desmos. The equation on Desmos looks exactly how I want it to, so I think this is a matplotlib issue.

#imports
import numpy as np
import pandas as pd
import seaborn as sns; sns.set() # just makes plots look prettier; run 'pip install seaborn' if needed
import matplotlib.pyplot as plt

from IPython.core.pylabtools import figsize
figsize(15, 7)

from sklearn.model_selection import train_test_split

noise = np.random.randn(100)

x = np.linspace(-2,2, 100)
y = x + noise + np.random.randn()*2 + x**2

plt.scatter(x, y); plt.show()

#pre processing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

#initializing m and b variables
current_z_val = 0.1
current_m_val = 0.1
current_b_val = 0.1

#setting # of iterations
iterations = 5

#calculating length of examples for functions used below
n = len(x_train)

#learning rate
learning_rate = 0.01

#plot the data and estimates
plt.scatter(x_train,y_train)
plt.title("Example data and hypothesis lines")
plt.xlabel('X Axis')
plt.ylabel('Y Axis')

cost_history = []

#main gradient descent loop
for i in range(iterations):

  #creating the hypothesis using the y = z*x^2 + m*x + b form
  y_hypothesis = (current_z_val * (x_train**2)) + (current_m_val * x_train) + current_b_val

  #calculating the partial derivatives of the MSE cost with respect to z, m, and b
  z_deriv = -(2/n)*sum((x_train**2)*(y_train-y_hypothesis))
  m_deriv = -(2/n)*sum(x_train*(y_train-y_hypothesis))
  b_deriv = -(2/n)*sum(y_train-y_hypothesis)

  #updating z, m, and b values
  current_z_val = current_z_val - (learning_rate * z_deriv)
  current_m_val = current_m_val - (learning_rate * m_deriv)
  current_b_val = current_b_val - (learning_rate * b_deriv)

  #calculate the cost (error) of the model
  cost = (1/n)*sum((y_train-y_hypothesis)**2)
  cost_history.append(cost)

  #print the m and b values
  #print("iteration {}, cost {}, m {}, b {}".format(i,cost,current_m_val,current_b_val))
  plt.plot(x_train,y_hypothesis)

plt.show()

#plot the final graph
plt.plot(range(1,len(cost_history)+1),cost_history)
plt.title("Cost at each iteration")
plt.xlabel('Iterations')
plt.ylabel('MSE')

plt.show()

This is what the graph from my code looks like, and this is what it should look like (screenshots not reproduced here).

matplotlib plots the points in the order they appear in the array, not sorted by their magnitude.

I think you should sort x_train before computing y_hypothesis in order to get the curve you expect.

Note that this happens in both plt.scatter() and plt.plot(), but you only see it in the latter, because plt.plot() connects the dots and so makes the sequence visible.
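A minimal sketch of the sorting fix (the small x_train array and the 0.1 coefficients here are hypothetical stand-ins for the arrays in the question):

```python
import numpy as np

# Hypothetical stand-ins for x_train and the model output y_hypothesis
x_train = np.array([1.5, -0.3, 0.8, -1.2, 0.1])
y_hypothesis = 0.1 * x_train**2 + 0.1 * x_train + 0.1

# Sort by x so plt.plot connects the points left-to-right
order = np.argsort(x_train)
x_sorted = x_train[order]
y_sorted = y_hypothesis[order]
# plt.plot(x_sorted, y_sorted) now draws one smooth curve
```

Because y_hypothesis is reordered with the same index array, each y value stays paired with its original x value.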

The function train_test_split shuffles the data by default when selecting x_train and x_test, so your x values end up out of order. matplotlib cannot draw a clean line if x is not sorted.

Use shuffle=False in the following line. That should make the plot come out right.

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, shuffle=False)
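With shuffle=False, train_test_split simply slices the arrays: the first 70% becomes the training set and the last 30% the test set. A quick sketch of that behavior using plain NumPy slicing (noiseless toy data, same shape as the question's):

```python
import numpy as np

x = np.linspace(-2, 2, 100)
y = x + x**2

# Equivalent of train_test_split(x, y, test_size=0.3, shuffle=False):
# take the first 70% as training data, the last 30% as test data
n_train = int(len(x) * 0.7)
x_train, x_test = x[:n_train], x[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# x_train stays monotonically increasing, so plt.plot(x_train, ...)
# connects the points left-to-right with no zig-zag
print((np.diff(x_train) > 0).all())
```

Note the trade-off: with shuffle=False the test set is always the right-hand tail of the data, which may matter for evaluating the model even though it fixes the plot.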
