NameError about a variable not being defined when importing a module (even though it is defined)

Question

I am trying to import a module/script (.py file) into a Jupyter notebook, mainly for readability and conciseness. But when I try to use the class defined in the script, I get the following error message:

NameError                                 Traceback (most recent call last)
<ipython-input-48-4d8cbba46ed0> in <module>()
      8 
      9 test_KMeans = KMeans(k=3, maxiter=1000, tol=1e-9)
---> 10 cluster_center = test_KMeans.fit(X)
     11 clusters = test_KMeans.predict(X)
     12 

~/KMeans.py in fit(self, X)
     42         #Choose k random rows of X as the initial cluster centers.
     43         initial_cluster_centers = []
---> 44 
     45         sample = np.random.randint(0,m,size=k)
     46 

NameError: name 'maxiter' is not defined

Here is my script:

import numpy as np
from sklearn.decomposition import PCA

k = 3
maxiter = 1000
tol = 1e-9

class KMeans:
    """A K-Means object class. Implements basic k-means clustering.

    Attributes:
        k (int): The number of clusters
        maxiter (int): The maximum number of iterations
        tol (float): A convergence tolerance
    """
    def __init__(self, k, maxiter, tol):
        """Set the parameters.

        Parameters:
            k (int): The number of clusters
            maxiter (int): The maximum number of iterations
            tol (float): A convergence tolerance
        """
        k = 3
        maxiter = 1000
        tol = 1e-9

        self.k = k   # Initialize some attributes.
        self.maxiter = maxiter
        self.tol = tol

    def fit(self, X):
        """Accepts an mxn matrix X of m data points with n features.
        """
        m,n = X.shape
        k = 3
        maxiter = 1000
        tol = 1e-9
        self.m = m
        self.n = n

        #Choose k random rows of X as the initial cluster centers.
        initial_cluster_centers = []

        sample = np.random.randint(0,m,size=k)

        initial_cluster_centers = X[sample, :]

        # Run the k-means iteration until consecutive centers are within the convergence tolerance, or until 
        # iterating the maximum number of times.
        iterations = 0
        old_cluster = np.zeros(initial_cluster_centers.shape)
        new_cluster = initial_cluster_centers

        while iterations < maxiter or np.linalg.norm(old_cluster - new_cluster) >= tol:
            #assign each data point to the cluster center that is closest, forming k clusters
            clusters = np.zeros(m)
            for i in range(0,m):
                distances = np.linalg.norm(X[i] - initial_cluster_centers, ord=2, axis=1) # axis=1 was crucial
                cluster = np.argmin(distances)                                            #in getting this to work
                clusters[i] = cluster
            # Store the old/initial centroid values
            old_cluster = np.copy(new_cluster)
            #Recompute the cluster centers as the means of the new clusters
            for i in range(k):
                points = [X[j] for j in range(m) if clusters[j] == i]
                new_cluster[i] = np.mean(points, axis=0)
                #If a cluster is empty, reassign the cluster center as a random row of X.
                if new_cluster[i] == []:
                    new_cluster[i] = X[np.random.randint(0,m,size=1)]
            iterations += 1

        #Save the cluster centers as attributes.
        self.new_cluster = new_cluster

        #print("New cluster centers:\n", new_cluster)

        return new_cluster

    def predict(self, X):
        """Accept an l × n matrix X of data.
        """
        # Return an array of l integers where the ith entry indicates which 
        # cluster center the ith row of X is closest to.
        clusters = np.zeros(self.m)
        for i in range(0,self.m):
            distances = np.linalg.norm(X[i] - self.new_cluster, ord=2, axis=1)
            cluster = np.argmin(distances)
            clusters[i] = cluster

        print("\nClusters:", clusters)

        return clusters

Then I attempt to do the following:

from KMeans import KMeans

X = features_scaled

# k = 3
# maxiter = 1000
# tol = 1e-9

test_KMeans = KMeans(k=3, maxiter=1000, tol=1e-9)
cluster_center = test_KMeans.fit(X)
clusters = test_KMeans.predict(X)

pca = PCA(n_components=2)

pr_components = pca.fit_transform(X) # these are the first 2 principal components

#plot the first two principal components as a scatter plot, where the color of each point is det by the clusters
plt.scatter(pr_components[:,0], pr_components[:,1],
           c=clusters, edgecolor='none', alpha=0.5, #color by clusters
            cmap=plt.cm.get_cmap('tab10', 3)) 
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.colorbar()
plt.title("K-Means Clustering:")
plt.show()

Upon running the above section of code, I get the NameError I described. I don't understand why it is telling me that maxiter is not defined. You'll see I defined the variables k, maxiter, and tol multiple times in the script trying to get it to work, but nothing has worked. I had self.maxiter and self.tol at one point, but that didn't fix it either.

I know this code works because I have used it multiple times now. Originally I just defined the variables k, maxiter, and tol, then instantiated the class and called the fit and predict methods, and since they were stored as attributes with self, everything worked fine. But now that I try to import it as a module, I have no idea why it is not working.

Thanks for your help!

EDIT: Here is what my code looks like in a single cell in a Jupyter notebook. It does run and work in this case:

from sklearn.decomposition import PCA

class KMeans:
    """A K-Means object class. Implements basic k-means clustering.

    Attributes:
        k (int): The number of clusters
        maxiter (int): The maximum number of iterations
        tol (float): A convergence tolerance
    """
    def __init__(self, k, maxiter, tol):
        """Set the parameters.

        Parameters:
            k (int): The number of clusters
            maxiter (int): The maximum number of iterations
            tol (float): A convergence tolerance
        """
        self.k = k   # Initialize some attributes.
        self.maxiter = maxiter
        self.tol = tol

    def fit(self, X):
        """Accepts an mxn matrix X of m data points with n features.
        """
        m,n = X.shape
        self.m = m
        self.n = n

        #Choose k random rows of X as the initial cluster centers.
        initial_cluster_centers = []

        sample = np.random.randint(0,m,size=self.k)

        initial_cluster_centers = X[sample, :]

        # Run the k-means iteration until consecutive centers are within the convergence tolerance, or until 
        # iterating the maximum number of times.
        iterations = 0
        old_cluster = np.zeros(initial_cluster_centers.shape)
        new_cluster = initial_cluster_centers

        while iterations < maxiter or np.linalg.norm(old_cluster - new_cluster) >= tol:
            #assign each data point to the cluster center that is closest, forming k clusters
            clusters = np.zeros(m)
            for i in range(0,m):
                distances = np.linalg.norm(X[i] - initial_cluster_centers, ord=2, axis=1) # axis=1 was crucial
                cluster = np.argmin(distances)                                            #in getting this to work
                clusters[i] = cluster
            # Store the old/initial centroid values
            old_cluster = np.copy(new_cluster)
            #Recompute the cluster centers as the means of the new clusters
            for i in range(k):
                points = [X[j] for j in range(m) if clusters[j] == i]
                new_cluster[i] = np.mean(points, axis=0)
                #If a cluster is empty, reassign the cluster center as a random row of X.
                if new_cluster[i] == []:
                    new_cluster[i] = X[np.random.randint(0,m,size=1)]
            iterations += 1

        #Save the cluster centers as attributes.
        self.new_cluster = new_cluster

        #print("New cluster centers:\n", new_cluster)

        return new_cluster

    def predict(self, X):
        """Accept an l × n matrix X of data.
        """
        # Return an array of l integers where the ith entry indicates which 
        # cluster center the ith row of X is closest to.
        clusters = np.zeros(self.m)
        for i in range(0,self.m):
            distances = np.linalg.norm(X[i] - self.new_cluster, ord=2, axis=1)
            cluster = np.argmin(distances)
            clusters[i] = cluster

        print("\nClusters:", clusters)

        return clusters

X = features_scaled

k = 3
maxiter = 1000
tol = 1e-9

test_KMeans = KMeans(k,maxiter,tol)
test_KMeans.fit(X)
clusters = test_KMeans.predict(X)

pca = PCA(n_components=2)

pr_components = pca.fit_transform(X) # these are the first 2 principal components

#plot the first two principal components as a scatter plot, where the color of each point is det by the clusters
plt.scatter(pr_components[:,0], pr_components[:,1],
           c=clusters, edgecolor='none', alpha=0.5, #color by clusters
            cmap=plt.cm.get_cmap('tab10', 3)) 
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.colorbar()
plt.title("K-Means Clustering:")
plt.show()

Answer 1

The traceback seems to show that Jupyter is out of sync with the current state of the code in KMeans.py (it points to line 44, which is empty). Therefore, if the computation doesn't take too long, you might try fixing the problem by quitting and restarting Jupyter.

Python executes the module's code when the module is imported. If you make changes to the module's code after the module is imported, those changes are not reflected in the state of the Python interpreter. This may explain why the Jupyter notebook's error seemed out of sync with the state of KMeans.py.

Instead of quitting and restarting Python, you can also reload modules. For example, in Python 3.4 or newer, you could use
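A related point is that an imported module's code runs in the module's own namespace, so a bare name such as maxiter inside a method is looked up in the module's globals, not in the notebook's. The sketch below illustrates this with a hypothetical in-memory module (`demo_module` and `Model` are made-up names for illustration, not part of the original script). It may help explain why the single-cell version can rely on notebook-level variables while an imported version cannot:

```python
import types

# Hypothetical stand-in for a file such as KMeans.py. Functions defined here
# resolve bare names in this module's globals, not in the caller's namespace.
module_source = '''
class Model:
    def __init__(self, maxiter):
        self.maxiter = maxiter

    def uses_global(self):
        return maxiter          # bare name: looked up in the module's globals

    def uses_attribute(self):
        return self.maxiter     # attribute: stored on the instance itself
'''

demo_module = types.ModuleType("demo_module")
exec(module_source, demo_module.__dict__)

maxiter = 1000  # defined only in the caller's namespace (like a notebook cell)

model = demo_module.Model(maxiter=maxiter)
attribute_value = model.uses_attribute()

try:
    model.uses_global()
    global_lookup_error = None
except NameError as exc:
    global_lookup_error = str(exc)

print(attribute_value)
print(global_lookup_error)
```

Reading the values through self (self.k, self.maxiter, self.tol) inside fit avoids depending on any global at all, whichever namespace the class ends up in.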

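import sys
import importlib
from KMeans import KMeans

# ... make changes to KMeans.py ...

# Re-execute the module's code so the interpreter sees the edited file
importlib.reload(sys.modules['KMeans'])
# Rebind the name imported with "from", which still points to the old class
from KMeans import KMeans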
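To see the stale-module effect and the reload in one place, here is a small self-contained sketch. It assumes a throwaway module named `reload_demo` written to a temporary directory (a hypothetical name, standing in for KMeans.py), and it disables bytecode caching only to keep the quick demo deterministic:

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid a stale bytecode cache in this quick demo

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = pathlib.Path(tmp_dir) / "reload_demo.py"  # hypothetical module
    module_path.write_text("def answer():\n    return 'old'\n")
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()

    import reload_demo
    from reload_demo import answer

    # Edit the file on disk, as one would edit KMeans.py in an editor.
    module_path.write_text("def answer():\n    return 'new version'\n")

    stale_result = answer()               # interpreter still runs the old code
    importlib.reload(reload_demo)         # re-execute the edited source
    module_result = reload_demo.answer()  # the module attribute is updated
    still_stale = answer()                # the "from" import is not rebound

    from reload_demo import answer        # re-import to rebind the name
    rebound_result = answer()

    # Clean up so the temporary module does not linger in the session.
    sys.path.remove(tmp_dir)
    del sys.modules["reload_demo"]

print(stale_result, module_result, still_stale, rebound_result)
```

The same caveat applies to objects: an instance created before the reload keeps using the old class, so it needs to be re-created afterwards.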

However, with IPython there is an easier way. You could enable autoreloading:

From the command line run:

ipython profile create

Then edit ~/.ipython/profile_default/ipython_config.py by adding

c.InteractiveShellApp.extensions = ['autoreload']     
c.InteractiveShellApp.exec_lines = ['%autoreload 2']

Quit and restart IPython to make this change effective. Now, IPython will automatically reload any module when a change is made to the underlying code which defines that module. In most situations autoreload works well, but there are situations where it may fail to reload the module. See the docs for more on autoreload and its caveats.
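If editing the profile is not convenient, the extension can also be switched on for a single session. In a notebook cell this is usually written as the magics `%load_ext autoreload` followed by `%autoreload 2`. The sketch below does the same through `get_ipython().run_line_magic`, with a guard (an addition for illustration) so that it does nothing when run outside IPython:

```python
# Look up the running IPython shell; plain Python has no get_ipython().
try:
    ip = get_ipython()
except NameError:
    ip = None

if ip is not None:
    # Equivalent to "%load_ext autoreload" and "%autoreload 2" in a cell.
    ip.run_line_magic("load_ext", "autoreload")
    ip.run_line_magic("autoreload", "2")
```

This only affects the current kernel, so it has to be run again after a restart.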

Answer 1 (accepted 2018-12-21 18:04:02)
