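NumPy: does vstack automatically detect that an index is out of range and correct it?
I'm confused about why the code below works (see the lines I marked "HERE"). When j reaches the last fold, j + 1 is out of range for the list of lists (i.e. X_train_folds). Why does this still work? Is it because vstack automatically detects this and handles it? I couldn't find any documentation on that.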
num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]
X_train_folds = []
y_train_folds = []
################################################################################
# Split up the training data into folds. After splitting, X_train_folds and #
# y_train_folds should each be lists of length num_folds, where #
# y_train_folds[i] is the label vector for the points in X_train_folds[i]. #
# Hint: Look up the numpy array_split function. #
################################################################################
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)
# print y_train_folds
# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}
################################################################################
# Perform k-fold cross validation to find the best value of k. For each #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times, #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all #
# values of k in the k_to_accuracies dictionary. #
################################################################################
for k in k_choices:
    k_to_accuracies[k] = []
for k in k_choices:
    print('evaluating k=%d' % k)
    for j in range(num_folds):
        X_train_cv = np.vstack(X_train_folds[0:j]+X_train_folds[j+1:])#<--------------HERE
        X_test_cv = X_train_folds[j]
        #print len(y_train_folds), y_train_folds[0].shape
        y_train_cv = np.hstack(y_train_folds[0:j]+y_train_folds[j+1:]) #<----------------HERE
        y_test_cv = y_train_folds[j]
        #print 'Training data shape: ', X_train_cv.shape
        #print 'Training labels shape: ', y_train_cv.shape
        #print 'Test data shape: ', X_test_cv.shape
        #print 'Test labels shape: ', y_test_cv.shape
        classifier.train(X_train_cv, y_train_cv)
        dists_cv = classifier.compute_distances_no_loops(X_test_cv)
        #print 'predicting now'
        y_test_pred = classifier.predict_labels(dists_cv, k)
        num_correct = np.sum(y_test_pred == y_test_cv)
        accuracy = float(num_correct) / num_test
        k_to_accuracies[k].append(accuracy)
################################################################################
# END OF YOUR CODE #
################################################################################
# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
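No. vstack is not what causes this; NumPy's powerful indexing is. NumPy's internals are complex: sometimes indexing returns a copy and sometimes a view. In both cases, though, you are invoking a method, and that method returns an empty array when the index selects nothing (for example, a slice that starts beyond the end of the array).
See the following example and its output (from the print calls):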
import numpy as np
a = np.array([1, 2, 3])
print(a[10:]) # This will return empty
print(a[10]) # This is an error
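which results in:
[]
Traceback (most recent call last):
  File "C:/Users/imactuallyavegetable/temp.py", line 333, in <module>
    print(a[10])
IndexError: index 10 is out of bounds for axis 0 with size 3
The first is an empty array; the second is an exception.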
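One detail worth adding for the code in the question: np.array_split returns a plain Python list of arrays, so X_train_folds[j+1:] is ordinary list slicing, and Python lists clamp out-of-range slice bounds the same way NumPy arrays do. The following is a minimal sketch of the last-fold case; the small X_train array is a hypothetical stand-in for the real training data, not the data from the question.

```python
import numpy as np

# Hypothetical stand-in for X_train: 10 samples with 2 features each.
X_train = np.arange(20).reshape(10, 2)
num_folds = 5

# np.array_split returns a plain Python list of arrays.
X_train_folds = np.array_split(X_train, num_folds)

j = num_folds - 1  # last fold, so j + 1 == len(X_train_folds)

# Slicing past the end is not an error: the bound is clamped
# and the result is simply an empty list.
tail = X_train_folds[j + 1:]

# Concatenating with an empty list leaves the first four folds unchanged,
# so vstack only ever sees a valid, non-empty list of arrays.
X_train_cv = np.vstack(X_train_folds[0:j] + tail)

print(type(X_train_folds).__name__)  # list
print(tail)                          # []
print(X_train_cv.shape)              # (8, 2)
```

By the time vstack is called, the slice has already been evaluated, so vstack never sees an out-of-range index. Indexing with a single integer, such as X_train_folds[j + 1], would still raise IndexError.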