

Using arrays in for loops python

I am trying to compare every element of just_test_data against every element of just_train_data and return the lowest number, then do the same for the next element of just_test_data, and so on until every element of just_test_data has been processed.

The error I keep getting is on the line

step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)

IndexError: arrays used as indices must be of integer (or boolean) type

It happens the first time I try to run the loop.

import numpy as np
testing_data = np.genfromtxt("C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-testing-data.csv", delimiter= ',')
training_data = np.genfromtxt("C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-training-data.csv", delimiter= ',')

#create 4 arrays, the first two with the measurements of training and testing data
#the last two have the labels of each line
just_test_data = np.array(testing_data[:, 0:4])
just_train_data = np.array(training_data[:, 0:4])
testing_labels = np.array(testing_data[:, 4])
training_labels = np.array(training_data[:, 4])

n = 0
while n < len(just_train_data):
    for i in just_test_data:
        old_distance = 'inf'
        step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
        step_2 = sum(step_1)
        new_distance = np.sqrt(step_2)
        if new_distance < old_distance:
            old_distance = new_distance
            index = n
        n = n + 1
print(training_labels[index])

When you write for i in just_test_data: the variable i will be the element itself, not the index.

You probably want something like for i in range(len(just_test_data)); this makes i an integer from 0 to len(just_test_data) - 1.
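A small sketch of the difference, assuming a made-up 3x2 float array (the name data and its values are illustrative only, not taken from the question's CSV files):

```python
import numpy as np

# Hypothetical stand-in for just_test_data: 3 rows, 2 float columns.
data = np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]])

# Iterating over the array directly yields the rows themselves.
rows = [row for row in data]

# Using a row (a float array) as an index raises the IndexError from the question.
try:
    data[rows[0]]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# Iterating over range(len(data)) yields integer positions instead.
indices = list(range(len(data)))
first_row = data[indices[0]]
```

If both the position and the row are needed, for i, row in enumerate(data) gives them together.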

Edit: a few weird things in your code:

step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
step_2 = sum(step_1)
new_distance = np.sqrt(step_2)

If the operands are single numbers, this just returns abs(just_test_data[i] - just_train_data[n]). Are you meaning to add up a ton of step_1 values and then eventually take the sqrt? You need to check your indents.
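Whether those three lines reduce to a plain absolute value depends on the shape of the operands. A rough sketch with made-up numbers (x, y, a and b are illustrative; np.sum is used instead of the built-in sum so the scalar case also runs):

```python
import numpy as np

# Scalar case: square, sum, and square root cancel out to abs(x - y).
x, y = 2.0, 5.0
scalar_result = np.sqrt(np.sum(abs(x - y) ** 2))

# Row case: each operand holds several measurements, so the same steps
# give the Euclidean distance between the two rows.
a = np.array([3.0, 0.0])
b = np.array([0.0, 4.0])
row_result = np.sqrt(np.sum(abs(a - b) ** 2))

# np.linalg.norm computes the same Euclidean distance in one call.
norm_result = np.linalg.norm(a - b)
```

In the question each row is sliced with [:, 0:4], so the row case is the one that would apply once i is an integer index.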

old_distance = 'inf' is a string (pretty sure). You are probably looking for either np.inf or float('inf'). Also, because you set this inside the for loop, it gets reset for every i. You probably want it above for i in just_test_data:
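A minimal sketch of why the starting value should be numeric and set once, assuming Python 3 and a made-up list of distances:

```python
import numpy as np

# Made-up distances, for illustration only.
distances = [2.5, 0.7, 1.9]

# In Python 3, comparing a float with the string 'inf' raises TypeError.
try:
    1.0 < 'inf'
    string_comparison_fails = False
except TypeError:
    string_comparison_fails = True

# np.inf is a float, so every finite distance compares as smaller.
min_distance = np.inf   # initialised once, before the loop
best = None
for n, d in enumerate(distances):
    if d < min_distance:
        min_distance = d
        best = n
```

After the loop, best holds the position of the smallest value and min_distance holds that value.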

A quick pass at your code:

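min_distance = np.inf
for n in range(len(just_train_data)):
    # accumulate squared differences between training row n and every test row
    step_2 = 0
    for i in range(len(just_test_data)):
        # np.sum collapses the per-column squared differences into one number,
        # so the comparison below works on a scalar rather than an array
        step_1 = np.sum((just_test_data[i] - just_train_data[n]) ** 2)
        step_2 += step_1
    distance = np.sqrt(step_2)
    if distance < min_distance:
        min_distance = distance
        index = n
print(training_labels[index])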

This compares a point in just_train_data to all the points in just_test_data to compute a distance. It prints the label of the training point with the smallest such distance.

By using for i in just_test_data you're iterating through the elements of the just_test_data array, not through indices between 0 and the array length.

Also, it seems that your n = n + 1 line is not indented correctly.

Here's my guess for an updated version of your code:

import numpy as np
# raw strings (r"...") keep the backslashes in the Windows paths from being read as escape sequences
testing_data = np.genfromtxt(r"C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-testing-data.csv", delimiter=',')
training_data = np.genfromtxt(r"C:\Users\zkrumlinde\Desktop\Statistical Programming\Week 3\iris-training-data.csv", delimiter=',')

#create 4 arrays, the first two with the measurements of training and testing data
#the last two have the labels of each line
just_test_data = np.array(testing_data[:, 0:4])
just_train_data = np.array(training_data[:, 0:4])
testing_labels = np.array(testing_data[:, 4])
training_labels = np.array(training_data[:, 4])

n = 0
# numeric starting value, set once so the smallest distance is kept across iterations
old_distance = np.inf
while n < len(just_train_data):
    for i in range(len(just_test_data)):
        step_1 = (abs(just_test_data[i] - just_train_data[n]) ** 2)
        step_2 = sum(step_1)
        new_distance = np.sqrt(step_2)
        if new_distance < old_distance:
            old_distance = new_distance
            index = n
    n = n + 1
print(training_labels[index])
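If the goal is the one described in the question (for each test row, find the closest training row and report its label), the loops could also be arranged with the test rows on the outside. This is only a sketch under that reading: the function name nearest_labels and the small arrays below are hypothetical, standing in for the iris CSV data.

```python
import numpy as np

def nearest_labels(test_rows, train_rows, train_labels):
    """Return, for each test row, the label of the closest training row."""
    predicted = []
    for i in range(len(test_rows)):
        min_distance = np.inf   # reset once per test row
        index = 0
        for n in range(len(train_rows)):
            # Euclidean distance between test row i and training row n
            distance = np.sqrt(np.sum((test_rows[i] - train_rows[n]) ** 2))
            if distance < min_distance:
                min_distance = distance
                index = n
        predicted.append(train_labels[index])
    return predicted

# Made-up data: two training rows with labels 1.0 and 2.0, three test rows.
train = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([1.0, 2.0])
test = np.array([[1.0, 1.0], [9.0, 8.0], [0.5, -0.5]])

result = nearest_labels(test, train, labels)
```

Comparing result with testing_labels would then show how many test rows were labelled correctly.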
