
How to make Python only return valid data using np.loadtxt

I have written code that loads a data file and prints an error message whenever a row does not meet specific requirements. However, I also need the function to return only the valid rows. I cannot find a way to do this on my own, so I hope someone can help me. My code is as follows:

import numpy as np
def dataLoad(filename):
    initialData = np.loadtxt(filename)

    for i in range (len(initialData)):
        if initialData[i,0]>60 or initialData[i,0]<10:
            print("Temperature must be between 10 and 60. Error in column 1 row {}.".format(initialData.shape[0]))
        if initialData[i,1]<0:
            print("Bacteria Growth Rate must be higher than, or equal to 0. Error in column 2 row row {}.".format(initialData.shape[0]))
        if initialData[i,2] not in [1, 2, 3, 4]:
            print("Bacteria Category must be one of the numbers: 1, 2, 3 and 4. Error in column 3 row {}.".format(initialData.shape[0]))
        else:
            pass
    data = initialData.reshape(-1,3)
    return data

One solution is to keep track of the indices of the invalid rows by storing them in a list, and then to delete those rows from initialData right before the return using np.delete. The function takes an array, a list of indices to delete, and an axis along which to delete them, which in this case is 0 because you are deleting rows. Note that np.delete returns a new array rather than modifying the input in place, so the result has to be assigned back.
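To see how np.delete behaves on its own, here is a minimal sketch. The array `sample` and the list `bad_rows` are made-up stand-ins for the loaded file and the collected indices; they are not part of the original code.

```python
import numpy as np

# Hypothetical 4x3 array standing in for the loaded data file
# (columns: temperature, growth rate, bacteria category).
sample = np.array([
    [25.0, 0.5, 1.0],
    [70.0, 0.3, 2.0],   # temperature out of range
    [30.0, -0.1, 3.0],  # negative growth rate
    [40.0, 0.8, 4.0],
])

# Row indices to drop. An index listed twice (a row that failed
# two checks) is simply removed once.
bad_rows = [1, 2, 2]

# np.delete returns a new array and leaves `sample` unchanged.
cleaned = np.delete(sample, bad_rows, axis=0)
print(cleaned)
```

Only the first and last rows remain in `cleaned`, while `sample` still has all four rows.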

Your code would then look something like this:

import numpy as np

def dataLoad(filename):
    del_indices = []  # indices of rows that fail at least one check
    initialData = np.loadtxt(filename)

    for i in range(len(initialData)):
        # Report 1-based row numbers (i + 1) to match the 1-based column numbers
        if initialData[i, 0] > 60 or initialData[i, 0] < 10:
            print("Temperature must be between 10 and 60. Error in column 1 row {}.".format(i + 1))
            del_indices.append(i)
        if initialData[i, 1] < 0:
            print("Bacteria Growth Rate must be higher than, or equal to 0. Error in column 2 row {}.".format(i + 1))
            del_indices.append(i)
        if initialData[i, 2] not in [1, 2, 3, 4]:
            print("Bacteria Category must be one of the numbers: 1, 2, 3 and 4. Error in column 3 row {}.".format(i + 1))
            del_indices.append(i)

    # Remove the invalid rows (axis=0) before returning
    initialData = np.delete(initialData, del_indices, axis=0)
    data = initialData.reshape(-1, 3)
    return data
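As an alternative to collecting indices in a loop, the same filtering can be sketched with boolean masks. This is a different technique from the np.delete approach above, not a correction of it. The function name `dataLoadMasked` and the in-memory sample file are assumptions made for illustration; `ndmin=2` is passed to np.loadtxt so that a file with a single row still yields a two-dimensional array.

```python
import io
import numpy as np

def dataLoadMasked(filename):
    # Hypothetical mask-based variant of dataLoad.
    data = np.loadtxt(filename, ndmin=2)

    # One boolean array per rule; True means the row passes that rule.
    checks = [
        ("Temperature must be between 10 and 60. Error in column 1",
         (data[:, 0] >= 10) & (data[:, 0] <= 60)),
        ("Bacteria Growth Rate must be higher than, or equal to 0. Error in column 2",
         data[:, 1] >= 0),
        ("Bacteria Category must be one of the numbers: 1, 2, 3 and 4. Error in column 3",
         np.isin(data[:, 2], [1, 2, 3, 4])),
    ]

    valid = np.ones(len(data), dtype=bool)
    for message, ok in checks:
        # Print one message per failing row, using 1-based row numbers.
        for row in np.flatnonzero(~ok):
            print("{} row {}.".format(message, row + 1))
        valid &= ok

    # Boolean indexing keeps only the rows that passed every check.
    return data[valid]

# Made-up sample data; np.loadtxt also accepts file-like objects.
sample_file = io.StringIO("25 0.5 1\n70 0.3 2\n30 -0.1 3\n40 0.8 5\n45 1.2 4\n")
valid_rows = dataLoadMasked(sample_file)
print(valid_rows)
```

In this sample, rows 2, 3 and 4 each fail one rule, so only the first and last rows are returned. Unlike the loop version, the messages come out grouped by rule rather than by row.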

