Python | numpy | list indices must be integers, not tuple

My data file looks like this:

#weight, height and gender
45 145 f
89 154 m
56 163 m
-1 165 f
65 175 m
-1 125 m
65 169 f

As you can see, two entries have a weight of -1. These are outliers, and I want to remove those rows entirely. I read the file with NumPy's np.loadtxt, like this:

data = np.loadtxt('whData.dat',dtype=np.object,comments='#',delimiter=None)
X = data[:,0:2].astype(np.float)
y = data[:,2]
X = X.T
...

To remove the outliers, I defined a function that iterates over the data and returns only the rows that are not outliers:

def remove_outlier2(data):
    non_outlier = []
    for x in data:
        if x[0] != '-1':
            non_outlier.append(x)
    return non_outlier

I call it right after loading the data from the file:

data = np.loadtxt('whData.dat',dtype=np.object,comments='#',delimiter=None)
data = remove_outlier2(data)
np.asarray(data)
X = data[:,0:2].astype(np.float)
y = data[:,2]
X = X.T
...

But now I get this error, which I am not able to resolve:

Traceback (most recent call last):

  File "<ipython-input-2-2aec95447a79>", line 1, in <module>
    runfile('C:/Users/xxx/py_workspace/pattern/whExample.py', wdir='C:/Users/xxx/py_workspace/pattern')

  File "C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 699, in runfile
    execfile(filename, namespace)

  File "C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "C:/Users/xxx/py_workspace/pattern/whExample.py", line 79, in <module>
    X = data[:,0:2].astype(np.float)

TypeError: list indices must be integers, not tuple

I also tried printing the data right after reading it from the file, and it looks like this in Spyder:

[['45' '145' 'f']
 ['89' '154' 'm']
 ['56' '163' 'm']
 ['-1' '165' 'f']
 ['65' '175' 'm']
 ['-1' '125' 'm']
 ['65' '169' 'f']]

I tried googling to find out what I am doing wrong but couldn't figure it out. How can I resolve this?

Thanks
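
The error can be reproduced without the data file, which may make the cause easier to see. The sketch below uses a small hand-written list (the name rows is just an illustration) in place of the return value of remove_outlier2. It assumes Python 3, where the message reads "list indices must be integers or slices, not tuple".

```python
import numpy as np

# A plain Python list of rows, like the value remove_outlier2 returns.
rows = [["45", "145", "f"], ["-1", "165", "f"]]

try:
    rows[:, 0:2]  # a list cannot be indexed with a tuple such as (slice, slice)
except TypeError as exc:
    list_error = str(exc)
    print(list_error)

# np.asarray builds and returns a new array; it does not modify rows.
np.asarray(rows)
still_list = isinstance(rows, list)
print(still_list)

# Keeping the returned array makes 2-D slicing work.
arr = np.asarray(rows, dtype=object)
X = arr[:, 0:2].astype(float)
print(X.shape)
```

So the slicing on line 79 of the script fails because data is still a list at that point, not because of anything in the file.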

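In the end, following suggestions in the comments, all I had to do was use the output of np.asarray(). remove_outlier2 returns a plain Python list, and np.asarray() does not convert it in place; it returns a new array, so the result has to be assigned back to data: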

data = np.loadtxt('whData.dat',dtype=np.object,comments='#',delimiter=None)
# remove outliers
data = remove_outlier2(data)
data = np.asarray(data)

With that change, things worked fine.
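
As an alternative that never leaves NumPy, the rows could also be filtered with a boolean mask instead of a Python loop. This is only a sketch under a few assumptions: the file contents are inlined through io.StringIO so it runs on its own, dtype=str is used instead of np.object, and the names sample, weights, keep and clean are made up for the example. Note that the np.object and np.float aliases were removed in NumPy 1.24, so on current versions the built-in object and float are needed anyway.

```python
import io
import numpy as np

# Stand-in for whData.dat so the sketch runs without the file.
sample = io.StringIO(
    "#weight, height and gender\n"
    "45 145 f\n"
    "89 154 m\n"
    "56 163 m\n"
    "-1 165 f\n"
    "65 175 m\n"
    "-1 125 m\n"
    "65 169 f\n"
)

# Read every column as text; numeric columns are converted afterwards.
data = np.loadtxt(sample, dtype=str, comments="#", delimiter=None)

# Boolean mask: True for rows whose weight is not the -1 placeholder.
weights = data[:, 0].astype(float)
keep = weights != -1

# Indexing with the mask returns a new ndarray, so 2-D slicing still works.
clean = data[keep]
X = clean[:, 0:2].astype(float).T
y = clean[:, 2]

print(X.shape)
print(y)
```

Because clean is an ndarray from the start, the np.asarray() step is not needed in this variant.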
