
Zip scikit-learn datasets

I'm trying to run the following code snippet from a tutorial in Python 3.3:

>>> import numpy as np
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> np.array(zip(iris.data, iris.target))[0:10]

In Python 2.7 it returns the following output:

array([[array([ 5.1,  3.5,  1.4,  0.2]), 0],
       [array([ 4.9,  3. ,  1.4,  0.2]), 0],
       [array([ 4.7,  3.2,  1.3,  0.2]), 0],
       [array([ 4.6,  3.1,  1.5,  0.2]), 0],
       [array([ 5. ,  3.6,  1.4,  0.2]), 0],
       [array([ 5.4,  3.9,  1.7,  0.4]), 0],
       [array([ 4.6,  3.4,  1.4,  0.3]), 0],
       [array([ 5. ,  3.4,  1.5,  0.2]), 0],
       [array([ 4.4,  2.9,  1.4,  0.2]), 0],
       [array([ 4.9,  3.1,  1.5,  0.1]), 0]], dtype=object)

But in 3.3 it returns:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 0-dimensional arrays can't be indexed

I'm new to Python and I know there are differences between 2.x and 3.x. I assumed this was related to the changes to the print function, but I'd appreciate an explanation of what's happening here and how I can get it to run in 3.3.

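The problem here is that in Python 3 zip returns an iterator rather than a list. np.array does not consume the iterator; it wraps the zip object itself in a 0-dimensional object array, which cannot be indexed. You need to convert it to a list first: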

In [194]: np.array(list(zip(iris.data, iris.target)))[0:10]
Out[194]:
array([[array([ 5.1,  3.5,  1.4,  0.2]), 0],
       [array([ 4.9,  3. ,  1.4,  0.2]), 0],
       [array([ 4.7,  3.2,  1.3,  0.2]), 0],
       [array([ 4.6,  3.1,  1.5,  0.2]), 0],
       [array([ 5. ,  3.6,  1.4,  0.2]), 0],
       [array([ 5.4,  3.9,  1.7,  0.4]), 0],
       [array([ 4.6,  3.4,  1.4,  0.3]), 0],
       [array([ 5. ,  3.4,  1.5,  0.2]), 0],
       [array([ 4.4,  2.9,  1.4,  0.2]), 0],
       [array([ 4.9,  3.1,  1.5,  0.1]), 0]], dtype=object)
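
To make the difference concrete, here is a minimal sketch of what NumPy does with a zip object versus a list. It uses three hand-typed rows as stand-in data (an assumption, so it runs without scikit-learn); the names `lazy`, `paired` and `stacked` are purely illustrative. Recent NumPy releases refuse to build ragged arrays implicitly, so the sketch passes dtype=object explicitly; on the versions used in this thread that argument was not needed.

```python
import numpy as np

# Stand-in data (an assumption, so the sketch runs without scikit-learn);
# iris.data and iris.target have the same layout: 2-D features, 1-D labels.
data = np.array([[5.1, 3.5, 1.4, 0.2],
                 [4.9, 3.0, 1.4, 0.2],
                 [4.7, 3.2, 1.3, 0.2]])
target = np.array([0, 0, 0])

# Python 3: np.array() does not iterate over the zip object. It stores the
# object itself in a 0-dimensional array, so slicing raises IndexError.
lazy = np.array(zip(data, target))
print(lazy.ndim, lazy.dtype)

try:
    lazy[0:10]
    slicing_failed = False
except IndexError:
    slicing_failed = True

# Materialising the pairs first gives one row per sample.
# dtype=object is spelled out because each row mixes an array and a scalar.
paired = np.array(list(zip(data, target)), dtype=object)
print(paired.shape)

# Alternative (not what the tutorial does): a plain float array with the
# label appended as a fifth column.
stacked = np.column_stack((data, target))
print(stacked.shape)
```

If you only need features and labels side by side, the column_stack form avoids object arrays entirely and is usually easier to work with afterwards.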

The behaviour of zip changed in Python 3. Note that I got a different error from yours when I ran your code, most likely because we are on different NumPy versions:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-193-ec320a0afa3a> in <module>()
      2 from sklearn import datasets
      3 iris = datasets.load_iris()
----> 4 np.array(zip(iris.data, iris.target))[0:10]

IndexError: too many indices for array

Also, a lot more than print has changed in Python 3.
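
As a rough illustration of that last point, the sketch below uses only the standard library to show that a Python 3 zip object is a one-shot iterator, and that a few other built-ins which returned lists in 2.x are lazy as well. The variable names are illustrative only.

```python
pairs = zip([1, 2, 3], "abc")

# A zip object cannot be indexed or sliced the way a list can.
try:
    pairs[0]
    indexable = True
except TypeError:
    indexable = False

first_pass = list(pairs)    # consumes the iterator
second_pass = list(pairs)   # already exhausted, so this is empty

# map, filter and dict.keys() also stopped returning lists in Python 3.
lazy_types = [
    type(map(str, [1])).__name__,
    type(filter(None, [1])).__name__,
    type({}.keys()).__name__,
]

print(indexable, first_pass, second_pass, lazy_types)
```

When porting 2.x tutorial code, wrapping such calls in list() where a real list is expected is usually enough.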
