
ValueError: Chunks and shape must be of the same length/dimension

I am reading the book "Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools". Chapter 4 contains this code for a blocked matrix calculation:

import dask.array as da
import bcolz as bc
import numpy as np
import dask

n = 1e4 #A

ar = bc.carray(np.arange(n).reshape(n/2,2), dtype='float64', rootdir='ar.bcolz', mode='w') #B
y  = bc.carray(np.arange(n/2), dtype='float64', rootdir='yy.bcolz', mode='w') #B

dax = da.from_array(ar, chunks=(5,5)) #C
dy = da.from_array(y,chunks=(5,5)) #C

XTX = dax.T.dot(dax) #D
Xy  = dax.T.dot(dy) #E

coefficients = np.linalg.inv(XTX.compute()).dot(Xy.compute()) #F

coef = da.from_array(coefficients,chunks=(5,5)) #G

ar.flush() #H
y.flush() #H

predictions = dax.dot(coef).compute() #I
print(predictions)

I get a ValueError:

ValueError                                Traceback (most recent call last)
<ipython-input-4-7ae8e9cf2346> in <module>()
     10 
     11 dax = da.from_array(ar, chunks=(5,5)) #C
---> 12 dy = da.from_array(y,chunks=(5,5)) #C
     13 
     14 XTX = dax.T.dot(dax) #D

C:\Users\F\Anaconda3\lib\site-packages\dask\array\core.py in from_array(x, chunks, name, lock, fancy, getitem)
   1868     >>> a = da.from_array(x, chunks=(1000, 1000), lock=True)  # doctest: +SKIP
   1869     """
-> 1870     chunks = normalize_chunks(chunks, x.shape)
   1871     if len(chunks) != len(x.shape):
   1872         raise ValueError("Input array has %d dimensions but the supplied "

C:\Users\F\Anaconda3\lib\site-packages\dask\array\core.py in normalize_chunks(chunks, shape)
   1815             raise ValueError(
   1816                 "Chunks and shape must be of the same length/dimension. "
-> 1817                 "Got chunks=%s, shape=%s" % (chunks, shape))
   1818 
   1819     if shape is not None:

ValueError: Chunks and shape must be of the same length/dimension. Got chunks=(5, 5), shape=(5000,)

What is the problem?

The problem is here:

np.arange(n/2).reshape(n)

You create an array with n/2 elements and then try to reshape it to n elements. reshape can only rearrange the existing elements; it cannot change how many there are.
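To illustrate the rule, here is a minimal sketch in NumPy. The size n = 10 is made up to keep the example small, and integer division (n // 2) is used so the dimensions are integers:

```python
import numpy as np

n = 10  # small illustrative size, not the value from the question

a = np.arange(n)           # 10 elements
b = a.reshape(n // 2, 2)   # 5 * 2 == 10 elements, so this is allowed

try:
    # 5 elements cannot be rearranged into a shape that holds 10
    np.arange(n // 2).reshape(n)
except ValueError as exc:
    error = str(exc)       # NumPy reports the size/shape mismatch here
```

The product of the new dimensions has to equal the original element count, otherwise reshape raises ValueError.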

The reshape(n) call is probably a copy/paste mistake: it isn't in the code you posted, which instead does

np.arange(n).reshape(n/2,2)

That works as long as n is even (be careful: if n is odd, this fails too).
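Separately, the traceback in the question reports "Got chunks=(5, 5), shape=(5000,)": y is one-dimensional, so da.from_array needs a chunks tuple with one entry, not two. A minimal sketch of that fix follows. It assumes plain NumPy arrays in place of the bcolz carrays, a reasonably current dask release, and an integer n with n // 2, since current NumPy rejects float dimensions. The chunk sizes are illustrative only:

```python
import numpy as np
import dask.array as da

n = 10_000  # an int, so n // 2 is a valid array dimension

# Assumption: NumPy arrays stand in for the on-disk bcolz carrays.
X = np.arange(n, dtype="float64").reshape(n // 2, 2)  # shape (5000, 2)
y = np.arange(n // 2, dtype="float64")                # shape (5000,)

# One chunk size per dimension: two entries for X, one entry for y.
dax = da.from_array(X, chunks=(5, 2))
dy = da.from_array(y, chunks=(5,))

XTX = dax.T.dot(dax)  # lazy, shape (2, 2)
Xy = dax.T.dot(dy)    # lazy, shape (2,)

# solve() is used here instead of inv().dot(); both target the same system.
coefficients = np.linalg.solve(XTX.compute(), Xy.compute())
```

The coefficients array is also one-dimensional, so the later da.from_array(coefficients, chunks=(5,5)) call at #G would hit the same error and needs a one-entry chunks tuple as well.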
