
Replicating a matrix in pandas or numpy to a certain size

I have a matrix A of shape (41, 41), stored as a pandas DataFrame.

B is a matrix of shape (7154, 8240), stored as a NumPy ndarray.

I want to replicate A (keeping the whole 41x41 matrix intact) up to the size of B. It will not fit exactly, so the rows (and columns) that do not fit should simply be clipped.

This is to be able to multiply A*B.

I tried the code below, but it fails: the / division returns a float, and a list cannot be multiplied by a float.

repeat = pd.concat([A]*(B.shape[0]/A.shape[0]), axis=0, ignore_index=True)
filter_large = pd.concat([repeat]*(B.shape[1]/A.shape[1]), axis=1, ignore_index=True)

filter_l = filter_large.values   # change from a dataframe to a numpy array
AB = B*filter_l   # element-wise product of B with the replicated A

I should mention that I've tried numpy.resize, but it does not keep the matrix intact: it mixes up the rows, which is not what I want.

This code will do what you ask for:

import numpy as np
shapeMultiples = (np.ceil(B.shape[0]/A.shape[0]).astype(int), np.ceil(B.shape[1]/A.shape[1]).astype(int))  # repetitions per axis, rounded up
res = np.tile(A, shapeMultiples)[:B.shape[0], :B.shape[1]]  # tile A, then clip to B's shape

Explanation:

np.tile(A, reps) repeats the matrix A multiple times along each axis. How often it is repeated is specified for each axis in reps.
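As a small illustration of this behaviour, here is a sketch with a 2x2 stand-in for A (the values are made up for the example). It also contrasts np.tile with np.resize from the question, which refills the new shape from the flattened data and therefore does not keep the block intact.

```python
import numpy as np

# Small stand-in for A; the real A in the question is 41x41.
A = np.array([[1, 2],
              [3, 4]])

# reps=(2, 3): 2 copies stacked along axis 0, 3 copies side by side along axis 1.
tiled = np.tile(A, (2, 3))
print(tiled.shape)  # (4, 6)
print(tiled)

# np.resize fills the target shape from the flattened values (1, 2, 3, 4, 1, ...),
# so the 2x2 block structure is lost.
resized = np.resize(A, (4, 6))
print(resized)
```

Every 2x2 block of tiled is an untouched copy of A, while the rows of resized no longer line up with the rows of A.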

For your example, A should be repeated B.shape[0]/A.shape[0] times along axis 0 and B.shape[1]/A.shape[1] times along axis 1. These values must be rounded up so that the tiled result is at least as large as B, which is what np.ceil does. Since reps is expected to be a tuple of integers but np.ceil returns floats, the results have to be cast to int.
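For the shapes given in the question, the rounding can be checked by hand. The sketch below uses only the two shapes from the question; the variable names are illustrative, and the integer-only ceiling division is an alternative to np.ceil, not part of the original answer.

```python
import math

a_shape = (41, 41)       # shape of A from the question
b_shape = (7154, 8240)   # shape of B from the question

# Round each ratio up: 7154/41 is about 174.5, 8240/41 is about 200.98.
reps = tuple(math.ceil(b / a) for a, b in zip(a_shape, b_shape))
print(reps)  # (175, 201)

# Equivalent ceiling division using integers only (no float round trip).
reps_int = tuple(-(-b // a) for a, b in zip(a_shape, b_shape))

# Shape of the tiled array before clipping; it is at least as large as B.
tiled_shape = tuple(r * a for r, a in zip(reps, a_shape))
print(tiled_shape)  # (7175, 8241)
```

Rounding down instead would give (174, 200) repetitions and a tiled array of shape (7134, 8200), which is smaller than B.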

In the final step, the result is cut off to fit the size of B with [:B.shape[0], :B.shape[1]].
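Putting the steps together, a minimal end-to-end sketch might look as follows. The helper name tile_to_shape and the small stand-in arrays are assumptions made for illustration; the question's A is a DataFrame, which np.asarray (or A.to_numpy()) converts to an ndarray first.

```python
import numpy as np

def tile_to_shape(a, target_shape):
    """Hypothetical helper: tile 2-D `a` whole, then clip to `target_shape`."""
    a = np.asarray(a)  # also accepts a pandas DataFrame
    # Repetitions per axis, rounded up with integer ceiling division.
    reps = tuple(-(-t // s) for s, t in zip(a.shape, target_shape))
    # Tile, then cut off the rows and columns that do not fit.
    return np.tile(a, reps)[:target_shape[0], :target_shape[1]]

# Small stand-ins: A plays the role of the 41x41 matrix, B of the large array.
A = np.arange(6).reshape(2, 3)
B = np.ones((5, 7))

res = tile_to_shape(A, B.shape)
AB = res * B  # element-wise product, now that the shapes match
print(res.shape, AB.shape)
```

Each element res[i, j] equals A[i % 2, j % 3] here, which is a quick way to confirm that the copies of A stay intact up to the clipped edge.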
