Issues appending numpy arrays during for loop

I'm a bit lost at the moment. I initialized an empty numpy array and I believe I'm using the np.append function correctly:

Preds = np.empty(shape = (X_test.shape[0],10))

kf = KFold(n = X_train.shape[0], n_folds=10, shuffle = True)

for kf_train, kf_test in kf:

    X_train_kf = X_train.iloc[kf_train]
    Y_train_kf = Y_train.iloc[kf_train]

    dt = tree.DecisionTreeClassifier(max_depth=2)
    dt.fit(X_train_kf, Y_train_kf)
    Preds = np.append(Preds,dt.predict(X_test))

    print Preds

Just some additional info:

  • X_test has a shape of (9649, 24)

  • (After running) Preds has a shape of (192980,)

At the end of this loop, Preds should have a shape of (9649,10)

Any advice would be much appreciated.

EDIT: Here is the updated solution

Preds = []
kf = KFold(n = X_train.shape[0], n_folds=20, shuffle = True)

for kf_train, kf_test in kf:

    X_train_kf = X_train.iloc[kf_train]
    Y_train_kf = Y_train.iloc[kf_train]

    dt = tree.DecisionTreeClassifier(max_depth=2)
    dt.fit(X_train_kf, Y_train_kf)
    Preds.append(dt.predict(X_test))

Preds = np.vstack(Preds)

If Preds has shape (9649,10), you can concatenate along either of two axes:

 newPreds = np.concatenate((Preds, np.zeros((N,10))), axis=0)
 newPreds = np.concatenate((Preds, np.zeros((9649,N))), axis=1)

The first produces a (9649+N, 10) array, the second a (9649, 10+N) array.
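As a quick check of the two cases above, here is a minimal sketch with small stand-in shapes (5 rows instead of 9649, and a hypothetical N = 3). The variable names are only for illustration.

```python
import numpy as np

N = 3                     # hypothetical number of rows/columns to add
preds = np.ones((5, 10))  # small stand-in for the (9649, 10) array

# axis=0 adds rows: both inputs must have 10 columns
more_rows = np.concatenate((preds, np.zeros((N, 10))), axis=0)

# axis=1 adds columns: both inputs must have 5 rows
more_cols = np.concatenate((preds, np.zeros((5, N))), axis=1)

print(more_rows.shape)  # (8, 10)
print(more_cols.shape)  # (5, 13)
```

Every dimension other than the one being joined has to match, otherwise np.concatenate raises a ValueError.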

np.vstack makes sure its inputs are 2d, i.e. it changes a (10,) array into a (1,10) array. np.append takes 2 arguments instead of a list, and makes sure the second is an array. Without an axis argument it flattens both inputs, which is why your Preds ended up 1d. It is better suited to adding a scalar to a 1d array than to general purpose concatenation.
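To make the difference concrete, here is a small sketch with stand-in shapes (4 rows instead of 9649). It shows that np.append without axis returns a flat array, which is consistent with the (192980,) result in the question: 9649*10 values from the np.empty array plus 10 predictions of 9649 values each.

```python
import numpy as np

preds = np.empty((4, 10))  # stand-in for the (9649, 10) array
new_pred = np.arange(4)    # stand-in for one dt.predict(X_test) result

# Without axis, np.append ravels both inputs and returns a 1-D array
flat = np.append(preds, new_pred)
print(flat.shape)  # (44,)

# np.vstack treats a 1-D array of shape (10,) as a single (1, 10) row
row = np.arange(10)
stacked = np.vstack((preds, row))
print(stacked.shape)  # (5, 10)
```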

Make sure you understand the shapes and number of dimensions of your arrays.

A good alternative is to append to a list

alist = []
alist.append(initial_array)
for ...
    alist.append(next_array)
result = np.concatenate(alist, axis=?)
# vstack, stack, and np.array can be used if dimensions are right

Appending to a list and joining once at the end is faster than repeated concatenation. Lists are designed to grow cheaply; an array can only grow by allocating a new, larger array and copying the existing data into it.
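Applied to the loop in the question, the list pattern might look like the sketch below. fake_predict is a hypothetical stand-in for dt.predict(X_test), and the sizes are shrunk (6 samples, 4 folds) so it runs without scikit-learn. Note that np.vstack puts each prediction in a row, so the result is (n_folds, n_samples); np.column_stack (or a transpose) gives the (n_samples, n_folds) layout the question asks for.

```python
import numpy as np

n_samples, n_folds = 6, 4  # small stand-ins for 9649 and 10

def fake_predict(fold):
    """Hypothetical stand-in for dt.predict(X_test): one 1-D array per fold."""
    return np.full(n_samples, fold)

preds = []
for fold in range(n_folds):
    preds.append(fake_predict(fold))  # cheap list append, no array copy

by_rows = np.vstack(preds)        # one row per fold: (n_folds, n_samples)
by_cols = np.column_stack(preds)  # one column per fold: (n_samples, n_folds)

print(by_rows.shape)  # (4, 6)
print(by_cols.shape)  # (6, 4)
```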
