Python: Split a dataset into two datasets

Question

Given two datasets, training and testing, I want to divide the training dataset into xtrain and ytrain, and the testing dataset into xtest and ytest. I have the code for Octave:

X_tr = D_tr(:, 1:end-1);
y_tr = D_tr(:, end);
X_ts = D_ts(:, 1:end-1);
y_ts = D_ts(:, end);

but I don't understand how to convert it to Python.

Answer 1

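If you start from a single feature matrix X and target vector y, use sklearn.model_selection.train_test_split to split them into training and test sets: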

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.33, random_state=42)

Demo: how to split a data set horizontally (features vs. last column) using np.split:

In [67]: import numpy as np

In [68]: TR = np.random.randint(10, size=(5,5))

In [69]: TR
Out[69]:
array([[2, 9, 9, 0, 3],
       [5, 5, 6, 0, 3],
       [7, 1, 6, 1, 0],
       [5, 0, 2, 0, 4],
       [2, 5, 9, 4, 2]])

In [70]: X_tr, y_tr = np.split(TR, [-1], axis=1)

In [71]: X_tr
Out[71]:
array([[2, 9, 9, 0],
       [5, 5, 6, 0],
       [7, 1, 6, 1],
       [5, 0, 2, 0],
       [2, 5, 9, 4]])

In [72]: y_tr
Out[72]:
array([[3],
       [3],
       [0],
       [4],
       [2]])

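PS: the same technique works for splitting the test data set. Note that np.split returns y_tr as a 2-D column of shape (5, 1); call y_tr.ravel() if a 1-D array is needed.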
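
For a line-by-line translation of the Octave snippet in the question, plain NumPy slicing may be the closest match. The sketch below assumes D_tr and D_ts are 2-D NumPy arrays whose last column holds the target; the small arrays are made-up stand-ins, not the asker's data.

```python
import numpy as np

# Hypothetical stand-ins for D_tr and D_ts: rows are samples,
# the last column is the target.
D_tr = np.array([[2, 9, 9, 0, 3],
                 [5, 5, 6, 0, 3],
                 [7, 1, 6, 1, 0]])
D_ts = np.array([[5, 0, 2, 0, 4],
                 [2, 5, 9, 4, 2]])

# Octave: X_tr = D_tr(:, 1:end-1);  y_tr = D_tr(:, end);
X_tr = D_tr[:, :-1]   # all rows, every column except the last
y_tr = D_tr[:, -1]    # all rows, last column only (1-D array)

# Octave: X_ts = D_ts(:, 1:end-1);  y_ts = D_ts(:, end);
X_ts = D_ts[:, :-1]
y_ts = D_ts[:, -1]

print(X_tr.shape, y_tr.shape)
print(X_ts.shape, y_ts.shape)
```

Unlike Octave, indexing with a single integer (-1) drops that axis, so y_tr and y_ts come out 1-D. Using D_tr[:, -1:] instead would keep them as column arrays, matching the Octave result more literally.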
