Sklearn train test split

Question

I have loaded a dataset from UCI (Contraceptive Method Choice Data Set) into a pandas DataFrame and need to split it into training and test sets with sklearn's train_test_split. When I try using:

X_train, X_test, Y_train, Y_test = train_test_split(contraception_data, contraception_data.target, test_size = 0.5, random_state = 1)

I get this error when I run the code. Why does it happen?

AttributeError: 'DataFrame' object has no attribute 'target'

Answer 1

The error occurs because target is an attribute of the dataset objects returned by sklearn's own loaders (such as load_iris), not of a pandas DataFrame. The dataset in question has 10 columns, the last of which is the target variable (contraceptive method used).

Since there are no column names, the easiest way to select X and y is to use the iloc indexer of dataframes, which selects rows and columns by integer position or slice. Here, you can use [:, :-1] to obtain all rows and all columns but the last, and [:, -1] for all rows and only the last column.

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    contraception_data.iloc[:, :-1],  # features: every column except the last
    contraception_data.iloc[:, -1],   # target: the last column
    test_size=0.5,
    random_state=1,
)
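To check the slicing without downloading the file, here is a minimal sketch that runs the same split on a small synthetic DataFrame. The random data, the 20-row size, and the names rng, features, labels, X and y are assumptions made for illustration only; they are not part of the UCI dataset or the original question.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real data: 20 rows, 10 unnamed columns,
# with a class label (1, 2 or 3) in the last column.
rng = np.random.default_rng(0)
features = rng.integers(0, 5, size=(20, 9))
labels = rng.integers(1, 4, size=(20, 1))
contraception_data = pd.DataFrame(np.hstack([features, labels]))

X = contraception_data.iloc[:, :-1]  # all rows, every column except the last
y = contraception_data.iloc[:, -1]   # all rows, last column only

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.5, random_state=1
)

print(X_train.shape, X_test.shape)  # (10, 9) (10, 9)
```

With the real file, the only part that should need to change is how contraception_data is built; the iloc slicing and the call to train_test_split stay the same.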
