
Sklearn train test split

I have loaded a dataset from UCI (Contraceptive Method Choice Data Set) into a DataFrame and need to perform a train/test split with sklearn. When I try:

X_train, X_test, Y_train, Y_test = train_test_split(contraception_data, contraception_data.target, test_size = 0.5, random_state = 1)

Running this code raises the following error. Why?

AttributeError: 'DataFrame' object has no attribute 'target'

The dataset in question has 10 columns, the last of which is the target variable (contraceptive method used).

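The error occurs because .target is an attribute of the Bunch objects returned by sklearn's built-in dataset loaders (such as load_iris), not of a pandas DataFrame. Since your DataFrame has no column names, the easiest way to select X and y is the DataFrame's iloc indexer, which selects by integer position or slice. Here, [:, :-1] selects all rows and every column except the last, and [:, -1] selects all rows of the last column only.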

from sklearn.model_selection import train_test_split

# Features: every column except the last; target: the last column.
X = contraception_data.iloc[:, :-1]
y = contraception_data.iloc[:, -1]
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.5, random_state=1)
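
To check the slicing without the real file, here is a minimal self-contained sketch. It assumes a made-up DataFrame (demo_data, a hypothetical stand-in for contraception_data) with 10 unnamed columns of random integers. The values are not the UCI data; only the shape matters.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for contraception_data: 20 rows, 10 unnamed columns,
# with the last column playing the role of the target.
rng = np.random.default_rng(0)
demo_data = pd.DataFrame(rng.integers(0, 4, size=(20, 10)))

X = demo_data.iloc[:, :-1]  # all rows, every column except the last
y = demo_data.iloc[:, -1]   # all rows, last column only

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.5, random_state=1
)

print(X_train.shape, X_test.shape)  # (10, 9) (10, 9)
print(Y_train.shape, Y_test.shape)  # (10,) (10,)
```

With test_size=0.5 the 20 rows are divided evenly, so each feature set has 10 rows and 9 columns, and each target is a Series of length 10.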
