
sklearn train test split by year

I have a dataset that goes from 2016 to 2020 with a 'Year' column. I would like to use 2016-2017 as train data and 2018-2020 as test data. Is there any easy method to perform this data split?

You can use groupby on the year column and combine the 2016-2017 groups into the training set and the 2018-2020 groups into the test set. Alternatively, and more simply, you can filter the rows directly with isin:

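df_train = df[df['Year'].isin([2016, 2017])]
df_test = df[df['Year'].isin([2018, 2019, 2020])]

Note that isin takes a single list-like argument, so the years must be wrapped in a list, and the column name must match the one in your data ('Year' in the question).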
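As a fuller illustration, here is a minimal sketch of the same split using a boolean mask, followed by separating features from the label. Only the 'Year' column comes from the question; the toy data and the 'feature' and 'target' column names are made up for the example.

```python
import pandas as pd

# Hypothetical toy data: only 'Year' is from the question,
# 'feature' and 'target' are placeholder column names.
df = pd.DataFrame({
    "Year": [2016, 2017, 2018, 2019, 2020, 2016, 2020],
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    "target": [0, 1, 0, 1, 0, 1, 0],
})

# Rows up to and including 2017 go to train, everything else to test.
# Rows with a missing Year would compare as False and land in the test set.
train_mask = df["Year"] <= 2017
df_train = df[train_mask]
df_test = df[~train_mask]

# Separate the features from the label for each subset.
X_train, y_train = df_train[["feature"]], df_train["target"]
X_test, y_test = df_test[["feature"]], df_test["target"]
```

Because the split is decided by the year rather than at random, sklearn's train_test_split is not needed here; the resulting frames can be passed straight to an estimator's fit and predict methods.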

