
How to create a for loop to randomly select columns from the data frame

How can I write a for loop in Python that randomly selects columns from a data frame, and then selects different columns on the next iteration?

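First, extract the list of columns from the DataFrame, then pick a random contiguous slice of it and take the remaining columns as the second sublist:

import numpy as np
from random import randint

cols = df.columns  # df is your existing DataFrame
index1 = randint(0, len(cols) - 1)
# index2 > index1 keeps the slice non-empty and lets it reach the last column
index2 = randint(index1 + 1, len(cols))
sublist1 = cols[index1:index2]

# Columns not in sublist1 (np.setdiff1d returns them sorted)
sublist2 = np.setdiff1d(cols, sublist1)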

Another way is to use random.sample() and pass the desired length of the sublist. For example:

import random
import numpy as np

col = ['a','b','c','d','e','f','g']
sub_col = random.sample(col, 4)
# e.g. ['g', 'f', 'a', 'c'] (varies from run to run)
sub_col2 = np.setdiff1d(col, sub_col).tolist()
# for the sample above: ['b', 'd', 'e']

Now you can iterate over two different lists of columns, which don't have any common elements.
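
If more than two groups are needed, one possible sketch is to shuffle the column names once and walk through them in fixed-size chunks, so every iteration of the loop sees columns that no earlier iteration used. The helper name random_column_batches and its batch_size and seed parameters are hypothetical, introduced only for this illustration; it uses the standard library alone and works on any list of column names, for example list(df.columns).

```python
import random

def random_column_batches(columns, batch_size, seed=None):
    """Yield disjoint batches of column names in random order.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    rng = random.Random(seed)      # separate generator, optionally seeded
    shuffled = list(columns)       # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

col = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
batches = list(random_column_batches(col, 3, seed=0))
for batch in batches:
    # With a DataFrame you would select the columns with df[batch]
    print(batch)
```

The last batch may be shorter than batch_size when the number of columns is not an exact multiple of it.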
