Random rows based on unique values

Question

I want to pick 2 random but distinct person IDs from a DataFrame and put their rows into another DataFrame.

person_id	post	active
567	yes	inactive
678	yes	active
567	no	inactive
689	yes	active
680	yes	inactive
689	no	active

df['person_id'].sample(n=100, random_state=1)

This code does NOT restrict the sample to unique person_id values, and it returns only that one column. I need to sample a given number of unique values from that column and get a DataFrame that keeps all the other columns as well.

df.person_id.sample(n=100, random_state=1).groupby('person_id')

I tried this as well, but it creates a weird object.

Any tips?

Answer 1

import pandas as pd

df = pd.DataFrame({'person_id': [567, 678, 567, 689, 680, 689],
                   'post': ['yes', 'yes', 'no', 'yes', 'yes', 'no'],
                   'active': ['inactive', 'active', 'inactive', 'active', 'inactive', 'active']})

To select two random unique person ids:

selected = df['person_id'].drop_duplicates().sample(n=2)

To create a DataFrame with all rows for the selected person ids:

df[df['person_id'].isin(selected)]
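
The two steps above can be wrapped into one small helper. This is only a sketch: the function name sample_rows_by_id and its parameters are made up for illustration and are not part of pandas. Passing random_state makes the draw reproducible, as in the question's own attempt.

```python
import pandas as pd

df = pd.DataFrame({
    'person_id': [567, 678, 567, 689, 680, 689],
    'post': ['yes', 'yes', 'no', 'yes', 'yes', 'no'],
    'active': ['inactive', 'active', 'inactive', 'active', 'inactive', 'active'],
})


def sample_rows_by_id(frame, id_col, n, random_state=None):
    """Hypothetical helper: keep all rows belonging to n randomly chosen distinct ids."""
    # Reduce the column to its distinct values so each id can be drawn only once.
    unique_ids = frame[id_col].drop_duplicates()
    # Draw n distinct ids; random_state fixes the seed for repeatable results.
    chosen = unique_ids.sample(n=n, random_state=random_state)
    # Keep every row (with all columns) whose id was drawn.
    return frame[frame[id_col].isin(chosen)].reset_index(drop=True)


result = sample_rows_by_id(df, 'person_id', n=2, random_state=1)
print(result)
```

Note that sample draws without replacement by default, so it raises a ValueError if n is larger than the number of distinct ids in the column.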

solution1
0 2022-01-05 16:59:54