
How to randomly populate a categorical column in a pandas dataframe using pre-defined values

I have two pandas dataframes. The first contains the names of more than 50 cities, and the second contains customer details such as name, age, gender, salary, and profession. There is no common key between the two dataframes, and they differ in size. I want to add a new column named 'Customer City' to the customer details dataframe, with its values chosen from the cities dataframe. In other words, for each customer I want to pick a random city from the cities dataframe and store it in that new column.

Please suggest how this can be done in pandas.

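Just pick them from cities with NumPy's np.random.choice. I'm not sure what your cities dataframe looks like, so you might have to change that bit to work with what you have: np.random.choice expects a 1-D array, so pass the column that holds the city names rather than the whole dataframe.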

import numpy as np

# `cities` must be 1-D here (e.g. the city-name column); one city is drawn per row of df
df["Customer City"] = np.random.choice(cities, len(df))
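For a fuller picture, here is a minimal sketch of the same idea with both dataframes spelled out. The column name "City", the sample rows, and the seed are assumptions made up for illustration; swap in whatever your real dataframes use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two dataframes described in the question.
cities = pd.DataFrame({"City": ["Delhi", "Mumbai", "Pune", "Chennai"]})
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "Kiran", "Tara", "Dev"],
    "Age": [34, 28, 45, 39, 51, 23],
})

# Seeded only so the sketch is reproducible; drop the seed for fresh draws.
rng = np.random.default_rng(seed=0)

# Pass the 1-D array of city names; size=len(df) draws one city per customer.
# Sampling is with replacement by default, so the same city can appear twice.
df["Customer City"] = rng.choice(cities["City"].to_numpy(), size=len(df))

print(df)
```

Because the draw is with replacement, it does not matter that the two dataframes differ in size. If every customer needs a distinct city, passing replace=False should work, but only when there are at least as many cities as customers.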


 