
How to fill pandas dataframe columns with random dictionary values

I'm new to Pandas and would like to play with random text data. I am trying to add two new columns to a DataFrame df, filled from key-value pairs randomly selected from a dictionary: the key goes in the first new column and its value in the second.

countries = {'Africa':'Ghana','Europe':'France','Europe':'Greece','Asia':'Vietnam','Europe':'Lithuania'}

My df already has two columns, and I'd like to end up with something like this:

    Year Approved Continent    Country
0   2016      Yes    Africa      Ghana
1   2016      Yes    Europe  Lithuania
2   2017       No    Europe     Greece

I could certainly use a for or while loop to fill df['Continent'] and df['Country'], but I suspect .apply() and np.random.choice could provide a simpler, more pandorable solution.

Yep, you're right. You can use np.random.choice with map:

df

    Year Approved
0   2016      Yes
1   2016      Yes
2   2017       No

import numpy as np
df['Continent'] = np.random.choice(list(countries), len(df))  # one random key per row
df['Country'] = df['Continent'].map(countries)  # look up each key's value

df

    Year Approved Continent    Country
0   2016      Yes    Africa      Ghana
1   2016      Yes      Asia    Vietnam
2   2017       No    Europe  Lithuania

This picks len(df) keys at random (with replacement) from the dictionary's keys, then uses the dictionary as a mapper to look up the country for each picked key.
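Here is a self-contained sketch of the same approach, assuming the three-row df from the question. One thing worth knowing: a Python dict keeps only the last value for a repeated key, so the countries literal from the question collapses to three entries and Europe can only ever map to Lithuania. The fixed seed is an assumption added so reruns give the same picks.

```python
import numpy as np
import pandas as pd

# The literal repeats 'Europe'; Python keeps only the last value per key.
countries = {'Africa': 'Ghana', 'Europe': 'France', 'Europe': 'Greece',
             'Asia': 'Vietnam', 'Europe': 'Lithuania'}
print(countries)  # {'Africa': 'Ghana', 'Europe': 'Lithuania', 'Asia': 'Vietnam'}

df = pd.DataFrame({'Year': [2016, 2016, 2017], 'Approved': ['Yes', 'Yes', 'No']})

np.random.seed(0)  # assumption: seeded only for reproducibility
df['Continent'] = np.random.choice(list(countries), len(df))  # random keys
df['Country'] = df['Continent'].map(countries)                # matching values
print(df)
```

If you need France and Greece to be selectable too, the pairs have to live in something other than a dict keyed by continent, for example a list of tuples.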

You could also try using DataFrame.sample():

import pandas as pd

df.join(
    pd.DataFrame(list(countries.items()), columns=["Continent", "Country"])
    .sample(len(df), replace=True)
    .reset_index(drop=True)
)

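Note that this returns a new DataFrame rather than modifying df in place. It can be made faster if your continent-country map is already a DataFrame, since the conversion step is then skipped.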
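As a rough sketch of the "already a DataFrame" case: the lookup table below (named pairs here, a hypothetical name) holds one row per continent-country pair, so repeated continents such as Europe are all kept. The random_state value is an assumption for reproducibility.

```python
import pandas as pd

df = pd.DataFrame({'Year': [2016, 2016, 2017], 'Approved': ['Yes', 'Yes', 'No']})

# Hypothetical lookup table: one row per (continent, country) pair.
pairs = pd.DataFrame(
    [('Africa', 'Ghana'), ('Europe', 'France'), ('Europe', 'Greece'),
     ('Asia', 'Vietnam'), ('Europe', 'Lithuania')],
    columns=['Continent', 'Country'],
)

# Draw len(df) rows with replacement, then reset the index to 0..n-1
# so the sampled rows line up with df's index in the join.
picked = pairs.sample(len(df), replace=True, random_state=0).reset_index(drop=True)
result = df.join(picked)
print(result)
```

The join aligns on the index, so this relies on df having the default 0..n-1 index; with a different index you would need to assign df.index to the sampled rows before joining.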


If you're on Python 3.6 or later, another method would be to use random.choices():

from random import choices

df.join(
    pd.DataFrame(choices([*countries.items()], k=len(df)), columns=["Continent", "Country"])
)

random.choices() is similar to numpy.random.choice(), except that you can pass it a list of key-value tuples, whereas numpy.random.choice() only accepts 1-D arrays.
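To make that difference concrete, here is a small sketch assuming a shortened list of pairs. It shows random.choices() sampling whole tuples, numpy.random.choice() rejecting the same list, and one possible workaround (sampling positions instead of tuples). The variable names and the seed are illustrative assumptions.

```python
import random

import numpy as np
import pandas as pd

df = pd.DataFrame({'Year': [2016, 2016, 2017], 'Approved': ['Yes', 'Yes', 'No']})
pairs = [('Africa', 'Ghana'), ('Europe', 'France'), ('Asia', 'Vietnam')]

# random.choices samples whole tuples, with replacement.
random.seed(0)  # assumption: seeded only for reproducibility
picked = random.choices(pairs, k=len(df))
with_choices = df.join(pd.DataFrame(picked, columns=['Continent', 'Country']))

# np.random.choice treats the list of tuples as a 2-D array and rejects it.
try:
    np.random.choice(pairs, len(df))
    numpy_accepts_pairs = True
except ValueError:
    numpy_accepts_pairs = False

# Workaround: sample positions into the list, then look the tuples up.
idx = np.random.choice(len(pairs), len(df))
with_numpy = df.join(
    pd.DataFrame([pairs[i] for i in idx], columns=['Continent', 'Country'])
)
print(with_choices)
print(with_numpy)
```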
