I have a Pandas data frame in which I would like to update a column. Currently the format looks like the example below, with many more lines. If the value equals D, I would like to randomly choose a replacement from a list. For example:
Values
A
B
C
D
my_list = ["E", "F", "G"]
df['Values'] = np.where(df['Values'].str.contains("D"), random.choice(my_list), df['Values'])
When I do this, it only picks one value, say "F", and replaces every "D" with it. I would like to go row by row so that the replacements are distributed randomly. For example, if I am replacing 100 D's I might get 40 E's, 25 F's and 35 G's. Any thoughts on how I can tweak this?
Thanks!
You can assign through a boolean mask, drawing one random value for each matching row:
import numpy as np
m = df['Values'].str.contains("D")  # boolean mask of the rows to replace
df.loc[m, 'Values'] = np.random.choice(my_list, m.sum())  # one draw per matching row
df
Out[27]:
Values
0 A
1 B
2 F
3 E
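For reference, the original attempt picks a single value because random.choice(my_list) is evaluated once, before np.where runs, so that one result is broadcast to every matching row. Below is a minimal self-contained sketch of the mask-based approach. The sample frame, the seed, and the use of np.random.default_rng are assumptions made for illustration, and it matches with == rather than str.contains because the question asks about values that equal D.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame has many more rows.
df = pd.DataFrame({"Values": ["A", "B", "C", "D", "D", "D"]})
my_list = ["E", "F", "G"]

# Seeded generator so the run is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(0)

# Boolean mask of the rows to replace. == is an exact match;
# str.contains("D") would also match longer values such as "AD".
mask = df["Values"] == "D"

# Draw one independent choice per matching row and assign through the mask.
df.loc[mask, "Values"] = rng.choice(my_list, size=mask.sum())

print(df)
```

Because each matching row gets its own draw, the replacements will be spread across the list roughly evenly on a large frame, though the exact counts vary from run to run.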