Python: fill NaN in dataframe with random values picked from the same column

I have a dataframe with some NaN values like the one below, and I would like to fill the NaN values in a column with random picks from the same column, e.g. randomly pick values from Col1 to fill in the NaN values in Col1.

   Col1      Col2      Col3      Col4   Col5
0  -0.671603 -0.792415  0.783922 NaN    Blue
1   0.207720       NaN  0.996131 Tom    Yellow
2  -0.892115 -1.282333       NaN Julia  NaN
3  -0.315598 -2.371529 -1.959646 NaN    Pink
4        NaN       NaN -0.584636 NaN    Orange
5   0.314736 -0.692732 -0.303951 Jim    NaN
6   0.355121       NaN       NaN NaN    Red
7        NaN -1.900148  1.230828 Sophia NaN
8  -1.795468  0.490953       NaN Anne   Blue
9  -0.678491 -0.087815       NaN NaN    NaN
10  0.755714  0.550589 -0.702019 NaN    Pink
11  0.951908 -0.529933  0.344544 Tobi   Yellow
12       NaN  0.075340 -0.187669 Jon    Red
13       NaN  0.314342 -0.936066 NaN    Yellow
14       NaN  1.293355  0.098964 Peter  Orange

Any ideas?

I have tried something like this:

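import numpy as np
import pandas as pd

# df is the dataframe shown above; col_name is the column to fill, e.g. "Col1"
mask = df[col_name].isna()
num_nan = int(mask.sum())
# pick random values from e.g. Col1 that are not NaN (with replacement)
picks = df[col_name].dropna().sample(n=num_nan, replace=True, random_state=1)
# replace the NaN values in e.g. Col1 with the picked values,
# assigning by position so the index labels of the picks are ignored
df.loc[mask, col_name] = picks.to_numpy()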

to replace the NaN values in a column with a random pick from the same column.

You can try:

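import numpy as np

for c in df:
    # True where column c is missing a value
    mask = df[c].isna()
    # draw one replacement per NaN (with replacement) from the column's non-NaN values
    df.loc[mask, c] = np.random.choice(df.loc[~mask, c].to_numpy(), size=mask.sum())

print(df)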

Prints (for example):

        Col1      Col2      Col3    Col4    Col5
0  -0.671603 -0.792415  0.783922     Jon    Blue
1   0.207720 -1.900148  0.996131     Tom  Yellow
2  -0.892115 -1.282333 -0.702019   Julia     Red
3  -0.315598 -2.371529 -1.959646    Tobi    Pink
4  -0.892115  0.075340 -0.584636     Jon  Orange
5   0.314736 -0.692732 -0.303951     Jim    Pink
6   0.355121 -0.792415  0.344544     Tom     Red
7  -0.892115 -1.900148  1.230828  Sophia     Red
8  -1.795468  0.490953 -0.303951    Anne    Blue
9  -0.678491 -0.087815  0.344544     Jon  Yellow
10  0.755714  0.550589 -0.702019   Peter    Pink
11  0.951908 -0.529933  0.344544    Tobi  Yellow
12 -0.678491  0.075340 -0.187669     Jon     Red
13  0.951908  0.314342 -0.936066   Julia  Yellow
14 -0.892115  1.293355  0.098964   Peter  Orange
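
Since the picks are random, the output above will differ on every run. If repeatable results matter, the same idea can be wrapped in a function that takes a seed. This is only a sketch: the function name `fill_nan_with_random_picks`, the `seed` parameter, and the small example frame are made up for illustration and are not part of the original answer. It also skips columns that are entirely NaN, since there is nothing to draw from there.

```python
import numpy as np
import pandas as pd


def fill_nan_with_random_picks(df, seed=None):
    """Return a copy of df where each NaN is replaced by a random
    non-NaN value drawn (with replacement) from the same column."""
    rng = np.random.default_rng(seed)  # seeded generator for repeatable fills
    out = df.copy()
    for col in out.columns:
        mask = out[col].isna()
        pool = out.loc[~mask, col].to_numpy()
        # skip columns with nothing to fill or nothing to draw from
        if not mask.any() or pool.size == 0:
            continue
        out.loc[mask, col] = rng.choice(pool, size=int(mask.sum()))
    return out


# small illustrative frame, loosely based on the question's data
example = pd.DataFrame({
    "Col1": [-0.671603, 0.207720, np.nan, -0.315598, np.nan],
    "Col5": ["Blue", "Yellow", np.nan, "Pink", "Orange"],
})
filled = fill_nan_with_random_picks(example, seed=0)
print(filled)
```

Calling it twice with the same seed should give the same filled frame, and the input dataframe is left untouched because the function works on a copy.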
