Pandas: groupby and select a variable number of random values according to a column

Question

From this simple DataFrame df:

df = pd.DataFrame({'c':[1,1,2,2,2,2,3,3,3], 'n':[1,2,3,4,5,6,7,8,9], 'N':[1,1,2,2,2,2,2,2,2]})

I am trying to select N random values of n for each c. So far I have managed to group by c and get a single element per group:

sample = df.groupby('c').apply(lambda x :x.iloc[np.random.randint(0, len(x))])

which returns:

My expected output would be something like this:

That is, according to the N column, I would get 1 sample for c = 1 and 2 samples each for c = 2 and c = 3.

Answer 1

Pandas objects now have a .sample method that returns a random sample of rows:

>>> df.groupby('c').apply(lambda g: g.n.sample(g.N.iloc[0]))
c   
1  1    2
2  5    6
   2    3
3  6    7
   7    8
Name: n, dtype: int64