
Shuffling a pandas dataframe

I have the following dataframe:

df = pd.DataFrame({'A':range(10), 'B':range(10), 'C':range(10), 'D':range(10)})

I would like to shuffle the data using the function below:

import pandas as pd

import numpy as np

def shuffle(df, n=1, axis=0):
    df = df.copy()
    for _ in range(n):
        df.apply(np.random.shuffle, axis=axis)
    return df

However, I do not want to shuffle columns A and D, only columns B and C. Is there a way to do this by amending the function? In other words: if the column is 'A' or 'D', don't shuffle it.

Thanks

You could shuffle just the required columns, as below:

import numpy as np
import pandas as pd

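# the data
df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})

# shuffle: permutation returns a shuffled copy, which is assigned back
df['B'] = np.random.permutation(df['B'])
df['C'] = np.random.permutation(df['C'])

# or shuffle this way (in place)
# note: depending on the pandas version (for example with copy-on-write
# enabled), this may not write back to df; the assignment form above
# is the more dependable choice
np.random.shuffle(df.B)
np.random.shuffle(df.C)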
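
To check that only B and C change, one option is to seed a random generator and compare the frame before and after. This is a minimal sketch, not part of the original answer: it assumes NumPy's `default_rng` (NumPy 1.17 or later), and the seed value and the names `original`, `unchanged_ok` and `same_values` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})
original = df.copy()

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for repeatability

for col in ['B', 'C']:
    # permutation returns a shuffled copy; assigning it replaces the column
    df[col] = rng.permutation(df[col].to_numpy())

# A and D are untouched; B still holds the same values in some order
unchanged_ok = df[['A', 'D']].equals(original[['A', 'D']])
same_values = sorted(df['B']) == sorted(original['B'])
```

Both flags should come out `True`, since the loop never touches A or D and a permutation only reorders values.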

If you need to do the shuffling inside your shuffle function:

def shuffle(df, n=1):
    for _ in range(n):
        # shuffle B
        np.random.shuffle(df.B)
        # shuffle C
        np.random.shuffle(df.C)
        print(df.B, df.C)   # comment this out as needed

    return df

Columns A and D are never touched, so they stay as they are.
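
The "if column is 'A' or 'D' then don't shuffle" idea from the question can also be written as a function that skips a list of excluded columns. The sketch below is one possible way to do it, not the answerer's code: the name `shuffle_columns` and its `exclude` and `seed` parameters are hypothetical, and it works on a copy rather than modifying the input.

```python
import numpy as np
import pandas as pd


def shuffle_columns(df, exclude=('A', 'D'), n=1, seed=None):
    """Return a copy of df with every column not in `exclude` shuffled n times."""
    rng = np.random.default_rng(seed)
    out = df.copy()  # leave the caller's dataframe unchanged
    for _ in range(n):
        for col in out.columns:
            if col in exclude:
                continue  # skip the columns that must keep their order
            # assign a shuffled copy of the column's values back to the column
            out[col] = rng.permutation(out[col].to_numpy())
    return out


df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})
shuffled = shuffle_columns(df, exclude=('A', 'D'), seed=42)
```

Each column is shuffled independently here, so the values of B and C in a given row no longer correspond to each other. If B and C need to stay paired, draw a single permutation of row positions and apply it to both columns instead.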
