Shuffling a pandas dataframe
I have the following dataframe:
df = pd.DataFrame({'A':range(10), 'B':range(10), 'C':range(10), 'D':range(10)})
I would like to shuffle the data using the function below:
import pandas as pd
import numpy as np

def shuffle(df, n=1, axis=0):
    df = df.copy()
    for _ in range(n):
        df.apply(np.random.shuffle, axis=axis)
    return df
However, I do not want to shuffle columns A and D, only columns B and C. Is there a way to do this by amending the function? In other words: if the column is 'A' or 'D', don't shuffle it.
Thanks
You could shuffle just the required columns, as below:
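import numpy as np
import pandas as pd

# the data
df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})

# shuffle by assigning a permuted copy back to each column
df.B = np.random.permutation(df.B)
df.C = np.random.permutation(df.C)

# or shuffle in place; note this relies on df.B being a view of the
# column, which newer pandas versions (copy-on-write) may not provide
np.random.shuffle(df.B)
np.random.shuffle(df.C)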
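A pandas-only variant of the same idea is sketched here, assuming you are happy to assign the result back rather than shuffle in place. It uses `Series.sample(frac=1)`, which returns all rows in random order. The loop and the column list are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})

for col in ['B', 'C']:
    # sample(frac=1) returns every value of the column in random order.
    # .to_numpy() drops the index; without it, pandas would realign the
    # sampled values by index label and the column would look unshuffled.
    df[col] = df[col].sample(frac=1).to_numpy()
```

Passing `random_state=...` to `sample` should make the result reproducible if that matters for your use case.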
If you need to shuffle using your shuffle function:
def shuffle(df, n=1):
    for _ in range(n):
        # shuffle B
        np.random.shuffle(df.B)
        # shuffle C
        np.random.shuffle(df.C)
        print(df.B, df.C)  # comment this out as needed
    return df
You do not need to disturb columns A and D.
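To get closer to the "if column is 'A' or 'D' then don't shuffle" wording in the question, here is a minimal sketch of a more general function. The name `shuffle_columns` and the `skip` and `seed` parameters are hypothetical, introduced only for illustration. It works on a copy and assigns permuted values back, so it does not depend on in-place behaviour.

```python
import numpy as np
import pandas as pd

def shuffle_columns(df, skip=('A', 'D'), n=1, seed=None):
    """Return a copy of df with every column not in `skip` shuffled independently."""
    rng = np.random.default_rng(seed)  # seed=None gives a fresh random state
    out = df.copy()                    # leave the caller's dataframe untouched
    for _ in range(n):
        for col in out.columns:
            if col in skip:
                continue               # e.g. 'A' and 'D' stay as they are
            # permutation() returns a shuffled copy of the column's values
            out[col] = rng.permutation(out[col].to_numpy())
    return out

df = pd.DataFrame({'A': range(10), 'B': range(10),
                   'C': range(10), 'D': range(10)})
shuffled = shuffle_columns(df, seed=0)
```

The `n` argument is kept only to mirror the original signature; a single uniform shuffle is already fully random, so repeating it should not change the distribution of results. Each column is shuffled independently, so values of B and C that shared a row before will generally not share one afterwards.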