
pandas DataFrame: add rows that are shuffles of the values of specific columns

I have the following dataframe:

df = b_150  h_200  b_250  h_300  b_350  h_400  c1  c2  q4
     1.     2.     3.     4      5.     6.     3.  4.  4

I want to add rows in which the values of b_150, b_250, b_350 are shuffled among themselves, and likewise rows in which the values of h_200, h_300, h_400 are shuffled among themselves.

For example, calling:

df = add_shuffles(df, cols=['b_150', 'b_250', 'b_350'], n=1)
df = add_shuffles(df, cols=['h_200', 'h_300', 'h_400'], n=1)

would add 2 rows (one shuffle of the b columns and one shuffle of the h columns), giving:

df = b_150  h_200  b_250  h_300  b_350  h_400  c1  c2  q4
     1.     2.     3.     4      5.     6.     3.  4.  4
     3.     2.     5.     4      1.     6.     3.  4.  4
     1.     2.     3.     6      5.     4.     3.  4.  4

What is the most efficient way to do this?

Try:

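import random

import numpy as np
import pandas as pd

n = 2  # number of shuffled rows to add (not defined in the original snippet)

def columns_shuffler():
    # Shuffle the order of one randomly chosen group; keep the other as is.
    if random.random() < 0.5:
        return random.sample(cols[0], len(cols[0])) + cols[1]
    return cols[0] + random.sample(cols[1], len(cols[1]))

# Assumes df is the single-row frame from the question.
msk = df.columns.str.startswith('b_')
msk1 = df.columns.str.startswith('h_')
cols = [df.columns[msk].tolist(), df.columns[msk1].tolist()]
rest = df.columns[~(msk | msk1)].tolist()  # c1, c2, q4 stay unchanged

# Reading the columns in shuffled order and relabelling them in the original
# order permutes the values within the chosen group.
shuffled = np.array([df[columns_shuffler()].to_numpy().ravel() for _ in range(n)])
fixed = np.tile(df[rest].to_numpy(), (n, 1))
out = pd.concat([df, pd.DataFrame(np.c_[shuffled, fixed], columns=cols[0] + cols[1] + rest)],
                ignore_index=True)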
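
The question asks for an add_shuffles(df, cols, n) helper, which the snippet above does not define. One possible way to write it is sketched below. The function body, the seed parameter, and the choice of NumPy's Generator.permuted (NumPy 1.20 or later) are assumptions for illustration, not part of the original answer. Note that a random shuffle can occasionally reproduce the original order.

```python
import numpy as np
import pandas as pd


def add_shuffles(df, cols, n=1, seed=None):
    """Append n new rows per existing row, with the values of `cols` shuffled.

    Hypothetical helper matching the call shape in the question; `seed` is an
    extra parameter added here for reproducibility.
    """
    rng = np.random.default_rng(seed)
    cols = list(cols)
    # Repeat every original row n times; columns outside `cols` are copied as is.
    positions = np.repeat(np.arange(len(df)), n)
    new_rows = df.iloc[positions].reset_index(drop=True)
    # permuted(..., axis=1) shuffles each row independently of the others.
    new_rows[cols] = rng.permuted(new_rows[cols].to_numpy(), axis=1)
    return pd.concat([df, new_rows], ignore_index=True)


# Example frame with the values from the question (dtypes are an assumption).
df = pd.DataFrame({
    "b_150": [1.0], "h_200": [2.0], "b_250": [3.0],
    "h_300": [4.0], "b_350": [5.0], "h_400": [6.0],
    "c1": [3.0], "c2": [4.0], "q4": [4],
})
b_cols = ["b_150", "b_250", "b_350"]
h_cols = ["h_200", "h_300", "h_400"]
out = add_shuffles(df, cols=b_cols, n=1, seed=0)
```

Each call touches only the columns passed in cols, so calling it once for the b columns and once for the h columns keeps the two groups from mixing. Because every existing row is repeated, a second call on the result also shuffles the rows added by the first call; to get exactly the output shown in the question, apply each call to the original frame and concatenate the new rows.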


