Shuffle the rows of a dataframe in Python based on a column value, such that the rows with the same column value are together?

Here's the dataframe I have:

import pandas as pd

fruits=pd.DataFrame()
fruits['month']=['jan','feb','feb','march','jan','april','april','june','march','march','june','april']
fruits['fruit']=['apple','orange','pear','orange','apple','pear','cherry','pear','orange','cherry','apple','cherry']
fruits['price']=[30,20,40,25,30,45,60,45,25,55,37,60]

fruits

The rows in the dataframe should be shuffled, but rows with the same month should stay together. In other words, the order of the months should be shuffled, and then the rows within each month should be shuffled among themselves (a two-level shuffle).

The output dataframe should look something like this:

fruits_new=pd.DataFrame()
fruits_new['month']=['april','april','april','feb','feb','jan','jan','march','march','march','june','june']
fruits_new['fruit']=['cherry','pear','cherry','pear','orange','apple','apple','orange','orange','cherry','pear','apple']
fruits_new['price']=[60,45,60,40,20,30,30,25,25,55,45,37]

fruits_new

You can use pandas.DataFrame.sample with frac=1. It draws a random sample of the dataframe's rows, and frac=1 makes it return all of them, so the result is the whole dataframe in random order.

>>> df.sample(frac=1)

SAMPLE RUN:

#Initial dataframe
   0  1  2
0  5  6  A
1  5  8  B
2  6  6  C
3  6  9  D
4  5  8  E

>>> df.sample(frac=1)
#After shuffle
   0  1  2
0  5  6  A
4  5  8  E
1  5  8  B
3  6  9  D
2  6  6  C
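
Note that df.sample(frac=1) on its own shuffles every row independently, so rows with the same month are not guaranteed to end up next to each other. Below is a minimal sketch of the two-level shuffle described in the question, built from the question's sample data. The helper name shuffle_within_groups and its seed parameter are hypothetical, introduced only for illustration; it uses a NumPy random generator to permute the group order first and then the rows inside each group.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
fruits = pd.DataFrame({
    'month': ['jan', 'feb', 'feb', 'march', 'jan', 'april',
              'april', 'june', 'march', 'march', 'june', 'april'],
    'fruit': ['apple', 'orange', 'pear', 'orange', 'apple', 'pear',
              'cherry', 'pear', 'orange', 'cherry', 'apple', 'cherry'],
    'price': [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60],
})


def shuffle_within_groups(df, key, seed=None):
    """Hypothetical helper: shuffle the order of the groups defined by
    `key`, then shuffle the rows inside each group."""
    rng = np.random.default_rng(seed)
    keys = list(df[key].unique())

    parts = []
    # Level 1: visit the distinct key values in random order.
    for i in rng.permutation(len(keys)):
        group = df[df[key] == keys[i]]
        # Level 2: reorder the rows of this group at random.
        parts.append(group.iloc[rng.permutation(len(group))])

    # Stack the shuffled groups and renumber the index.
    return pd.concat(parts, ignore_index=True)


fruits_new = shuffle_within_groups(fruits, 'month', seed=0)
print(fruits_new)
```

Passing a fixed seed makes the result reproducible; leaving it as None gives a different order on each call. Drop ignore_index=True if the original row labels should be preserved.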
