Below is some dummy code showing what I would like to achieve; my question is at the end. I would like to shuffle blocks of a data frame (the blocks have different sizes) that are stored in a list in Python. Thanks.
Set up a dummy dictionary:
dummy = {"ID":[1,2,3,4,5,6,7,8,9,10],
"Alphabet":["A","B","C","D","E","F","G","H","I","J"],
"Fruit":["apple","banana","coconut","date","elephant apple","feijoa","guava","honeydew","ita palm","jack fruit"]}
Turn dictionary into data frame:
import pandas as pd
dummy_df = pd.DataFrame(dummy)
Create blocks of data frame with required size:
blocksize = [1,2,3,4]
blocks = []
i = 0
for j in range(len(blocksize)):
    a = blocksize[j]
    blocks.append(dummy_df[i:i+a])
    i += a
blocks
Below is the output of "blocks". It is a list of 4 data frame blocks with 1 to 4 rows each:
[   ID Alphabet  Fruit
 0   1        A  apple,
    ID Alphabet    Fruit
 1   2        B   banana
 2   3        C  coconut,
    ID Alphabet           Fruit
 3   4        D            date
 4   5        E  elephant apple
 5   6        F          feijoa,
    ID Alphabet       Fruit
 6   7        G       guava
 7   8        H    honeydew
 8   9        I    ita palm
 9  10        J  jack fruit]
I am stuck after the above.
I have tried many different things but kept getting errors. I would like to shuffle those data frame blocks in the list, then combine them back into a single dataframe. Below is an example of the shuffled output. How could I do this, please?
Example ideal output:
   ID Alphabet           Fruit
1   2        B          banana
2   3        C         coconut
0   1        A           apple
6   7        G           guava
7   8        H        honeydew
8   9        I        ita palm
9  10        J      jack fruit
3   4        D            date
4   5        E  elephant apple
5   6        F          feijoa
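After you have the list, you can shuffle the blocks in place using random.shuffle. After that, combine the shuffled blocks back into one dataframe, either by appending each block to a new empty dataframe or, on pandas 2.0 and later (where DataFrame.append was removed), with pd.concat(blocks).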
Try this code:
import pandas as pd
import random
dummy = {"ID":[1,2,3,4,5,6,7,8,9,10],
"Alphabet":["A","B","C","D","E","F","G","H","I","J"],
"Fruit":["apple","banana","coconut","date","elephant apple","feijoa","guava","honeydew","ita palm","jack fruit"]}
dummy_df = pd.DataFrame(dummy)
blocksize = [1,2,3,4]
blocks = []
i = 0
for j in range(len(blocksize)):
    a = blocksize[j]
    blocks.append(dummy_df[i:i+a])
    i += a
random.shuffle(blocks) # shuffle blocks in list
dfs = pd.concat(blocks) # join the shuffled blocks (DataFrame.append was removed in pandas 2.0)
print(dfs)
Output
   ID Alphabet           Fruit
3   4        D            date
4   5        E  elephant apple
5   6        F          feijoa
1   2        B          banana
2   3        C         coconut
6   7        G           guava
7   8        H        honeydew
8   9        I        ita palm
9  10        J      jack fruit
0   1        A           apple
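The shuffle-then-join steps above could also be wrapped in a small helper. This is only a sketch: the function name shuffle_blocks and its seed parameter are made up for illustration and are not part of pandas. It uses a private random.Random instance so that passing a seed gives a repeatable block order, and pd.concat to rejoin the blocks.

```python
import random

import pandas as pd


def shuffle_blocks(df, block_sizes, seed=None):
    """Split df into consecutive blocks, shuffle the block order, and rejoin.

    Hypothetical helper for illustration; not a pandas function.
    """
    if sum(block_sizes) != len(df):
        raise ValueError("block sizes must add up to the number of rows")

    blocks = []
    start = 0
    for size in block_sizes:
        blocks.append(df.iloc[start:start + size])
        start += size

    rng = random.Random(seed)  # seed=None gives a different order on each run
    rng.shuffle(blocks)        # reorders the list of blocks in place
    return pd.concat(blocks)   # the original index labels are kept


dummy_df = pd.DataFrame({
    "ID": list(range(1, 11)),
    "Alphabet": list("ABCDEFGHIJ"),
})
shuffled = shuffle_blocks(dummy_df, [1, 2, 3, 4], seed=0)
print(shuffled)
```

Rows inside each block keep their original order; only the order of the blocks changes. Chain .reset_index(drop=True) onto the result if you want a fresh 0..n-1 index.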
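You can use .sample(frac=1) to shuffle the rows directly in a dataframe. Note that, applied to each block, this shuffles the rows inside the block and leaves the order of the blocks unchanged:
blocks.append( df[start:end].sample(frac=1) )
Later you can join all the dataframes at once. Older pandas versions allowed df.append(list_of_df) for this; that method was removed in pandas 2.0, so use pd.concat instead:
df = pd.concat(blocks)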
import pandas as pd
dummy = {
"ID": [1,2,3,4,5,6,7,8,9,10],
"Alphabet": ["A","B","C","D","E","F","G","H","I","J"],
"Fruit": ["apple","banana","coconut","date","elephant apple","feijoa","guava","honeydew","ita palm","jack fruit"]
}
df = pd.DataFrame(dummy)
blocksize = [1,2,3,4]
blocks = []
start = 0
for size in blocksize:
    end = start + size
    blocks.append(df[start:end].sample(frac=1))
    start = end
#for item in blocks:
# print(item)
df = pd.concat(blocks) # optionally chain .reset_index(drop=True); DataFrame.append was removed in pandas 2.0
print(df)
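If the per-block row shuffle needs to be repeatable, .sample accepts a random_state argument. The sketch below is a variation on the code above, not the answer's original code: the per-block seed (the loop counter n) and the name result are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": list(range(1, 11)), "Alphabet": list("ABCDEFGHIJ")})
blocksize = [1, 2, 3, 4]

blocks = []
start = 0
for n, size in enumerate(blocksize):
    end = start + size
    # random_state makes the row order repeatable; using n varies the seed per block
    blocks.append(df.iloc[start:end].sample(frac=1, random_state=n))
    start = end

result = pd.concat(blocks)  # blocks stay in their original order
print(result)
```

Each block still occupies the same range of rows in the result; only the rows inside it move.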
For other ways to shuffle, see the question "Shuffle DataFrame rows".
Another idea is to collect only the shuffled indexes, using .sample(frac=1)
blocks += df[start:end].sample(frac=1).index.tolist()
or random.shuffle()
indexes = df[start:end].index.tolist()
random.shuffle(indexes)
blocks += indexes
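and later use these indexes to create the new DataFrame (.iloc works here because the default index labels are the same as the row positions; with a custom index, use df.loc[blocks] instead)
df = df.iloc[blocks]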
import pandas as pd
import random
dummy = {
"ID": [1,2,3,4,5,6,7,8,9,10],
"Alphabet": ["A","B","C","D","E","F","G","H","I","J"],
"Fruit": ["apple","banana","coconut","date","elephant apple","feijoa","guava","honeydew","ita palm","jack fruit"]
}
df = pd.DataFrame(dummy)
blocksize = [1,2,3,4]
blocks = []
start = 0
for size in blocksize:
    end = start + size
    #blocks += df[start:end].sample(frac=1).index.tolist()
    indexes = df[start:end].index.tolist()
    random.shuffle(indexes)
    blocks += indexes
    start = end
#for item in blocks:
# print(item)
df = df.iloc[blocks]
print(df)
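The same index-based idea can be turned around to shuffle the order of the blocks while keeping the rows inside each block in place, which is what the question's example output shows. The following is a sketch under assumptions: it uses NumPy, which the code above does not, and the names cuts, position_blocks and order, as well as the seed 0, are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ID": list(range(1, 11)), "Alphabet": list("ABCDEFGHIJ")})
blocksize = [1, 2, 3, 4]

# Row positions 0..9, cut at the cumulative block boundaries (1, 3, 6).
cuts = np.cumsum(blocksize)[:-1]
position_blocks = np.split(np.arange(len(df)), cuts)

rng = np.random.default_rng(0)                 # seed chosen only for repeatability
order = rng.permutation(len(position_blocks))  # shuffled ordering of the block numbers

# Lay the blocks out in the shuffled order and select the rows by position.
positions = np.concatenate([position_blocks[k] for k in order])
shuffled = df.iloc[positions]
print(shuffled)
```

Because only integer positions are shuffled, no intermediate list of dataframes is built, and .iloc is correct whatever index the dataframe has.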