Trying to shuffle rows in a pandas DataFrame
Hopefully someone can help. I'm trying to randomize the output 15 times and save it to Excel, but Python is only giving me 1 output instead of 15.
import pandas as pd

# create a DataFrame
sales_to_do = {'Task': ['Call with the client', 'Preparing for the calls', 'Training staff',
                        'Daily tasks (Emails, questions, chasing)'],
               'Type of task': ['Call (external - new lead)', 'Preparing communication with leads', 'Training',
                                'Call (external - new lead)']}
df = pd.DataFrame(sales_to_do)
df_shuffled = df.sample(frac=1)

def randomize():
    df = pd.DataFrame(sales_to_do)
    df_shuffled = df.sample(frac=1)
    print(df_shuffled)

for i in range(15):
    randomize()

df_shuffled.to_excel(r'C:\Users\Alex\Desktop\Output1.xlsx', index=False, header=True)
You need to review the scoping rules. You have two independent variables named df_shuffled, one in randomize and one in your main program. You never link the two. As a result, all randomize does is shuffle its local DataFrame and print the result; the main program never references that ordering. At the end of your main program, you simply dump the unchanged df_shuffled from your first sampling.

Also, you haven't concatenated the 15 shufflings; it's not clear what result you expect. However, I can help with the function.
def randomize():
    df = pd.DataFrame(sales_to_do)
    df_shuffled = df.sample(frac=1)
    return df_shuffled

for i in range(15):
    df_shuffled = randomize()
    # Adapt this output to append results per your needs
    df_shuffled.to_excel(r'C:\Users\Alex\Desktop\Output1.xlsx',
                         index=False, header=True)
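
To make the scoping point concrete, here is a minimal sketch that leaves Excel out entirely. The function names shuffle_without_return and shuffle_with_return and the small four-row frame are made up for illustration; they are not from the question. Assigning to a name inside a function creates a local variable, so the module-level name only changes when the caller stores the returned value.

```python
import pandas as pd

# Hypothetical stand-in for the question's data.
df = pd.DataFrame({"Task": ["a", "b", "c", "d"]})
result = "set in main program"

def shuffle_without_return():
    # This assignment creates a new *local* name `result`;
    # the module-level `result` is not touched.
    result = df.sample(frac=1)

def shuffle_with_return():
    # Returning the value lets the caller decide where to store it.
    return df.sample(frac=1)

shuffle_without_return()
unchanged = result               # still the string assigned above

result = shuffle_with_return()   # now bound to a shuffled DataFrame
```

After the first call, result is still the original string; only the explicit assignment from the return value rebinds it.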
Something like this, where you just return the shuffled df and use pd.concat on a list of these:
sales_to_do = pd.DataFrame({'id': [1, 2], 'name': ['bob', 'mike']})

def randomize(df):
    return df.sample(frac=1)

df_shuffled = pd.concat([randomize(sales_to_do) for x in range(15)])
df_shuffled.to_excel(r'C:\Users\Alex\Desktop\Output1.xlsx', index=False, header=True)
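
One thing the concatenated result does not show is where one shuffle ends and the next begins. As a possible variation, and assuming a label column is acceptable in the output, each shuffle can be tagged before concatenating. The column name run, the variable names, and the three-row frame below are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the real task list.
tasks = pd.DataFrame({"id": [1, 2, 3], "name": ["bob", "mike", "ann"]})

def randomize(df):
    # frac=1 returns every row exactly once, in random order.
    return df.sample(frac=1)

n_runs = 15
# assign() adds a "run" column (1..15) so each block of rows is identifiable.
runs = [randomize(tasks).assign(run=i + 1) for i in range(n_runs)]
# ignore_index=True replaces the repeated original index with 0..N-1.
combined = pd.concat(runs, ignore_index=True)
```

The combined frame can then be written with the same to_excel call as above.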
Currently, the easiest and quickest way to do it is to use shuffle from sklearn.utils:
from sklearn.utils import shuffle
df = shuffle(df, random_state=0)
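
If adding scikit-learn as a dependency is not desirable, a similar seeded shuffle can be sketched with pandas alone. Note that a fixed random_state yields the same order on every call, so for 15 different shuffles the seed would need to vary or be omitted. The sample frame and variable names here are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["bob", "mike", "ann", "li"]})

# frac=1 samples every row without replacement, i.e. a full shuffle.
# random_state fixes the seed, so the same order comes back each time.
shuffled = df.sample(frac=1, random_state=0).reset_index(drop=True)
again = df.sample(frac=1, random_state=0).reset_index(drop=True)
```

Here shuffled and again contain the rows in the same order, because both calls use the same seed; reset_index(drop=True) discards the original row labels.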