How to split one Excel file into multiple Excel files with an equal number of rows in each, using Python?
I have one Excel file with a large amount of data. I want to split it into multiple Excel files, each with the same number of rows.
My current code works only partially: it puts the required number of rows into each file, but it keeps creating more files than are needed.
If I set n_partitions to 5, it creates two files with 5 rows each and then creates three more blank files. I want the code to stop creating files once all the rows have been distributed.
Below is my sample Excel file with the expected result, and the sample code.
This is the code I am currently using:
import pandas as pd
df = pd.read_excel("C:/Zen/TestZenAmp.xlsx")
n_partitions = 5
for i in range(n_partitions):
    sub_df = df.iloc[(i*n_partitions):((i+1)*n_partitions)]
    sub_df.to_excel(f"C:/Zen/-{i}.xlsx", sheet_name="a")
You can use the code below to split your DataFrame into chunks of 5 rows each:
n = 5
list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]
You can access each chunk like this:
list_df[0]
list_df[2]
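The loop in the question always runs n_partitions times, so once the rows run out the remaining slices are empty and blank files are written. Stepping through the rows with range(0, n_rows, n) produces only as many chunks as the data needs. Here is a minimal pure-Python sketch of that arithmetic; chunk_bounds is a hypothetical helper name, and the 10-row count is only inferred from the behaviour described in the question:

```python
def chunk_bounds(n_rows, chunk_size):
    """Return (start, stop) row positions for consecutive chunks.

    Hypothetical helper for illustration; it mirrors the slicing done by
    df[i:i+n] for i in range(0, df.shape[0], n).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, n_rows))
        for start in range(0, n_rows, chunk_size)
    ]


# 10 rows in chunks of 5: exactly two chunks, no empty ones.
print(chunk_bounds(10, 5))  # [(0, 5), (5, 10)]

# 12 rows in chunks of 5: the last chunk is simply shorter.
print(chunk_bounds(12, 5))  # [(0, 5), (5, 10), (10, 12)]
```

Because the number of chunks comes from the row count rather than from a fixed loop count, no empty slice is ever produced.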
Then you can loop through the list of chunks (sub-DataFrames) and create a separate Excel file for each:
i=1
for sub_df in list_df:
    sub_df.to_excel(f"C:/Zen/-{i}.xlsx", sheet_name="a", index=False)
    i+=1
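The two steps above (split, then write) can be combined into a small reusable sketch. The function names split_rows and write_chunks, the "part" file prefix, and the demo DataFrame are all assumptions made for illustration; writing .xlsx files also assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def split_rows(df, rows_per_file):
    """Return consecutive slices of df with at most rows_per_file rows each."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    return [
        df.iloc[start:start + rows_per_file]
        for start in range(0, len(df), rows_per_file)
    ]


def write_chunks(chunks, out_dir, prefix="part"):
    """Write each chunk to its own workbook and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(chunks, start=1):
        path = out_dir / f"{prefix}-{number}.xlsx"
        chunk.to_excel(path, sheet_name="a", index=False)
        paths.append(path)
    return paths


# Small stand-in for the real workbook: 12 rows split into chunks of 5.
demo = pd.DataFrame({"value": range(12)})
chunks = split_rows(demo, 5)
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [5, 5, 2]

# To create the files, pass the chunks and a target folder, for example:
# write_chunks(chunks, "C:/Zen")
```

Using enumerate(..., start=1) replaces the manual i counter, and keeping the split separate from the write makes it easy to check the chunk sizes before any file is created.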
Another possible solution:
k = 5  # rows per output file
# Rows 0..k-1 get group 0, rows k..2k-1 get group 1, and so on
# (this relies on the default 0, 1, 2, ... index).
for group_id, chunk in df.groupby(df.index // k):
    chunk.to_excel(f"/tmp/x-{group_id}.xlsx", sheet_name="a")
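The groupby approach rests on integer division: dividing each row position by k and discarding the remainder gives the same label to every run of k consecutive rows. Here is a pure-Python sketch of just that labelling step; group_labels is a hypothetical name, and it assumes row positions run 0, 1, 2, ... as with a default pandas index.

```python
from collections import Counter


def group_labels(n_rows, k):
    """Label each row position with the number of the chunk it falls into."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [position // k for position in range(n_rows)]


labels = group_labels(12, 5)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2]

# Rows per group: two full groups of 5 and a final group of 2.
sizes = Counter(labels)
print(sorted(sizes.items()))  # [(0, 5), (1, 5), (2, 2)]
```

If the DataFrame's index is not a plain 0..n-1 range (for example after filtering), the labels would no longer form equal runs, so resetting the index first would be needed.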
That's exactly what I want, but in a Java version :(