
How to split one Excel file into multiple Excel files with the same number of rows in each, using Python?

I have an Excel file with a large amount of data. I want to split it into multiple Excel files with the rows distributed equally.

My current code works partially: it distributes the required number of rows and creates multiple Excel files, but at the same time it keeps creating more files based on that row number.

If I set n_partitions to 5, it creates two Excel files with 5 rows each and then creates three more blank Excel files. I want my code to stop creating files once all the rows have been distributed.

Below are my sample Excel file, the expected result, and my sample code.


The code I am currently using is:

import pandas as pd

df = pd.read_excel("C:/Zen/TestZenAmp.xlsx")

n_partitions = 5

for i in range(n_partitions):
    sub_df = df.iloc[(i*n_partitions):((i+1)*n_partitions)]
    sub_df.to_excel(f"C:/Zen/-{i}.xlsx", sheet_name="a")

You can use the code below to split your DataFrame into chunks of 5 rows:

n = 5
list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]

You can access each chunk like this:

>>> list_df[0]


>>> list_df[2]

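To see why the original loop produced blank files: it always ran n_partitions times, whatever the row count, so with 10 rows and a chunk size of 5 the last three slices were empty. Stepping through the row positions instead stops at the last row. The sketch below is a minimal illustration under assumptions: `split_into_chunks`, `rows_per_file`, and the 12-row `demo` frame are hypothetical names and data, not part of the original post.

```python
import pandas as pd


def split_into_chunks(df, rows_per_file):
    """Return consecutive slices of at most rows_per_file rows each."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    # range() stops at the last row, so no empty chunk is ever produced.
    return [
        df.iloc[start:start + rows_per_file]
        for start in range(0, len(df), rows_per_file)
    ]


# Hypothetical 12-row frame standing in for the spreadsheet.
demo = pd.DataFrame({"value": range(12)})
chunks = split_into_chunks(demo, 5)
sizes = [len(chunk) for chunk in chunks]
print(sizes)
```

When the row count is not a multiple of the chunk size, the last chunk simply holds the remaining rows.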

Then you can loop through the list of chunks (sub-DataFrames) and create a separate Excel file for each:

# enumerate(..., start=1) numbers the output files 1, 2, 3, ...
for i, sub_df in enumerate(list_df, start=1):
    sub_df.to_excel(f"C:/Zen/-{i}.xlsx", sheet_name="a", index=False)
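The two steps can also be wrapped into small helpers, which makes it easy to check the file names before anything is written. This is only a sketch under assumptions: `plan_output_files`, `write_chunks`, the `stem` parameter, and the `out` directory are hypothetical names introduced for illustration, and the writer assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def plan_output_files(df, rows_per_file, out_dir, stem="part"):
    """Pair each chunk with its target path; nothing is written here."""
    out_dir = Path(out_dir)
    starts = range(0, len(df), rows_per_file)
    return [
        (out_dir / f"{stem}-{i}.xlsx", df.iloc[s:s + rows_per_file])
        for i, s in enumerate(starts, start=1)
    ]


def write_chunks(df, rows_per_file, out_dir, stem="part"):
    """Write every planned chunk to its own workbook and return the paths."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for path, chunk in plan_output_files(df, rows_per_file, out_dir, stem):
        # to_excel needs an Excel writer engine (for example openpyxl).
        chunk.to_excel(path, sheet_name="a", index=False)
        written.append(path)
    return written


# Hypothetical 12-row frame; only the plan is built, no file is written.
demo = pd.DataFrame({"value": range(12)})
plan = plan_output_files(demo, 5, "out")
names = [path.name for path, _ in plan]
print(names)
```

Calling `write_chunks(df, 5, "C:/Zen")` would then create one workbook per chunk and nothing more.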

Another possible solution:

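k = 5  # rows per file; k was not defined in the original snippet, 5 matches the question

# Integer-dividing the default 0..N-1 index by k labels each block of k rows.
for group_id, chunk in df.groupby(df.index // k):
    chunk.to_excel(f"/tmp/x-{group_id}.xlsx", sheet_name="a")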
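One caveat with grouping on the index: it relies on the default 0..N-1 index that read_excel produces. If the frame has been filtered or re-indexed, labels built from row positions keep the chunks even. The sketch below illustrates the difference under assumptions: the `demo` frame and its index starting at 3 are hypothetical.

```python
import numpy as np
import pandas as pd

k = 5  # rows per group

# Hypothetical frame whose index starts at 3 instead of 0.
demo = pd.DataFrame({"value": range(12)}, index=range(3, 15))

# Labels derived from the index values give an uneven first group.
by_index = demo.groupby(demo.index // k).size().tolist()

# Labels derived from row positions give full groups of k rows,
# with any remainder in the last group.
positions = np.arange(len(demo)) // k
by_position = demo.groupby(positions).size().tolist()

print(by_index, by_position)
```

Alternatively, calling `df.reset_index(drop=True)` before grouping restores the default index.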

That's exactly what I want, but in a Java version :(
