Split one Excel file into multiple files with a specific number of rows using pandas

Question

Let's say I have an Excel file with 101 rows. I need to split it and write it into 11 Excel files with 10 rows each, except the last one, which gets the single remaining row.

This is the code I have tried, but I get KeyError: 11:

df = pd.DataFrame(data=np.random.rand(101, 3), columns=list('ABC'))
groups = df.groupby(int(len(df.index)/10) + 1)
for i, g in groups:
    g.to_excel("%s.xlsx" % i, index = False, index_lable = False)

Could someone help with this issue? Thanks a lot.

Related reference: Split pandas dataframe into multiple dataframes with equal numbers of rows

Answer 1

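I think you need np.arange. Passing a single integer such as 11 to groupby makes pandas look for a column labelled 11, which does not exist, hence the KeyError. Grouping by np.arange(len(df.index)) // 10 instead gives each run of 10 consecutive rows its own group label: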

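import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.random.rand(101, 3), columns=list('ABC'))
# Row positions 0..100 floor-divided by 10 give group labels 0..10
groups = df.groupby(np.arange(len(df.index)) // 10)
for i, g in groups:
    print(g)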
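
To go from printing the groups to the files the question asks for, a minimal sketch follows. It assumes an Excel engine such as openpyxl is installed. The helper names `chunk_labels` and `write_chunks` and the `part_<label>.xlsx` file naming are hypothetical choices for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def chunk_labels(n_rows, rows_per_file):
    """Return one integer label per row: 0 for the first chunk, 1 for the next, ..."""
    # Integer division maps positions 0..9 -> 0, 10..19 -> 1, and so on
    return np.arange(n_rows) // rows_per_file


def write_chunks(df, rows_per_file, prefix="part"):
    """Write each chunk to '<prefix>_<label>.xlsx' (hypothetical naming); return the paths."""
    paths = []
    for label, chunk in df.groupby(chunk_labels(len(df), rows_per_file)):
        path = f"{prefix}_{label}.xlsx"
        # index=False keeps the DataFrame index out of the output file
        chunk.to_excel(path, index=False)
        paths.append(path)
    return paths


df = pd.DataFrame(data=np.random.rand(101, 3), columns=list("ABC"))
# Check the split without touching the disk: rows per group, in label order
sizes = df.groupby(chunk_labels(len(df), 10)).size().tolist()
print(sizes)  # ten chunks of 10 rows, then one chunk with the remaining row
```

Calling `write_chunks(df, 10)` would then create `part_0.xlsx` through `part_10.xlsx` in the current directory.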

Answer 2

I solved a similar problem as follows. The backstory is that I had created an Azure Function with an HTTP trigger, but I was overwhelming the endpoint when iterating through 2k rows of requests. So I chunked the original file into groups of 50 rows, working backwards from the last row:

import logging

import pandas as pd

CHUNK_SIZE = 50

INXL = pd.read_excel('split/031022.xlsx', engine="openpyxl")
row_count = len(INXL.index)


def extract(rc):
    # Walk backwards from the last row, writing up to CHUNK_SIZE rows per file
    while rc > 0:
        # Clamp the start at 0 so the final, possibly shorter, chunk keeps the first rows
        rs = max(rc - CHUNK_SIZE, 0)
        # Extract the rows between the start and end positions
        row_extract = INXL.iloc[rs:rc]
        # Each file is named after the end position of its chunk
        with pd.ExcelWriter(f'output_{rc}.xlsx') as writer:
            row_extract.to_excel(writer, index=False)
        rc = rs


extract(row_count)
logging.info("extract completed")
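
The same chunking can also be done front to back with plain `iloc` slices. The sketch below is an alternative, not the code from this answer: `iter_chunks` is a hypothetical helper, and the small in-memory DataFrame with its `request_id` column is made-up stand-in data for the spreadsheet read above.

```python
import pandas as pd


def iter_chunks(df, chunk_size):
    """Yield (start, chunk) pairs covering df in order, chunk_size rows at a time."""
    for start in range(0, len(df), chunk_size):
        # iloc slicing stops at the end of the frame, so the last chunk may be shorter
        yield start, df.iloc[start:start + chunk_size]


# Stand-in data: 120 rows instead of the real spreadsheet
demo = pd.DataFrame({"request_id": range(120)})
chunk_sizes = [len(chunk) for _, chunk in iter_chunks(demo, 50)]
print(chunk_sizes)  # [50, 50, 20]
```

Each chunk could then be written with `chunk.to_excel(f"output_{start}.xlsx", index=False)`, or sent to the endpoint one batch at a time.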

Answer 1: score 2, accepted, 2020-01-31 08:08:21
Answer 2: score 1, 2022-11-22 09:04:13