Loop through Excel sheets and save each sheet to CSV based on a condition

Question

I have an Excel file that has multiple sheets. I would like to iterate through each sheet and check it against a string to decide whether to read the sheet after 0 rows or after 4 rows, since some of the sheets' datasets start after the first 4 rows. After a sheet has been read, I want to save it as a CSV.

This is my code so far, but I am not sure if I am doing the loop correctly.

import pandas as pd


def converToCsv(excel_file): 
    df = pd.read_excel(excel_file, sheet_name = None)
    
    for sheets in df.items():
        if sheets[df.items()] == 'Shipment':
                  newdf = pd.read_excel(excel_file, sheet_name = sheets[df.items(), header = 4]
                  newdf.to_csv('path', decimal = ',', index = False)
        else:
                  newdf = pd.read_excel(excel_file, sheet_name = sheets[df.items(), header = 0]
                  newdf.to_csv('path', decimal = ',', index = False)

Answer 1

There are several things not working in the snippet you posted:

sheets[df.items()] is not valid. With sheet_name = None, pd.read_excel returns a dict like {<sheet_name>: <sheet_content>}, and df.items() yields its (<sheet_name>, <sheet_content>) pairs, so each sheets is a tuple and df.items() cannot be used as an index into it.
There is a missing closing parenthesis, and the square brackets are misplaced.
Even if the loop worked, you always save the data to the same file path, so each iteration overwrites the CSV saved by the previous one.
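
To illustrate the first point, here is a minimal sketch of what iterating over the dict looks like. It does not read a real file: the workbook dict below is a hand-built stand-in for the return value of pd.read_excel(excel_file, sheet_name = None), and the sheet names and data in it are made up.

```python
import pandas as pd

# Stand-in for pd.read_excel(excel_file, sheet_name=None), which returns
# a dict mapping sheet names to DataFrames. Names and data are invented.
workbook = {
    "Shipment": pd.DataFrame({"id": [1, 2]}),
    "Orders": pd.DataFrame({"id": [3]}),
}

# Each item is a (sheet_name, DataFrame) tuple, so unpack it in the loop
# instead of indexing into it.
header_rows = {}
for sheet_name, frame in workbook.items():
    # Decide the header row from the sheet name, as in the question.
    header_rows[sheet_name] = 4 if sheet_name == "Shipment" else 0
    print(sheet_name, frame.shape, "-> header row", header_rows[sheet_name])
```

Unpacking the tuple gives you the sheet name directly, which is the value to compare against 'Shipment'.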

Did you try running this code before posting it?

You could do something along these lines:

import pandas as pd


def converToCsv(excel_file): 
    workbook = pd.read_excel(excel_file, sheet_name = None)
    
    for sheet_name in workbook.keys():
        header = 0
        if sheet_name == 'Shipment':
            header = 4    
        newdf = pd.read_excel(excel_file, sheet_name = sheet_name, header=header)
        # TODO: handle the case where sheet name is not a valid file name
        newdf.to_csv(f"{sheet_name}.csv", decimal = ',', index = False)

converToCsv("test.xlsx")
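
One possible way to address the TODO above is sketched below, under some assumptions. The safe_filename helper, the convert_to_csv name, and the out_dir parameter are hypothetical and not part of the original answer. The sketch also opens the workbook once through pd.ExcelFile rather than calling pd.read_excel again for every sheet.

```python
import re
from pathlib import Path

import pandas as pd


def safe_filename(sheet_name):
    """Hypothetical helper: replace characters that are commonly invalid in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', "_", sheet_name).strip()
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or "sheet"


def convert_to_csv(excel_file, out_dir="."):
    """Write every sheet of excel_file to its own CSV file in out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # ExcelFile opens the workbook once; parse() then reads one sheet at a time.
    with pd.ExcelFile(excel_file) as workbook:
        for sheet_name in workbook.sheet_names:
            # 'Shipment' has its header on row index 4, every other sheet on row 0.
            header = 4 if sheet_name == "Shipment" else 0
            frame = workbook.parse(sheet_name, header=header)
            target = out_path / f"{safe_filename(sheet_name)}.csv"
            frame.to_csv(target, decimal=",", index=False)
```

Calling convert_to_csv("test.xlsx") would then write one CSV per sheet into the current directory. Note that two sheet names that differ only in replaced characters would still map to the same file, so this is not a complete solution.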

Solution 1
0 2022-08-17 15:39:11
