Write to an existing xlsx file, overwriting just some sheets in Python

I have an Excel file with several sheets, say, Data 1, Data 2 and Pivots.

The sheets Data 1 and Data 2 have one table each. The sheet Pivots has only pivot tables whose data sources are the tables on Data 1 and Data 2.

What I'd like to do is rewrite the data sheets with the data in two dataframes, say df1 and df2 respectively, while keeping the pivot tables linked to the same sheets. The idea is to run a script, replace Data 1 and Data 2, and refresh the pivot tables to get updated data.

pd.ExcelWriter(xlsx_file) won't work because it replaces the file. The approach below was adapted from this answer.

import pandas as pd
from openpyxl import load_workbook

# xlsx_file is the path to the workbook; df1 and df2 hold the replacement data.
book = load_workbook(xlsx_file)
writer = pd.ExcelWriter(xlsx_file, engine="openpyxl")
# Hand the loaded workbook to the writer so the existing sheets are kept.
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}
writer.sheets.pop("Pivots", None)
data_sheets = ["Data 1", "Data 2"]

for sheet_name, df in zip(data_sheets, [df1, df2]):
    df.to_excel(writer, sheet_name=sheet_name, index=False)
writer.save()

This failed: the sheet Pivots kept the values in its cells, but the pivot table itself was gone, along with all its formatting. Only hard-coded values remained.

I also perused this question and this question, but couldn't make it work.

How do I go about doing this simple task?


I uploaded an example file which can be downloaded here. For your convenience, here are two dataframes to replace the data sheets:

df1 = pd.DataFrame({"Category": ["A", "B", "C", "D", "A"], "Value": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"SKU": ["AB", "BB", "CB", "DB", "AB"], "No of Items": [3, 2, 7, 4, 12]})

As asked by a user below, I'm leaving here my failed attempt at his proposed solution (the pivots and all their formatting are gone; only their hard-coded values remain).

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

df1 = pd.DataFrame([["A", 1], ["B", 2], ["C", 3], ["D", 4], ["A", 5]], columns=["Category", "Value"])
df2 = pd.DataFrame([["AB", 3], ["BB", 2], ["CB", 7], ["DB", 4], ["AB", 12]], columns=["SKU", "No of Items"])

wb = load_workbook("xlsx_file.xlsx")
sheets = ["Data 1", "Data 2"]

for sheet_name, df in zip(sheets, [df1, df2]):
    ws = wb[sheet_name]  # wb.get_sheet_by_name() is deprecated
    # Emit data rows only: no index column, and no header row because row 1
    # of each sheet already holds the column names.
    rows = dataframe_to_rows(df, index=False, header=False)
    for r_idx, row in enumerate(rows, start=2):
        for c_idx, value in enumerate(row, start=1):
            ws.cell(row=r_idx, column=c_idx, value=value)

wb.save("xlsx_file.xlsx")

The pandas side of things knows nothing about pivots, so you should avoid using to_excel() and instead use the utilities that openpyxl provides for going from a dataframe to a worksheet and back.
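As a rough illustration of that round trip, here is a minimal sketch using dataframe_to_rows and ws.values. The helper name overwrite_sheet_with_df and the in-memory stand-in workbook are assumptions made for the example; with the real file you would use load_workbook("xlsx_file.xlsx") and call wb.save() afterwards.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def overwrite_sheet_with_df(ws, df):
    """Hypothetical helper: write df into ws, header in row 1, data below."""
    n_rows = 0
    # index=False omits the DataFrame index; header=True emits column names first.
    for r_idx, row in enumerate(dataframe_to_rows(df, index=False, header=True), start=1):
        for c_idx, value in enumerate(row, start=1):
            ws.cell(row=r_idx, column=c_idx, value=value)
        n_rows = r_idx
    # Drop leftover rows from the old data if the new data is shorter.
    if ws.max_row > n_rows:
        ws.delete_rows(n_rows + 1, ws.max_row - n_rows)


# Stand-in for the real workbook (assumption): a sheet holding 7 old data rows.
wb = Workbook()
ws = wb.active
ws.title = "Data 1"
ws.append(["Category", "Value"])
for i in range(7):
    ws.append(["old", i])

df1 = pd.DataFrame({"Category": ["A", "B", "C", "D", "A"], "Value": [1, 2, 3, 4, 5]})
overwrite_sheet_with_df(ws, df1)

# And back: the first row of ws.values becomes the column names.
data = ws.values
columns = next(data)
df_back = pd.DataFrame(data, columns=columns)
```

Writing cell by cell into the existing worksheet leaves the sheet object (and anything that refers to it by name) in place. If the row count changes, a pivot source defined as a fixed range or a named table may still need its reference adjusted separately.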

You may need to change the definition of the pivot tables, in which case you are largely on your own: openpyxl should preserve the structure but provides no additional functions for managing them. You will have to rely on the specification of pivot tables in ECMA-376 / ISO/IEC 29500.
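One small thing that can be done from Python is to ask Excel to refresh the pivots when the file is next opened. The sketch below follows the pattern shown in the openpyxl pivot-table documentation; note that ws._pivots is a private attribute and may change between versions, and the helper name flag_pivots_for_refresh is an assumption for this example.

```python
from openpyxl import Workbook


def flag_pivots_for_refresh(ws):
    """Hypothetical helper: mark each pivot on ws to refresh when the file opens."""
    # `_pivots` is private openpyxl API, so fall back to an empty list if absent.
    pivots = getattr(ws, "_pivots", [])
    for pivot in pivots:
        pivot.cache.refreshOnLoad = True
    return len(pivots)


# Stand-in worksheet with no pivot tables, so the sketch runs on its own.
wb = Workbook()
count = flag_pivots_for_refresh(wb.active)

# With the real file it would look roughly like this:
# wb = load_workbook("xlsx_file.xlsx")
# flag_pivots_for_refresh(wb["Pivots"])
# wb.save("xlsx_file.xlsx")
```

This only sets a flag in the pivot cache definition; the actual recalculation is done by Excel on opening, not by openpyxl.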
