Write pandas dataframe column by column to existing excel template skipping excel sheet columns that have formulas in it

I have been stuck on this for a day or two and am close to giving up. I am new to using Python with Excel.

Here is my scenario: I am planning to write a pandas dataframe to an existing Excel sheet. The sheet has 50 columns. Two of them are derived (formula columns computed from other columns) and sit at positions 48 and 50 among those 50 columns. Hence, my dataframe should be written to this sheet skipping the 48th and 50th columns. I am using win32com and pandas to do the job.

Problem statement:

But when I write the dataframe:

  1. Only the first record from the dataframe gets written across the entire Excel sheet range. Why is the entire pandas series taken from the dataframe column not being pasted?

  2. How can I have "None" and "NaN" written as blanks ('') in Excel with this code? (optional)

Code: The code below is a snippet (from the full code) showing how I write my dataframe to Excel.

  1. "Report_Data" is the pandas dataframe. The Excel sheet I am writing to is named after it ("Report Data").

  2. Excel_Template_File holds the file path of my Excel template, in which the sheet "Report Data" is where I write my dataframe from Python.

import time

import pywintypes
from win32com import client

excel_app = client.dynamic.Dispatch("Excel.Application") # Initialize instance
excel_app.Interactive = False
excel_app.Visible = False

wb = excel_app.Workbooks.Open(Excel_Template_File)
ws = wb.Worksheets('Report Data')

for col_idx in range(0,len(Report_Data.columns)):
    col_lst = Report_Data.columns.values.tolist()
    
    if col_lst[col_idx] in [col_lst[-1], col_lst[-3]]:
        continue;
    else:
        print(col_lst[col_idx])
        col_vals = Report_Data.iloc[:,col_idx] # Copy values of column from dataframe as series
        print('mapping to cell locations...')
        
        xl_col_idx = col_idx + 1
        try: # Write column by column to avoid formula columns
            ws.Range(ws.Cells(2, xl_col_idx), 
            ws.Cells(1+len(col_vals),xl_col_idx)).Value = col_vals.values
        except pywintypes.com_error:
            print("Error")

wb.SaveAs('C:\\somepath\\Excel_'+time.strftime("%Y%m%d-%H%M%S")+'.xlsx') # Save our work
wb.Close(True)
excel_app.quit()

The try block is the part that writes to Excel at the given range.

Validations done:

  1. I tried df.to_excel(), but it wipes my entire Excel template clean, which I cannot afford since the workbook has more than 30-40 sheets of pivot tables and charts generated from this "Report Data" sheet.

  2. Apart from win32com (pywin32) I am unable to use any other Excel library, because I pull the data from multiple Excel files to build the pandas dataframe that is finally written to the sheet "Report Data". Since the Excel files I pull from are located on a network drive, win32com suits it. The openpyxl command load_workbook() also takes forever to open the file in my case.

  3. The dataframe has the correct data; I checked it by printing it with .head(). So the Excel files pulled have been concatenated and merged correctly.

  4. The file size is about 200 MB.

Conclusion & expected output:

So kindly help me dump my pandas series (or array) into the respective column positions in Excel, writing column by column from the dataframe.

I want to keep this approach because the above code neither erases the derived column formulas at positions 48 and 50 nor wipes the Excel workbook clean, as to_excel does.

The issue is that the Range.Value property can take a 1-D vector of values or a 2-D array. If Value receives a 1-D vector, Excel assumes it is a single row (NOT a column). Assigned to a one-column range, only the first element of that row fits, which is why every cell ends up with the first record. To set the values by column, you need to convert the vector to a 2-D array with one value per row. A simplified example:

import pandas as pd
import win32com.client as wc

df = pd.DataFrame([[1,4,7],[2,5,8],[3,6,9]],columns=['A','B','C'])

print(df.head())

xl = wc.Dispatch('Excel.Application')
xl.Visible=True

wb = xl.Workbooks.Add()
ws = wb.Worksheets(1)

for col_num in range(0,len(df.columns)):
    #Convert 1D vector to 2D array
    vals = [[v] for v in df.iloc[:,col_num].values]
    ws.Range(ws.Cells(1,col_num+1),ws.Cells(len(vals),col_num+1)).Value = vals

input("Press Enter to continue...")

wb.Close(False)
xl.Quit()

Python output:

   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
Press Enter to continue...

Excel sheet: [image]
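For the optional part of the question (writing "None" and "NaN" as blanks), the same list-of-lists conversion can clean the values on the way. The following is only a sketch: to_excel_column is a hypothetical helper name, not part of pandas or win32com, and it assumes the column is passed as a plain sequence such as df.iloc[:, col_num].tolist(). It handles None and float NaN only; other missing-value markers (for example pandas NaT) would need their own check.

```python
import math


def to_excel_column(values):
    """Return an N x 1 list of lists, with None and NaN replaced by ''.

    Hypothetical helper for illustration. `values` is any flat sequence,
    e.g. df.iloc[:, col_num].tolist(), which also converts numpy scalars
    to native Python types.
    """
    def clean(v):
        if v is None:
            return ""
        # numpy.float64 subclasses float, so this catches numpy NaN too
        if isinstance(v, float) and math.isnan(v):
            return ""
        return v

    # One inner list per row, so Excel reads the data as a column
    return [[clean(v)] for v in values]


if __name__ == "__main__":
    print(to_excel_column([1.5, None, float("nan"), "x"]))
```

In the loop above, the result would take the place of vals, i.e. vals = to_excel_column(df.iloc[:, col_num].tolist()) before the assignment to Range.Value.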

As an aside, it might be more efficient to set the values as two blocks, i.e. dataframe columns 0-46 (0-based) first, using df.iloc[:,range(0,47)].values, then column 48 separately. The values from the first block will already be a 2-D array.
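The block idea can be generalised to any set of formula columns. The sketch below is an assumption-laden illustration rather than a tested solution: contiguous_blocks and write_blocks are hypothetical names, skip holds the 0-based dataframe positions of the formula columns ({47, 49} for Excel columns 48 and 50), the dataframe is assumed to have one column per sheet column as in the question, and ws is assumed to be a win32com worksheet object.

```python
def contiguous_blocks(n_cols, skip):
    """Return (start, stop) half-open 0-based column ranges that avoid `skip`."""
    blocks = []
    start = None
    for i in range(n_cols):
        if i in skip:
            if start is not None:
                blocks.append((start, i))  # close the block before a skipped column
                start = None
        elif start is None:
            start = i  # open a new block
    if start is not None:
        blocks.append((start, n_cols))
    return blocks


def write_blocks(ws, df, skip, first_row=2):
    """Write df to ws block by block, leaving the skipped columns untouched.

    Assumes ws is a win32com worksheet and df is a pandas DataFrame.
    """
    n_rows, n_cols = df.shape
    if n_rows == 0:
        return
    for start, stop in contiguous_blocks(n_cols, skip):
        # .values.tolist() gives a 2-D list of native Python values
        block = df.iloc[:, start:stop].values.tolist()
        top_left = ws.Cells(first_row, start + 1)          # Excel is 1-based
        bottom_right = ws.Cells(first_row + n_rows - 1, stop)
        ws.Range(top_left, bottom_right).Value = block


if __name__ == "__main__":
    print(contiguous_blocks(50, {47, 49}))
```

For the 50-column sheet in the question this yields two writes, matching the two blocks described above, instead of 48 separate column writes.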

