
From Python web app: insert data into spreadsheet (e.g. LibreOffice / Excel), calculate and save as PDF

I would like to push data (one large dataframe and one image) from my Python web app (running on a Tornado web server on Ubuntu) into a spreadsheet, have it calculate, save the result as a PDF, and then deliver it to the frontend.

I looked at several libraries such as openpyxl for writing sheets in MS Excel format, but that would solve just one part of the problem. I was also thinking about using LibreOffice with pyoo, but it seems that importing pyuno requires my backend to run the same Python version as the one shipped with LibreOffice.

Has anybody solved a similar issue and can recommend how to approach this?

Thanks

I came up with a solution that is, let's say, not pretty and rather unusual, but it works very flexibly for me.

  • use openpyxl to open an existing Excel workbook that contains the layout (the template)
  • insert the dataframe into a separate sheet in that workbook
  • use openpyxl to save it as temporary_file.xlsx
  • call LibreOffice with --headless --convert-to pdf temporary_file.xlsx

During the last call, all formulas in the workbook are recalculated/updated and the PDF is created (you have to configure Calc so that recalculation is enabled when files are opened).

  • deliver the PDF to the frontend or process it as you like
  • delete temporary_file.xlsx

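import datetime
import subprocess

import openpyxl
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows

# Example data; in the web app this would be the real dataframe
d   = {'col1': [1, 2], 'col2': [3, 4]}
df  = pd.DataFrame(data=d)
# Timestamp prefix gives each temporary workbook its own file name
now = datetime.datetime.now().strftime("%Y%m%d_%H%M_%f")

wb_template_name = 'Template.xlsx'
wb_temp_name     = now + wb_template_name
wb               = openpyxl.load_workbook(wb_template_name)
ws               = wb['dataframe_sheet']
pdf_convert_cmd  = ['soffice', '--headless', '--convert-to', 'pdf', wb_temp_name]

# Append the dataframe (header and index included) row by row to the data sheet
for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
wb.save(wb_temp_name)
# Headless LibreOffice converts the saved workbook to PDF
subprocess.run(pdf_convert_cmd, check=True)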
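
The last two steps of the list (producing the PDF and deleting the temporary workbook) are not covered by the snippet above. A minimal sketch of how they might be wrapped follows. The helper names `build_convert_command` and `convert_and_cleanup` and the `run` parameter are hypothetical, introduced only for illustration. The sketch assumes `soffice` is on the PATH and that LibreOffice writes the PDF into the `--outdir` directory under the input file's base name.

```python
import subprocess
from pathlib import Path


def build_convert_command(xlsx_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless LibreOffice PDF conversion."""
    return [
        soffice, "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(xlsx_path),
    ]


def convert_and_cleanup(xlsx_path, out_dir, run=subprocess.run):
    """Convert xlsx_path to PDF, delete the temporary workbook, return the PDF path."""
    xlsx_path, out_dir = Path(xlsx_path), Path(out_dir)
    try:
        # check=True raises CalledProcessError if soffice exits with an error
        run(build_convert_command(xlsx_path, out_dir), check=True)
    finally:
        # Remove the temporary workbook whether or not the conversion succeeded
        xlsx_path.unlink(missing_ok=True)
    # Assumption: the output is named after the input file, with a .pdf suffix
    return out_dir / (xlsx_path.stem + ".pdf")
```

Passing an argument list instead of a shell string avoids quoting problems with unusual file names. The returned path can then be read and sent to the frontend by whatever handler the web app uses.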

The reason I'm doing this is that I want to be able to style the layout of the PDF independently of the data. I use named ranges or lookups that reference the separate dataframe sheet in Excel.
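
As a rough illustration of such a lookup, the sketch below builds a VLOOKUP formula string that points from a layout cell to the data sheet. The helper `lookup_formula` and the cell and column choices are assumptions made for this example; only the sheet name `dataframe_sheet` comes from the code above.

```python
def lookup_formula(key_cell, data_sheet, first_col, last_col, col_index):
    """Build a VLOOKUP formula that reads from whole columns of another sheet."""
    # Quote the sheet name; embedded apostrophes are doubled in sheet references
    quoted = "'" + data_sheet.replace("'", "''") + "'"
    return (
        f"=VLOOKUP({key_cell},{quoted}!${first_col}:${last_col},"
        f"{col_index},FALSE)"
    )


# Look up the value in A2 within columns A to C of the data sheet,
# returning the matching entry from the second column
formula = lookup_formula("A2", "dataframe_sheet", "A", "C", 2)
print(formula)  # =VLOOKUP(A2,'dataframe_sheet'!$A:$C,2,FALSE)
```

With openpyxl, assigning a string that starts with `=` to a cell stores it as a formula, so a string like this could be written into the template's layout sheet once. The formula then picks up whatever data is appended to the data sheet later.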

I haven't tried inserting the image yet, but this should work similarly. I think there could be a way to improve performance by simply dumping the dataframe directly into the xlsx file (which is a zipped archive of XML files), so that openpyxl isn't needed.
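
The direct-dump idea could look roughly like the sketch below. It is untested against LibreOffice and uses only the standard library. It renders rows as minimal SpreadsheetML and swaps that XML into a copy of the archive. The function names are hypothetical, and the member path of the data sheet (something like `xl/worksheets/sheet2.xml`) is an assumption that depends on the actual template.

```python
import io
import numbers
import zipfile
from xml.sax.saxutils import escape

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def column_letter(index):
    """Convert a 1-based column index to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def rows_to_sheet_xml(rows):
    """Render rows of numbers/strings as a minimal worksheet XML document."""
    out = [f'<?xml version="1.0" encoding="UTF-8"?><worksheet xmlns="{NS}"><sheetData>']
    for r, row in enumerate(rows, start=1):
        out.append(f'<row r="{r}">')
        for c, value in enumerate(row, start=1):
            ref = f"{column_letter(c)}{r}"
            if isinstance(value, numbers.Real) and not isinstance(value, bool):
                # Numeric cells carry their value directly in <v>
                out.append(f'<c r="{ref}"><v>{value}</v></c>')
            else:
                # Inline strings avoid touching the shared-strings part
                text = escape(str(value))
                out.append(f'<c r="{ref}" t="inlineStr"><is><t>{text}</t></is></c>')
        out.append("</row>")
    out.append("</sheetData></worksheet>")
    return "".join(out)


def replace_member(src_bytes, member, new_content):
    """Return a copy of a zip archive with one member's content replaced."""
    buf = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = new_content if item.filename == member else src.read(item.filename)
            dst.writestr(item, data)
    return buf.getvalue()
```

A dataframe could be fed in as `[list(df.columns)] + df.values.tolist()`. Missing values (NaN) and dates would need extra handling that this sketch leaves out.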


