
Python - Is there a good intermediary format to export mixed data to multiple filetypes?

I have a model which generates output in the form of numpy arrays, text and plots. It currently holds this output as a dictionary.

The output needs to be presented in several formats, in particular PDF, Word and Excel.

My solution has been to write all data to an HTML string and export the HTML to a PDF using WeasyPrint. I would then export the table sections of the HTML to Excel. This works okay, but it's messy.

Is there an easier way to do this? In my mind, perhaps there is a module which would let you store the information in a dictionary and declare each item's data type, and then a process would handle its formatting and export to the various formats.

I wanted to answer my own question, to demonstrate what I implemented as a solution.

Because the data was mixed (text, numbers, plots), I took two approaches:

  • A Report class which had the capability of exporting txt, html, docx and pdf
  • A Workbook class, which had the capability of exporting xlsx and csv

Both classes inherited the same data structure, which was a nested dictionary containing numbers and metadata. The Report class then grabbed additional text and created plots from the data.

For example, the data resembled this structure:

data = {
    "Some Label":{
        "An Item":[1,2,3,4,5]
    }
}
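As a rough illustration of how two exporters might share that nested dictionary, here is a minimal sketch. The `Exporter` base class and the `groups`, `to_txt` and `to_rows` names are hypothetical, made up for this example rather than taken from my actual code, and it uses only the standard library.

```python
from dataclasses import dataclass, field


@dataclass
class Exporter:
    """Hypothetical base class holding the nested dict shared by both exporters."""
    data: dict
    metadata: dict = field(default_factory=dict)

    def groups(self):
        """Yield (label, item, values) for every list in the nested dict."""
        for label, items in self.data.items():
            for item, values in items.items():
                yield label, item, list(values)


class Report(Exporter):
    def to_txt(self) -> str:
        """Render each group as one line of plain text."""
        lines = []
        for label, item, values in self.groups():
            joined = ", ".join(str(v) for v in values)
            lines.append(f"{label} / {item}: {joined}")
        return "\n".join(lines)


class Workbook(Exporter):
    def to_rows(self) -> list:
        """Flatten each group into one spreadsheet-style row."""
        return [[label, item, *values] for label, item, values in self.groups()]


data = {"Some Label": {"An Item": [1, 2, 3, 4, 5]}}
print(Report(data).to_txt())
print(Workbook(data).to_rows())
```

The point of the shared base is that the data only has to be walked in one place; each subclass just decides how a group is presented.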

The Report class built an HTML string using Dominate, and could export it in three ways: as HTML by rendering the string directly, as PDF by feeding the rendered HTML into WeasyPrint, or as Docx (or, in theory, some other format) by converting the rendered HTML via PyPandoc.
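A minimal sketch of that HTML-first pipeline might look like the following. Note the assumptions: the HTML is assembled with plain string formatting instead of Dominate so the example runs without extra packages, the function names are hypothetical, and the WeasyPrint and PyPandoc imports are deferred until the PDF or Docx export is actually called (PyPandoc also needs a pandoc binary installed).

```python
from html import escape


def render_html(data: dict, title: str = "Report") -> str:
    """Build one HTML table per top-level label (stand-in for the Dominate step)."""
    parts = [f"<html><head><title>{escape(title)}</title></head><body>"]
    for label, items in data.items():
        parts.append(f"<h2>{escape(label)}</h2><table>")
        for item, values in items.items():
            cells = "".join(f"<td>{escape(str(v))}</td>" for v in values)
            parts.append(f"<tr><th>{escape(item)}</th>{cells}</tr>")
        parts.append("</table>")
    parts.append("</body></html>")
    return "".join(parts)


def export_pdf(html: str, path: str) -> None:
    """Write the rendered HTML to a PDF file with WeasyPrint."""
    from weasyprint import HTML  # third-party, imported only when needed
    HTML(string=html).write_pdf(path)


def export_docx(html: str, path: str) -> None:
    """Convert the rendered HTML to a Docx file with PyPandoc."""
    import pypandoc  # third-party, imported only when needed
    pypandoc.convert_text(html, "docx", format="html", outputfile=path)


html = render_html({"Some Label": {"An Item": [1, 2, 3, 4, 5]}})
print(html)
```

Because every format starts from the same rendered string, adding another output mostly means adding another small converter function.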

The Workbook class iterated through the dictionaries of values, wrote groups of these values to Pandas dataframes, and exported them to a workbook using pd.ExcelWriter. The same dataframes could be exported to csv and compressed into a zip file using an adapted solution found here.
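The workbook side could be sketched roughly as below. This is not the adapted zip solution mentioned above, just one way to do it under some assumptions: the helper names are hypothetical, all lists under one label are assumed to have equal length, and the CSV files are written with the standard-library csv module rather than `DataFrame.to_csv` so the zip step runs without pandas. The Excel export does use pandas and needs an Excel engine such as openpyxl.

```python
import csv
import io
import zipfile


def to_tables(data: dict) -> dict:
    """Turn each top-level label into (header, rows), one column per item."""
    tables = {}
    for label, items in data.items():
        header = list(items)
        rows = [list(r) for r in zip(*items.values())]  # assumes equal-length lists
        tables[label] = (header, rows)
    return tables


def export_xlsx(tables: dict, path: str) -> None:
    """Write one sheet per label using pd.ExcelWriter."""
    import pandas as pd  # third-party, imported only when needed
    with pd.ExcelWriter(path) as writer:
        for label, (header, rows) in tables.items():
            df = pd.DataFrame(rows, columns=header)
            # Excel limits sheet names to 31 characters.
            df.to_excel(writer, sheet_name=label[:31], index=False)


def export_csv_zip(tables: dict) -> bytes:
    """Write one CSV per label into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for label, (header, rows) in tables.items():
            text = io.StringIO()
            writer = csv.writer(text, lineterminator="\n")
            writer.writerow(header)
            writer.writerows(rows)
            zf.writestr(f"{label}.csv", text.getvalue())
    return buf.getvalue()


tables = to_tables({"Some Label": {"An Item": [1, 2, 3, 4, 5]}})
archive = export_csv_zip(tables)
```

Building the archive in memory means the bytes can be saved to disk or returned from a web handler without creating temporary csv files.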

