简体   繁体   中英

How to import an XML file into an Excel XLS file template using Python?

I have an Excel template file which has columns as defined within an XML. Manually I can right click over the template then XML > import, and select the XML file, and finally save the file.

How can I perform this task automatically programming it in Python ?

Example of XML file:

<DCPowerFlow>
    <branches>
        <branch>
            <busFrom name="bus_one" number="1" />
            <busTo name="bus_two" number="2" />
            <id>1</id>
            <rateA>1000</rateA>
            <resultPowerFlow>
                <probOverFlow>0.0</probOverFlow>
                <maxOverFlow>800</maxOverFlow>
            </resultPowerFlow>
        </branch>
        <branch>
            <busFrom name="bus_two" number="2" />
            <busTo name="bus_three" number="3" />
            <id>1</id>
            <rateA>1200</rateA>
            <resultPowerFlow>
                <probOverFlow>0.1</probOverFlow>
                <maxOverFlow>1300</maxOverFlow>
            </resultPowerFlow>
        </branch>
    </branches>
</DCPowerFlow>

To check for the manual task:

  1. Save the above example as an XML file.
  2. In order to create the Excel XLS template you can simply open the above XML example with Excel, ensure no data is on the template (delete data if within the import you added any) and save the file as XLS.
  3. Import the example XML file. Right click on the Excel template file created, then XML > Import, and select the XML example file.
  4. Save Template with data as a new XLS.

So what I need to do is to automate steps 3 and 4 .

Have you tried using BeautifulSoup and Pandas? Note that the parser I use in the following script requires you to have lxml installed already. If you don't have it just pip install lxml.

import pandas as pd
from bs4 import BeautifulSoup

file = open("file.xml", 'r')
soup = BeautifulSoup(file, 'lxml')
df = pd.DataFrame({'ids': [x.text for x in soup.find_all('id')]})
df.to_excel('data.xls')

While you will have to figure out how you want to parse your file, that will give you the tools that you need. If you need more information about how to parse the file, try visiting the BeautifulSoup documentation . Using this code you can loop through all the files that you are interested in and parsing them into dataframes, then exporting them using the to_excel method.

Finally I could figure this out using win32com.client module.

I used the following code successfully to import an xml to an existing Excel xlsx file I use as template, and then save it with a different name:

import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open("D:/tmp/template.xlsx")
wb.XmlImport("D:/tmp/result.xml")
wb.SaveAs("D:\\tmp\\result.xlsx")
wb.Close()

Methods for Excel workbooks can be found here . Also I had to take into account that the saveAs method doesn't support forward slashes .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM