
Parse XML using Python

<?xml version="1.0" encoding="UTF-8"?>
<WindowElement xmlns="http://windows.lbl.gov">
    <WindowElementType>System</WindowElementType>
    <Optical>
        <WavelengthData>
            <LayerNumber>System</LayerNumber>
            <Wavelength unit="Integral">Visible</Wavelength>
            <SourceSpectrum>CIE Illuminant D65 1nm.ssp</SourceSpectrum>
            <DetectorSpectrum>ASTM E308 1931 Y.dsp</DetectorSpectrum>
            <WavelengthDataBlock>
                <WavelengthDataDirection>Transmission Front</WavelengthDataDirection>
                <ColumnAngleBasis>LBNL/Klems Full</ColumnAngleBasis>
                <RowAngleBasis>LBNL/Klems Full</RowAngleBasis>
                <ScatteringDataType>BTDF</ScatteringDataType>
                <ScatteringData> 1, 2, 3, 3 
                             </ScatteringData>
            </WavelengthDataBlock>
        </WavelengthData>
        <WavelengthData>
            <LayerNumber>System</LayerNumber>
            <Wavelength unit="Integral">Visible</Wavelength>
            <SourceSpectrum>CIE Illuminant D65 1nm.ssp</SourceSpectrum>
            <DetectorSpectrum>ASTM E308 1931 Y.dsp</DetectorSpectrum>
            <WavelengthDataBlock>
                <WavelengthDataDirection>Transmission Back</WavelengthDataDirection>
                <ColumnAngleBasis>LBNL/Klems Full</ColumnAngleBasis>
                <RowAngleBasis>LBNL/Klems Full</RowAngleBasis>
                <ScatteringDataType>BTDF</ScatteringDataType>
                <ScatteringData> 555, 555
.......

How can I use Python to read the value 1, 2, 3, 3 in the ScatteringData element and change it to 5, 8, 8?

There are two elements called ScatteringData, and only the first one should be changed.

Thank you!

You should look at the libraries available for working with XML in Python. You could start here: http://wiki.python.org/moin/PythonXml
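As one concrete starting point, here is a minimal sketch using xml.etree.ElementTree from the standard library. It assumes a shortened, closed copy of the question's XML (the original is truncated), and the variable names are only illustrative. Element names are case-sensitive and the document declares a default namespace, so the search has to include that namespace.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the question's file. This is an
# assumption: the original is truncated, so the closing tags are filled in.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<WindowElement xmlns="http://windows.lbl.gov">
    <Optical>
        <WavelengthData>
            <WavelengthDataBlock>
                <ScatteringData> 1, 2, 3, 3 </ScatteringData>
            </WavelengthDataBlock>
        </WavelengthData>
        <WavelengthData>
            <WavelengthDataBlock>
                <ScatteringData> 555, 555 </ScatteringData>
            </WavelengthDataBlock>
        </WavelengthData>
    </Optical>
</WindowElement>"""

NS = {"w": "http://windows.lbl.gov"}  # the prefix "w" is arbitrary

# Keep the default namespace unprefixed when the tree is serialized again.
ET.register_namespace("", NS["w"])

root = ET.fromstring(XML)

# find() returns the first match in document order, so only the first
# ScatteringData element is changed.
first = root.find(".//w:ScatteringData", NS)
original = first.text.strip()
first.text = "5, 8, 8"

# Collect the text of every ScatteringData element to check the result.
values = [el.text.strip() for el in root.iterfind(".//w:ScatteringData", NS)]
print(original)
print(values)
```

To save the result, something like ET.ElementTree(root).write("out.xml", encoding="UTF-8", xml_declaration=True) should work, where out.xml is a placeholder file name.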

If you have to deal with XML, I suggest you take a look at lxml.

The lxml project describes it as the most feature-rich and easy-to-use library for processing XML and HTML in Python. It is also said to be faster and more robust than its alternatives. Do search Stack Overflow for lxml and the other libraries as well, because previous questions contain plenty of suggestions about which one to use.

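from lxml import etree as ET

# datafragment: the XML from the question, as bytes
root = ET.fromstring(datafragment)

# Element names are case-sensitive and live in the document's default namespace
ns = {'w': 'http://windows.lbl.gov'}
root.xpath('.//w:ScatteringData', namespaces=ns)[0].text = 'blah'

print(ET.tostring(root, pretty_print=True).decode())
# ...
# <ScatteringData>blah</ScatteringData>
# ...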

If you have to make changes in more than one place, use a loop:

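# The element names are namespaced and case-sensitive
for el in root.xpath('.//w:ScatteringData', namespaces={'w': 'http://windows.lbl.gov'}):
    el.text = 'smth'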

Here's a solution using Beautiful Soup. Basically, it lets you walk down to the data and modify it as you see fit.

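from bs4 import BeautifulSoup  # Beautiful Soup 4

# html.parser lowercases tag names; pass "xml" (requires lxml) to keep their case
with open("waves.xml") as f:
    soup = BeautifulSoup(f, "html.parser")
soup.scatteringdata.string = "5, 8, 8"  # attribute access returns the first match
print(soup.prettify())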

Which outputs:

  ...
  <scatteringdatatype>
    BTDF
   </scatteringdatatype>
   <scatteringdata>
    5, 8, 8
   </scatteringdata>
  </wavelengthdatablock>
  ...

If you want to take a look at the data first, you can use

originalData = soup.scatteringdata.string 

and then process it as you see fit.
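As one possible way to do that processing, the sketch below turns the comma-separated text into a list of numbers and back again. The helper names parse_values and format_values are hypothetical, and it assumes the values are plain decimal numbers separated by commas.

```python
def parse_values(text):
    """Split comma-separated text into floats, skipping empty fields."""
    return [float(field) for field in text.split(",") if field.strip()]


def format_values(values):
    """Join numbers back into the 'a, b, c' form used in the file."""
    return ", ".join(str(v) for v in values)


# Text as it appears inside the first ScatteringData element,
# including the surrounding whitespace.
original_data = " 1, 2, 3, 3 \n"

numbers = parse_values(original_data)
new_text = format_values([5, 8, 8])

print(numbers)
print(new_text)
```

The string in new_text can then be assigned back to the element in the same way as the literal "5, 8, 8" above.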
