Parsing XML with Python and xml.etree.ElementTree

I am trying to take XML data from the BambooHR API and then create users in our company Google account. Right now I am struggling to get through the XML. Every example I have seen has data with different tag names, whereas my elements all share the same tag name ('field') and are distinguished only by an id attribute.

Here's my XML response:

<?xml version="1.0"?>
<directory>
 <fieldset>
  <field id="displayName">Display name</field>
  <field id="firstName">First name</field>
  <field id="lastName">Last name</field>
  <field id="preferredName">Preferred name</field>
  <field id="jobTitle">Job title</field>
  <field id="mobilePhone">Mobile Phone</field>
  <field id="workEmail">Work Email</field>
  <field id="department">Department</field>
  <field id="location">Location</field>
  <field id="division">Division</field>
  <field id="linkedIn">LinkedIn URL</field>
  <field id="supervisor">Manager</field>
  <field id="photoUploaded">Employee photo</field>
  <field id="photoUrl">Photo URL</field>
  <field id="canUploadPhoto">Can Upload Photo</field>
 </fieldset>
 <employees>
  <employee id="379">
   <field id="displayName">test one</field>
   <field id="firstName">test</field>
   <field id="lastName">one</field>
   <field id="preferredName"></field>
   <field id="jobTitle">Assistant</field>
   <field id="mobilePhone">123456789</field>
   <field id="workEmail">test.one@email.com</field>
   <field id="department">Recruitment</field>
   <field id="location">Remote</field>
   <field id="division">company name</field>
   <field id="linkedIn"></field>
   <field id="supervisor">test supervisor</field>
   <field id="photoUploaded">true</field>
   <field id="photoUrl">"https://image.com"</field>
   <field id="canUploadPhoto">yes</field>
  </employee>
  <employee id="398">
   <field id="displayName">tester two</field>
   <field id="firstName">tester</field>
   <field id="lastName">two</field>
   <field id="preferredName"></field>
   <field id="jobTitle">Recruitment</field>
   <field id="mobilePhone">987654321</field>
   <field id="workEmail">tester.two@company.com</field>
   <field id="department">Recruitment</field>
   <field id="location">Remote</field>
   <field id="division">company</field>
   <field id="linkedIn"></field>
   <field id="supervisor">test supervisor</field>
   <field id="photoUploaded">true</field>
   <field id="photoUrl">"https://image.com"</field>
   <field id="canUploadPhoto">yes</field>
  </employee>
 </employees>
</directory>

Here's some code I've been working on. I pass the function an XML file that I created and saved locally after pulling it from the API.

import xml.etree.ElementTree as ET

def parse_XML(xml_file):
    tree = ET.parse(xml_file)
    root = tree.getroot()
    print('printing root....')
    print(root.tag, root.attrib)  # directory {}

    for emp in root.iter('employee'):
        for field in emp:
            # if this employee's work email is empty:
            #     get all of the user's data ready to send to Google to create a new account
            pass  # placeholder so the unfinished loop still parses

So I'm trying to see whether the user's work email is empty (""), which means they don't have a Google account. If it is, I will send the user's info to Google to create an account.

The problem I've run into is that, since the tags are all the same, I'm having trouble getting the value of a particular field.

If you can help turn all the users into a list of employees, or recommend the best way to accomplish this, that would be great. Or ask questions and I can try to clarify things.

The loop for emp in root.iter('employee') already iterates over all the <employee> nodes. All you have left to do is iterate over the <field> nodes and check whether the content of the workEmail field is empty. Note that ElementTree reports the text of an empty element as None rather than '', so the check has to cover both:

for emp in root.iter('employee'):
    for field in emp.iter('field'):
        # .text is None for an empty element such as <field id="workEmail"></field>
        if field.get('id') == 'workEmail' and not (field.text or '').strip():
            print('this employee has no email')
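
To see why a plain comparison against '' is not enough, here is a minimal sketch. The XML snippet is a shortened, made-up sample in the same shape as the response above, not real BambooHR output:

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample shaped like one <employee> from the response.
SAMPLE = """<employee id="1">
  <field id="firstName">test</field>
  <field id="workEmail"></field>
</employee>"""

emp = ET.fromstring(SAMPLE)
email_field = emp.find('field[@id="workEmail"]')

text_is_none = email_field.text is None           # empty element: text is None
equals_empty = email_field.text == ''             # so this comparison is False
is_blank = not (email_field.text or '').strip()   # covers None, '' and whitespace

print(text_is_none, equals_empty, is_blank)
```

Normalising with (text or '').strip() also treats whitespace-only values as empty, which may or may not be what you want for your data.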

UPDATE: Since you are interested in only a few specific <field> nodes, it may be simpler to use an XPath expression to get the ones you need instead of the inner loop:

for emp in root.iter('employee'):
    # findtext returns '' when the element exists but is empty,
    # and the given default when the element is missing
    email = emp.findtext('field[@id="workEmail"]', default='')
    if not email.strip():
        first_name = emp.findtext('field[@id="firstName"]', default='')
        last_name = emp.findtext('field[@id="lastName"]', default='')
        # do whatever with these details
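
If you would rather have all the users as a list of employees, one possible approach is to turn each <employee> into a dict keyed by the id attribute of its fields. This is only a sketch: the function name employees_from_xml and the 'employeeId' key are made up for illustration, and the sample XML is a trimmed version of the response in which the second employee's workEmail has been emptied on purpose.

```python
import xml.etree.ElementTree as ET


def employees_from_xml(xml_text):
    """Return one dict per <employee>, keyed by each field's id attribute."""
    root = ET.fromstring(xml_text)
    employees = []
    for emp in root.iter('employee'):
        # 'employeeId' is a hypothetical key holding the <employee id="..."> value.
        record = {'employeeId': emp.get('id')}
        for field in emp.iter('field'):
            # Empty elements have text None, so normalise them to ''.
            record[field.get('id')] = (field.text or '').strip()
        employees.append(record)
    return employees


# Trimmed, partly made-up sample: employee 398 has no work email here.
SAMPLE = """<directory>
 <employees>
  <employee id="379">
   <field id="firstName">test</field>
   <field id="lastName">one</field>
   <field id="workEmail">test.one@email.com</field>
  </employee>
  <employee id="398">
   <field id="firstName">tester</field>
   <field id="lastName">two</field>
   <field id="workEmail"></field>
  </employee>
 </employees>
</directory>"""

employees = employees_from_xml(SAMPLE)

# Employees without a work email are the ones that still need an account.
needs_account = [e for e in employees if not e.get('workEmail')]
for e in needs_account:
    print(e['firstName'], e['lastName'])
```

With the data in plain dicts, the part that talks to Google can stay separate from the XML handling. To read from the saved file instead of a string, ET.parse(xml_file).getroot() can replace the ET.fromstring call.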
