Extracting XML tag values

I have a list of XML files, and I need to extract 3 values from each of them. The XML looks something like this:

<ClinicalDocument xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" moodCode="EVN" xmlns="urn:hl7-org:v3">
  <title>Summary</title>

  <recordTarget>
    <patientRole>
      <patient>
        <name>
          <given>John</given>
          <given>S</given>
          <family>Doe</family>
        </name>
       <birthTime value="19480503" />

I'm trying to extract given name, family name and birth time.

Initially I'm trying to print out the values using:

XmlDocument doc2 = new XmlDocument();
doc2.Load(@"Z:\\DATA\\file.XML");

XmlElement root = doc2.DocumentElement;
XmlNodeList list = root.GetElementsByTagName("name");
for (int i = 0; i < list.Count; i++)
{
    Console.WriteLine(list.Item(i).Value);
}

I'm not getting any value printed, but when I debug and check the inner values of "list" I can see what I need from that tag.

How can I extract the needed information?

Your code and all the other answers ignore the default namespace xmlns="urn:hl7-org:v3".

I find LINQ to XML easier to use, so I'll post an answer using it.

// Requires: using System; using System.Globalization; using System.Linq;
//           using System.Xml; using System.Xml.Linq; using System.Xml.XPath;
var xDoc = XDocument.Load(filename);

var @namespace = "urn:hl7-org:v3";

XmlNamespaceManager namespaceManager = new XmlNamespaceManager(xDoc.CreateNavigator().NameTable);
namespaceManager.AddNamespace("ns", @namespace);
XNamespace ns = @namespace;

var names = xDoc.XPathSelectElements("//ns:patient/ns:name", namespaceManager).ToList();

var list = names.Select(p => new
                 {
                     Given = string.Join(", ", p.Elements(ns + "given").Select(x => (string)x)),
                     Family = (string)p.Element(ns + "family"),
                     // The value attribute holds a yyyyMMdd date string (e.g. "19480503"), not a Unix timestamp
                     BirthTime = DateTime.ParseExact((string)p.Parent.Element(ns + "birthTime").Attribute("value"), "yyyyMMdd", CultureInfo.InvariantCulture)
                 })
           .ToList();

Try this instead:

XmlDocument doc2 = new XmlDocument();
doc2.Load(@"Path\To\XmlFile.xml");

XmlElement root = doc2.DocumentElement;
XmlNodeList list = root.GetElementsByTagName("name");

var names = list[0].ChildNodes;

for (int i = 0; i < names.Count; i++)
{
    Console.WriteLine(names[i].InnerText);
}

Output:

John
S
Doe

There are 2 issues with your code:

  • First, you were iterating over the list of name elements, which has a Count of 1 (there is only one name element in the sample). That's why I used list[0].ChildNodes: it returns all the children of the name element (given, given and family).

  • Second, to retrieve the text inside each element ("John", "S", "Doe"), use InnerText instead of Value. For an element node, Value is always null, which is why no text appeared.
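
The difference between Value and InnerText is easy to check in isolation. The following is a minimal sketch, not part of the original answer; the class name ValueVsInnerText and its Inspect helper are made up for illustration, and the XML is a stripped-down fragment without the namespace.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, for illustration only.
static class ValueVsInnerText
{
    // Returns Value and InnerText of the first element with the given tag name.
    public static (string Value, string InnerText) Inspect(string xml, string tagName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.GetElementsByTagName(tagName)[0];
        // For an element node, Value is null; InnerText concatenates
        // the text of all descendant nodes.
        return (node.Value, node.InnerText);
    }

    public static void Demo()
    {
        var (value, text) = Inspect(
            "<name><given>John</given><family>Doe</family></name>", "name");
        Console.WriteLine(value ?? "(null)"); // prints "(null)"
        Console.WriteLine(text);              // prints "JohnDoe"
    }
}
```

Note that InnerText on the name element itself runs the child texts together, which is another reason to walk the children one by one as shown above.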

It's not clear from your example XML if there is only ever one <name> element or if there could be multiple. The following assumes there might be multiple. It also grabs the birthdate.

for (int i = 0; i < list.Count; i++)
{
    var xmlNode = list.Item(i).FirstChild;

    while (xmlNode != null)
    {
        Console.WriteLine(xmlNode.InnerText);
        xmlNode = xmlNode.NextSibling;
    }
}

XmlNodeList birthDates = root.GetElementsByTagName("birthTime");

for (int i = 0; i < birthDates.Count; i++)
{
    Console.WriteLine(birthDates[i].Attributes["value"].Value);
}
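
Reading names and birth dates from two separate lists only lines up if every patient has both. As an alternative sketch, assuming the structure from the question (patient containing name and birthTime), you could walk each patient element and read everything relative to it. The PatientReader class and the "h" prefix are names chosen here for illustration; the namespace URI is the one from the question.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper class, for illustration only.
static class PatientReader
{
    // Returns one "given names, family name, birth time" line per patient.
    public static List<string> Read(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Bind the document's default namespace to a prefix usable in XPath.
        var nsm = new XmlNamespaceManager(doc.NameTable);
        nsm.AddNamespace("h", "urn:hl7-org:v3");

        var lines = new List<string>();
        foreach (XmlNode patient in doc.SelectNodes("//h:patient", nsm))
        {
            var given = new List<string>();
            foreach (XmlNode g in patient.SelectNodes("h:name/h:given", nsm))
                given.Add(g.InnerText);

            // ?. leaves the value null if the element or attribute is missing.
            string family = patient.SelectSingleNode("h:name/h:family", nsm)?.InnerText;
            string birth = patient.SelectSingleNode("h:birthTime/@value", nsm)?.Value;

            lines.Add($"{string.Join(" ", given)} {family} {birth}");
        }
        return lines;
    }
}
```

For a complete document built from the sample in the question, Read should return a single line containing John S Doe 19480503.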

If there are multiple <patient> elements in your XML, you could do:

using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

class Program
{
    static void Main()
    {
        var doc = XDocument.Load("a.xml");
        var nsm = new XmlNamespaceManager(new NameTable());
        nsm.AddNamespace("x", "urn:hl7-org:v3");
        var patients = doc.XPathSelectElements("//x:patient", nsm);
        foreach (var patient in patients)
        {
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:given[1]", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:given[2]", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:family", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:birthTime", nsm).Attribute("value").Value);
        }
    }
}

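Why do you need to add the namespace explicitly even though it is the default namespace in the XML? XPath 1.0 has no concept of a default namespace: an unprefixed name in an XPath expression only matches elements that are in no namespace, so the namespace has to be bound to a prefix first. See: this answer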
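
To see the effect of the prefix, here is a small sketch that runs the same query with and without it. The DefaultNamespaceDemo class and its CountPatients method are hypothetical names used only for this illustration; the "x" prefix and the namespace URI match the code above.

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper class, for illustration only.
static class DefaultNamespaceDemo
{
    // Counts patient elements found by an unprefixed and by a prefixed XPath query.
    public static (int Unprefixed, int Prefixed) CountPatients(string xml)
    {
        var doc = XDocument.Parse(xml);
        var nsm = new XmlNamespaceManager(new NameTable());
        nsm.AddNamespace("x", "urn:hl7-org:v3");

        // Unprefixed: only matches elements that are in no namespace.
        int unprefixed = doc.XPathSelectElements("//patient").Count();
        // Prefixed: matches elements in urn:hl7-org:v3.
        int prefixed = doc.XPathSelectElements("//x:patient", nsm).Count();

        return (unprefixed, prefixed);
    }
}
```

For a document such as <ClinicalDocument xmlns="urn:hl7-org:v3"><patient /></ClinicalDocument>, this returns 0 for the unprefixed query and 1 for the prefixed one.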


 