I have a list of XML files, and I need to extract three values from each file. The XML looks somewhat like this:
<ClinicalDocument xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" moodCode="EVN" xmlns="urn:hl7-org:v3">
  <title>Summary</title>
  <recordTarget>
    <patientRole>
      <patient>
        <name>
          <given>John</given>
          <given>S</given>
          <family>Doe</family>
        </name>
        <birthTime value="19480503" />
I'm trying to extract the given name, family name, and birth time.
As a first step, I'm trying to print the values using:
XmlDocument doc2 = new XmlDocument();
doc2.Load(@"Z:\\DATA\\file.XML");
XmlElement root = doc2.DocumentElement;
XmlNodeList list = root.GetElementsByTagName("name");
for (int i = 0; i < list.Count; i++)
{
    Console.WriteLine(list.Item(i).Value);
}
I'm not getting any value printed, but when I debug and check the inner values of "list" I can see what I need from that tag.
How can I extract the needed information?
Your code and all the other answers ignore the default namespace xmlns="urn:hl7-org:v3".
I find LINQ to XML easier to use, so I'll post an answer using it.
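// Requires: using System; using System.Globalization; using System.Linq;
//           using System.Xml; using System.Xml.Linq; using System.Xml.XPath;
var xDoc = XDocument.Load(filename);
var @namespace = "urn:hl7-org:v3";
// Map a prefix to the default namespace so the XPath query can address the elements.
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(xDoc.CreateNavigator().NameTable);
namespaceManager.AddNamespace("ns", @namespace);
XNamespace ns = @namespace;
var names = xDoc.XPathSelectElements("//ns:patient/ns:name", namespaceManager).ToList();
var list = names.Select(p => new
{
    Given = string.Join(", ", p.Elements(ns + "given").Select(x => (string)x)),
    Family = (string)p.Element(ns + "family"),
    // birthTime is a sibling of <name>; in the sample its value is a yyyyMMdd date, not a Unix timestamp.
    BirthTime = DateTime.ParseExact((string)p.Parent.Element(ns + "birthTime").Attribute("value"), "yyyyMMdd", CultureInfo.InvariantCulture)
})
.ToList();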
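The same extraction can probably be written without XPath at all, since LINQ to XML can address namespaced elements through an XNamespace. The following is only a sketch: the class name PatientReader and the method Read are hypothetical, and it assumes every patient has a birthTime whose value attribute is in yyyyMMdd form, as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

static class PatientReader
{
    // The default namespace declared on the document's root element.
    static readonly XNamespace Ns = "urn:hl7-org:v3";

    // Returns one tuple per <patient> element found anywhere in the document.
    public static List<(string Given, string Family, DateTime BirthTime)> Read(XDocument doc)
    {
        return doc.Descendants(Ns + "patient")
            .Select(p => (
                // Join all <given> names of the patient with a space.
                Given: string.Join(" ", p.Elements(Ns + "name").Elements(Ns + "given").Select(g => (string)g)),
                // Null if the patient has no <family> element.
                Family: (string)p.Elements(Ns + "name").Elements(Ns + "family").FirstOrDefault(),
                // Assumption: birthTime is present and formatted as yyyyMMdd.
                BirthTime: DateTime.ParseExact(
                    (string)p.Element(Ns + "birthTime").Attribute("value"),
                    "yyyyMMdd", CultureInfo.InvariantCulture)))
            .ToList();
    }
}
```

Calling PatientReader.Read(XDocument.Load(filename)) would then give a list that can be iterated once per file. If some files carry a longer timestamp in birthTime, the format string would need to be adjusted.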
Try this instead:
XmlDocument doc2 = new XmlDocument();
doc2.Load(@"Path\To\XmlFile.xml");
XmlElement root = doc2.DocumentElement;
XmlNodeList list = root.GetElementsByTagName("name");
var names = list[0].ChildNodes;
for (int i = 0; i < names.Count; i++)
{
    Console.WriteLine(names[i].InnerText);
}
Output:
John
S
Doe
There are two issues with your code:
The first is that you were iterating over the list of name elements, which has a Count of 1 (as there is only one of these). That's why I used list[0].ChildNodes, to get all the children of the name element (given, given and family).
The second is that, to retrieve the text inside each element ("John", "S", "Doe"), you should use InnerText instead of Value. For an element node, Value is null, which is why nothing was printed.
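The difference between Value and InnerText can be checked with a small sketch like the one below. The class ValueVersusInnerText and its Inspect method are hypothetical, and the XML is a tiny inline stand-in rather than the asker's file.

```csharp
using System;
using System.Xml;

static class ValueVersusInnerText
{
    // Loads a small inline document and returns both properties
    // of its first <given> element for comparison.
    public static (string Value, string InnerText) Inspect()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<name xmlns=\"urn:hl7-org:v3\"><given>John</given><family>Doe</family></name>");
        XmlNode given = doc.GetElementsByTagName("given")[0];
        // Value is null for element nodes; InnerText concatenates the text of the children.
        return (given.Value, given.InnerText);
    }
}
```

Here Inspect() should return a null Value and an InnerText of "John", which matches the behaviour seen in the question.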
It's not clear from your example XML if there is only ever one <name> element or if there could be multiple. The following assumes there might be multiple. It also grabs the birthdate.
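// Assumes 'root' and 'list' from the question's code are in scope.
for (int i = 0; i < list.Count; i++)
{
    // Walk every child of this <name> element (given, given, family).
    var xmlNode = list.Item(i).FirstChild;
    while (xmlNode != null)
    {
        Console.WriteLine(xmlNode.InnerText);
        xmlNode = xmlNode.NextSibling;
    }
}
XmlNodeList birthDates = root.GetElementsByTagName("birthTime");
// Loop over the birthTime nodes themselves, not over the name list.
for (int i = 0; i < birthDates.Count; i++)
{
    Console.WriteLine(birthDates[i].Attributes["value"].Value);
}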
If there are multiple <patient> elements in your XML, you could do:
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;
class Program
{
    static void Main()
    {
        var doc = XDocument.Load("a.xml");
        var nsm = new XmlNamespaceManager(new NameTable());
        nsm.AddNamespace("x", "urn:hl7-org:v3");
        var patients = doc.XPathSelectElements("//x:patient", nsm);
        foreach (var patient in patients)
        {
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:given[1]", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:given[2]", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:name/x:family", nsm).Value);
            Console.WriteLine(patient.XPathSelectElement("./x:birthTime", nsm).Attribute("value").Value);
        }
    }
}
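Why do you need to add the namespace explicitly even if it's the default namespace in the XML? Because in XPath 1.0 an unprefixed name always refers to an element in no namespace, so a prefix has to be mapped to the default namespace URI before the query can match.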
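A short sketch can make the effect of the default namespace visible. The class DefaultNamespaceDemo and its two methods are hypothetical, and the inline XML is a cut-down stand-in for the real document.

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class DefaultNamespaceDemo
{
    // Cut-down document with the same default namespace as the question's XML.
    const string Xml =
        "<ClinicalDocument xmlns=\"urn:hl7-org:v3\"><patient><name><family>Doe</family></name></patient></ClinicalDocument>";

    // Unprefixed XPath: looks for <patient> in no namespace.
    public static int CountWithoutPrefix()
    {
        var doc = XDocument.Parse(Xml);
        return doc.XPathSelectElements("//patient").Count();
    }

    // Prefixed XPath: the prefix "x" is mapped to the default namespace URI.
    public static int CountWithPrefix()
    {
        var doc = XDocument.Parse(Xml);
        var nsm = new XmlNamespaceManager(new NameTable());
        nsm.AddNamespace("x", "urn:hl7-org:v3");
        return doc.XPathSelectElements("//x:patient", nsm).Count();
    }
}
```

With this input, CountWithoutPrefix() should find no elements, while CountWithPrefix() should find the single patient. The prefix name itself is arbitrary; only the URI has to match.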