
C# retrieve multiple XML attributes from specific elements for manipulation

I'm a C# beginner wrestling with a scenario I'd like your help with. I need to process an XML file whose content can differ each time I read it. On each read, I need to find specific attributes within specific elements, and the set of those elements and attributes may vary from file to file. Using LINQ to XML examples I found on this forum, I have managed to read a single attribute from a single element. Below is an example of the XML I am working with.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 wp14">
    <w:body>
        <w:p w14:paraId="2CBBB1B4" w14:textId="77777777" w:rsidR="00D9548A" w:rsidRDefault="00D9548A" w:rsidP="00ED7A0B"></w:p>
        <w:p w14:paraId="2CBBB1B5" w14:textId="77777777" w:rsidR="00ED548A" w:rsidRPr="00ED77B9" w:rsidRDefault="00C706DD" w:rsidP="00D9548A"></w:p>
        <w:pPr>
            <w:rPr>
                <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"></w:rFonts>
                <w:b></w:b>
                <w:sz w:val="40"></w:sz>
                <w:szCs w:val="40"></w:szCs>
            </w:rPr>
        </w:pPr>
        <w:r w:rsidRPr="00cool6F"></w:r>
        <w:tr w:rsidR="0029258E" w14:paraId="2CBBB242" w14:textId="77777777" w:rsidTr="0029258E"></w:tr>
    </w:body>
</w:document>

The interesting elements are 'w:p', 'w:r', 'w:tr', or pretty much any element that has any of the following attributes: w:rsidR, w:rsidRDefault, w:rsidRPr, w:rsidTr.

Ideally, I'd like to read each of these values into some sort of list or array so that I can change the values and write them right back to the attributes I grabbed them from.

The code I pieced together returns only one attribute, 'rsidR', from the 'w:p' elements in the file.

public static void foobar()
    {
        string strFile = @"C:\SourceFolder\SampleXML\document-test.xml";
        XDocument xDoc = XDocument.Load(strFile);
        XNamespace xNsp = xDoc.Root.Name.Namespace;

        var values = from rsids in xDoc.Descendants(xNsp + "p").Attributes(xNsp + "rsidR")
                     select rsids.Value.ToString();

        foreach (var v in values)
        {
            Console.WriteLine(v);
        }
    }

What is the best way to go after all the interesting attribute values in this file so that I can iterate through them, change the value and write it back to the XML file?

As always, I do appreciate your help!

Create a single XPath expression (if possible), or several, that match your attributes. Select the nodes, change the values (you would need to clarify whether the new value depends on the matched element), and save the XML back.

https://msdn.microsoft.com/en-us/library/d271ytdx%28v=vs.110%29.aspx
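
To make the XPath suggestion concrete, here is a minimal sketch, assuming the four attribute names listed in the question and the `w` namespace URI from the sample document. The class name `RsidXPathSketch`, the helper `SelectRsidAttributes`, the inline sample XML and the placeholder value `00000000` are all made up for illustration; how the new value should be computed is not stated in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class RsidXPathSketch
{
    // WordprocessingML main namespace, copied from the sample's xmlns:w declaration.
    private const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns every attribute node matched by one XPath union.
    public static List<XmlAttribute> SelectRsidAttributes(XmlDocument doc)
    {
        // The "w" prefix used in the XPath must be registered explicitly.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", W);

        XmlNodeList hits = doc.SelectNodes(
            "//@w:rsidR | //@w:rsidRDefault | //@w:rsidRPr | //@w:rsidTr", ns);

        // Copy into a list so the attributes can be modified while iterating.
        var result = new List<XmlAttribute>();
        foreach (XmlNode node in hits)
        {
            result.Add((XmlAttribute)node);
        }
        return result;
    }

    public static void Main()
    {
        // Small inline stand-in for document-test.xml (assumption, for a runnable demo).
        var doc = new XmlDocument();
        doc.LoadXml(
            "<w:document xmlns:w='" + W + "'><w:body>" +
            "<w:p w:rsidR='00D9548A' w:rsidRDefault='00D9548A'/>" +
            "<w:tr w:rsidR='0029258E' w:rsidTr='0029258E'/>" +
            "</w:body></w:document>");

        foreach (XmlAttribute attr in SelectRsidAttributes(doc))
        {
            Console.WriteLine("{0} : {1} : {2}", attr.OwnerElement.Name, attr.Name, attr.Value);
            attr.Value = "00000000"; // placeholder; the real replacement rule is up to you
        }

        // With a real file, doc.Save(path) would write the changed attributes back.
        Console.WriteLine(doc.OuterXml);
    }
}
```

Because each `XmlAttribute` in the list is a live node of the document, assigning to `Value` changes the document itself, so nothing has to be mapped back before saving.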

After you load the document into an XmlDocument (available in the System.Xml namespace), you can create multiple XmlNodeLists and then fetch the values.

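         public static void foobar()
         {
            string strFile = @"C:\SourceFolder\SampleXML\document-test.xml";
            // Load is an instance method, so create the document first.
            XmlDocument doc = new XmlDocument();
            doc.Load(strFile);

            // XPath prefixes such as "w:" must be registered with a namespace manager.
            XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
            ns.AddNamespace("w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main");

            // w:body is a child of the root w:document element, so search from the root.
            if (doc.SelectSingleNode("//w:body", ns) != null)
            {
               XmlNodeList nodes = doc.SelectNodes(".//w:pPr", ns);
               foreach (XmlNode xn in nodes)
               {
                   XmlNodeList rPr = xn.SelectNodes("w:rPr", ns);
                   foreach (XmlNode xnrPr in rPr)
                   {
                      XmlNode rFonts = xnrPr.SelectSingleNode("w:rFonts", ns);
                      if (rFonts != null)
                      {
                          // w:rFonts keeps its data in attributes, so InnerText would be empty.
                          Console.WriteLine(rFonts.OuterXml);
                      }
                   }
               }
            }
         }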

The ugly piece of code below seems to do what I want it to do. I would gladly accept any more streamlined or efficient code that does the same thing. The outstanding problems to solve are:

1) Get just the values of the attributes.
2) Aggregate the attribute values into an array which I can iterate over.

public static void shaqfoo()
    {
        string strFile = @"C:\SourceFolder\SampleXML\document (3).xml";
        using (XmlReader reader = XmlReader.Create(strFile))
        {
            while (reader.Read())
            {
                if (reader.IsStartElement())
                {
                    switch (reader.Name)
                    {
                        case "w:p":

                            string wp_rsidRAttrib = reader["w:rsidR"];
                            string wp_rsidRDefaultAttrib = reader["w:rsidRDefault"];
                            string wp_rsidPAttrib = reader["w:rsidP"];
                            string wp_rsidRPrAttrib = reader["w:rsidRPr"];
                            string wp_rsidTrAttrib = reader["w:rsidTr"];

                            if (wp_rsidRAttrib != null)
                            {
                                Console.WriteLine("w:p : w:rsidR : {0}", wp_rsidRAttrib);
                            }
                            if (wp_rsidRDefaultAttrib != null)
                            {
                                Console.WriteLine("w:p : w:rsidRDefault : {0}", wp_rsidRDefaultAttrib);
                            }
                            if (wp_rsidPAttrib != null)
                            {
                                Console.WriteLine("w:p : w:rsidP : {0}", wp_rsidPAttrib);
                            }
                            if (wp_rsidRPrAttrib != null)
                            {
                                Console.WriteLine("w:p : w:rsidRPr : {0}", wp_rsidRPrAttrib);
                            }
                            if (wp_rsidTrAttrib != null)
                            {
                                Console.WriteLine("w:p : w:rsidTr : {0}", wp_rsidTrAttrib);
                            }
                            break;

                        case "w:r":

                            string wr_rsidRAttrib = reader["w:rsidR"];
                            string wr_rsidRDefaultAttrib = reader["w:rsidRDefault"];
                            string wr_rsidPAttrib = reader["w:rsidP"];
                            string wr_rsidRPrAttrib = reader["w:rsidRPr"];
                            string wr_rsidTrAttrib = reader["w:rsidTr"];

                            if (wr_rsidRAttrib != null)
                            {
                                Console.WriteLine("w:r : w:rsidR : {0}", wr_rsidRAttrib);
                            }
                            if (wr_rsidRDefaultAttrib != null)
                            {
                                Console.WriteLine("w:r : w:rsidRDefault : {0}", wr_rsidRDefaultAttrib);
                            }
                            if (wr_rsidPAttrib != null)
                            {
                                Console.WriteLine("w:r : w:rsidP : {0}", wr_rsidPAttrib);
                            }
                            if (wr_rsidRPrAttrib != null)
                            {
                                Console.WriteLine("w:r : w:rsidRPr : {0}", wr_rsidRPrAttrib);
                            }
                            if (wr_rsidTrAttrib != null)
                            {
                                Console.WriteLine("w:r : w:rsidTr : {0}", wr_rsidTrAttrib);
                            }
                            break;

                        case "w:tr":

                            string wtr_rsidRAttrib = reader["w:rsidR"];
                            string wtr_rsidRDefaultAttrib = reader["w:rsidRDefault"];
                            string wtr_rsidPAttrib = reader["w:rsidP"];
                            string wtr_rsidRPrAttrib = reader["w:rsidRPr"];
                            string wtr_rsidTrAttrib = reader["w:rsidTr"];

                            if (wtr_rsidRAttrib != null)
                            {
                                Console.WriteLine("w:tr : w:rsidR : {0}", wtr_rsidRAttrib);
                            }
                            if (wtr_rsidRDefaultAttrib != null)
                            {
                                Console.WriteLine("w:tr : w:rsidRDefault : {0}", wtr_rsidRDefaultAttrib);
                            }
                            if (wtr_rsidPAttrib != null)
                            {
                                Console.WriteLine("w:tr : w:rsidP : {0}", wtr_rsidPAttrib);
                            }
                            if (wtr_rsidRPrAttrib != null)
                            {
                                Console.WriteLine("w:tr : w:rsidRPr : {0}", wtr_rsidRPrAttrib);
                            }
                            if (wtr_rsidTrAttrib != null)
                            {
                                Console.WriteLine("w:tr : w:rsidTr : {0}", wtr_rsidTrAttrib);
                            }
                            break;

                        case "w:sectPr":

                            string wsPr_rsidRAttrib = reader["w:rsidR"];
                            string wsPr_rsidRDefaultAttrib = reader["w:rsidRDefault"];
                            string wsPr_rsidPAttrib = reader["w:rsidP"];
                            string wsPr_rsidRPrAttrib = reader["w:rsidRPr"];
                            string wsPr_rsidTrAttrib = reader["w:rsidTr"];

                            if (wsPr_rsidRAttrib != null)
                            {
                                Console.WriteLine("w:sectPr : w:rsidR : {0}", wsPr_rsidRAttrib);
                            }
                            if (wsPr_rsidRDefaultAttrib != null)
                            {
                                Console.WriteLine("w:sectPr : w:rsidRDefault : {0}", wsPr_rsidRDefaultAttrib);
                            }
                            if (wsPr_rsidPAttrib != null)
                            {
                                Console.WriteLine("w:sectPr : w:rsidP : {0}", wsPr_rsidPAttrib);
                            }
                            if (wsPr_rsidRPrAttrib != null)
                            {
                                Console.WriteLine("w:sectPr : w:rsidRPr : {0}", wsPr_rsidRPrAttrib);
                            }
                            if (wsPr_rsidTrAttrib != null)
                            {
                                Console.WriteLine("w:sectPr : w:rsidTr : {0}", wsPr_rsidTrAttrib);
                            }
                            break;
                    }
                }
            }
        }
    }
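
For what it's worth, the two outstanding problems can probably be handled in one pass with LINQ to XML. The following is only a sketch under a few assumptions: the attribute names are the five read in the code above, every element in the document is scanned rather than just `w:p`, `w:r`, `w:tr` and `w:sectPr`, and the class `RsidLinqSketch`, the helper `CollectRsidAttributes`, the inline sample XML and the placeholder value `00000000` are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RsidLinqSketch
{
    // WordprocessingML main namespace, copied from the sample's xmlns:w declaration.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Local names of the attributes of interest; extend the set as needed.
    private static readonly HashSet<string> RsidNames = new HashSet<string>
    {
        "rsidR", "rsidRDefault", "rsidP", "rsidRPr", "rsidTr"
    };

    // Hypothetical helper: collects the matching attributes of every element into a list.
    public static List<XAttribute> CollectRsidAttributes(XDocument doc)
    {
        return doc.Descendants()
                  .Attributes()
                  .Where(a => a.Name.Namespace == W && RsidNames.Contains(a.Name.LocalName))
                  .ToList();
    }

    public static void Main()
    {
        // Small inline stand-in for the real file (assumption, for a runnable demo).
        XDocument doc = XDocument.Parse(
            "<w:document xmlns:w='" + W.NamespaceName + "'><w:body>" +
            "<w:p w:rsidR='00D9548A' w:rsidRDefault='00D9548A' w:rsidP='00ED7A0B'/>" +
            "<w:tr w:rsidR='0029258E' w:rsidTr='0029258E'/>" +
            "</w:body></w:document>");

        List<XAttribute> attrs = CollectRsidAttributes(doc);
        foreach (XAttribute a in attrs)
        {
            Console.WriteLine("{0} : {1} : {2}",
                a.Parent.Name.LocalName, a.Name.LocalName, a.Value);
            a.Value = "00000000"; // placeholder; the real replacement rule is up to you
        }

        // With a real file, doc.Save(path) would write the changed attributes back.
        Console.WriteLine(doc);
    }
}
```

The list holds the `XAttribute` objects themselves rather than copies of their strings, so setting `Value` edits the loaded document directly. If only the strings are needed, `attrs.Select(a => a.Value).ToArray()` gives a plain array.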
