
Processing a Word document using OpenXML and C#

I'm trying to populate the content controls in a Word document by matching each control's Tag and setting the text inside that content control.

The following displays in a MessageBox all of the tags I have in my document.

//Create a copy of the template file and open the document
File.Delete(hhscDocument);
File.Copy(hhscTemplate, hhscDocument, true);

//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{

    //Get the root Document element of the main document part
    var mainDocument = document.MainDocumentPart.Document;
    if (mainDocument.Body.Descendants<Tag>().Any())
    {
        //MessageBox.Show(mainDocument.Body.Descendants<Table>().Count().ToString());
        var tags = mainDocument.Body.Descendants<Tag>().ToList();
        var aString = string.Empty;
        foreach(var tag in tags)
        {
            aString += string.Format("{0}{1}", tag.Val, Environment.NewLine);
        }
        MessageBox.Show(aString);
    }
}

However, when I try the following to read the text of each content control, it doesn't work.

//Create a copy of the template file and open the document
File.Delete(hhscDocument);
File.Copy(hhscTemplate, hhscDocument, true);

//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{

    //Get the root Document element of the main document part
    var mainDocument = document.MainDocumentPart.Document;
    if (mainDocument.Body.Descendants<Tag>().Any())
    {
        //MessageBox.Show(mainDocument.Body.Descendants<Table>().Count().ToString());
        var tags = mainDocument.Body.Descendants<Tag>().ToList();
        var bString = string.Empty;
        foreach(var tag in tags)
        {
            bString += string.Format("{0}{1}", tag.Parent.GetFirstChild<Text>().Text, Environment.NewLine);
        }
        MessageBox.Show(bString);
    }
}

My end goal is this: when a tag matches, I want to populate or change the text in the content control that the tag belongs to.

I used FirstChild and InnerXml to pick apart the document's XML contents. From there I developed the following, which does what I need.

//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{
    var mainDocument = document.MainDocumentPart.Document;
    if (mainDocument.Body.Descendants<Tag>().Any())
    {
        //Find all elements(descendants) of type tag
        var tags = mainDocument.Body.Descendants<Tag>().ToList();

        //Foreach of these tags
        foreach (var tag in tags)
        {
            //Jump up two levels (.Parent.Parent) in the XML element and then jump down to the run level
            var run = tag.Parent.Parent.Descendants<Run>().ToList();

            //Use the first run because each content control here contains only one run
            run[0].GetFirstChild<Text>().Text = "<new_text_value>";
        }
    }
    mainDocument.Save();
}

This finds all the tags inside the document and stores the elements in a list:

var tags = mainDocument.Body.Descendants<Tag>().ToList();

This part of the code starts at the Tag element in the XML. From there I call Parent twice to move up two levels (from the tag to its properties element, then to the content control itself) so that I can reach the Run level using Descendants:

var run = tag.Parent.Parent.Descendants<Run>().ToList();
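To make the two-level jump concrete, here is a minimal sketch that builds a single inline content control in memory and inspects what sits above its tag. The class name SdtHierarchyDemo, its methods, and the tag value used below are hypothetical names chosen for illustration; the sketch assumes the DocumentFormat.OpenXml package is referenced and that the control is an inline (run-level) one.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, for illustration only.
public static class SdtHierarchyDemo
{
    // Builds an inline content control: sdt -> (sdtPr -> tag), (sdtContent -> r -> t)
    public static SdtRun BuildControl(string tagValue, string text)
    {
        return new SdtRun(
            new SdtProperties(new Tag { Val = tagValue }),
            new SdtContentRun(new Run(new Text(text))));
    }

    // Reports the tag's parent, its grandparent, and whether the parent
    // has a Text element as a direct child.
    public static string Describe(SdtRun sdt)
    {
        var tag = sdt.Descendants<Tag>().First();
        var parent = tag.Parent;             // the properties element
        var grandparent = tag.Parent.Parent; // the content control itself
        var hasTextUnderParent = parent.GetFirstChild<Text>() != null;
        return $"{parent.LocalName}/{grandparent.LocalName}/{hasTextUnderParent}";
    }
}
```

Calling SdtHierarchyDemo.Describe(SdtHierarchyDemo.BuildControl("CustomerName", "old")) should return "sdtPr/sdt/False". The tag's direct parent holds only properties and no Text element, so tag.Parent.GetFirstChild<Text>() returns null there; this is a likely reason the earlier attempt failed, and it is why the code climbs one level further before searching for runs.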

Finally, the following code stores a new value in the text of the plain text content control:

run[0].GetFirstChild<Text>().Text = "<new_text_value>";

One thing I noticed is that the XML hierarchy is a funky thing. I find it easier to access these elements from the bottom up, which is why I started with the tags and moved up.
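The same bottom-up idea can be extended to the original goal of matching a specific tag before changing the text. The following is only a sketch under some assumptions: the class ContentControlFiller and the method Fill are hypothetical names, the dictionary maps tag values to replacement text, and Ancestors<SdtElement>() is used in place of Parent.Parent so that the lookup does not depend on the exact nesting depth.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, for illustration only.
public static class ContentControlFiller
{
    // Sets the first Text element of every content control whose tag value
    // appears in 'values'. Returns the number of controls that were updated.
    public static int Fill(OpenXmlElement root, IDictionary<string, string> values)
    {
        var filled = 0;

        // Start at the tags and work upwards, as in the code above.
        foreach (var tag in root.Descendants<Tag>().ToList())
        {
            var key = tag.Val?.Value;
            if (key == null || !values.TryGetValue(key, out var newText))
            {
                continue; // no tag value, or no replacement supplied for it
            }

            // Nearest enclosing content control (block, run, or cell level).
            var sdt = tag.Ancestors<SdtElement>().FirstOrDefault();

            // First text element anywhere inside that control; skip if absent.
            var text = sdt?.Descendants<Text>().FirstOrDefault();
            if (text == null)
            {
                continue;
            }

            text.Text = newText;
            filled++;
        }

        return filled;
    }
}
```

Inside the using block it could be called as ContentControlFiller.Fill(mainDocument.Body, values), followed by mainDocument.Save(). Note that only the first Text element of each control is replaced, so a control holding several runs, or a nested content control, would need extra handling.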
