
How to take data from multiple nodes in an XML file and use them in a different text file using C#?

I have a sample XML file like the one below:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD with OASIS Tables v1.0 20120330//EN" "JATS-journalpublishing-oasis-article1.dtd">
<article article-type="proceedings" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://www.niso.org/standards/z39-96/ns/oasis-exchange/table">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id"/>
<journal-title-group>
<journal-title>Eleventh International Conference on Correlation Optics</journal-title>
</journal-title-group>
<issn pub-type="epub">0277-786X</issn>
<publisher>
<publisher-name>SPIE</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.236/12.205210</article-id>
<title-group>
<article-title>So you think you can dance?</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Cena</surname>
<given-names>John</given-names>
</name>
<xref ref-type="aff" rid="a1"><sup>a</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Pal</surname>
<given-names>G.S.</given-names>
</name>
<xref ref-type="aff" rid="a2"><sup>b</sup></xref>
</contrib>
<aff id="a1"><label><sup>a</sup></label>CNRS, France</aff>
<aff id="a2"><label><sup>b</sup></label>MIT, USA</aff>
</contrib-group>
</article-meta>
</front>
<body>
<sec id="S1">
<label>1.</label>
<p>Today is your lucky day</p>
</sec>
<sec id="S2">
<label>2.</label>
<p>Today is not so lucky</p>
</sec>
</body>
</article>

I'm trying to get the contents of some of the nodes (the first node found for each name), put them in variables, and then use them in regex replacements on a different TXT file in an efficient manner. I'm trying something like this:

    XDocument doc = XDocument.Load(@"D:\MyFiles\1235-12-3053\230\124\124.xml", LoadOptions.PreserveWhitespace);
    var s = from a in doc.Descendants("surname")
            select a.First();
    var l = from x in doc.Descendants("label")
            select x.First();
    ... so on
    File.WriteAllText(@"C:\Desktop\text.txt", Regex.Replace(File.ReadAllText(@"C:\Desktop\text.txt"), @"<a>[^<]+</a>", @"<a>s</a>"));
    File.WriteAllText(@"C:\Desktop\text.txt", Regex.Replace(File.ReadAllText(@"C:\Desktop\text.txt"), @"<b>[^<]+</b>", @"<b>l</b>"));
    ... so on

But the First() method gives an error. Also, is calling WriteAllText many times good practice? Can I do multiple replacements in one go?

Instead of reading and writing the file on disk multiple times, load the text into a string variable once.

Once it is in memory, you can run as many regex replacements as you need, and save the file only when you are completely done.

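Try something like this. Note that First() has to be called on the whole query result (the sequence of elements), not on each individual element inside the select, which is why your version does not compile: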

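    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    using System.Xml.Linq;

    // xmlPath and textPath are placeholders for your own XML and TXT file locations.
    XDocument doc = XDocument.Load(xmlPath, LoadOptions.PreserveWhitespace);
    // Take the first matching element from each sequence, then read its text.
    string s = doc.Descendants("surname").First().Value;
    string l = doc.Descendants("label").First().Value;

    // Read the text file once, apply every replacement in memory, write once.
    string text = File.ReadAllText(textPath);
    // A lambda inserts the value literally, so a '$' in the XML data is not treated as a substitution.
    text = Regex.Replace(text, @"<a>[^<]+</a>", m => "<a>" + s + "</a>");
    text = Regex.Replace(text, @"<b>[^<]+</b>", m => "<b>" + l + "</b>");
    File.WriteAllText(textPath, text);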

This should do it, if I understand correctly what you want from this program.
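
If you want all replacements to happen in a single pass over the text, one option is a single regex with an alternation over the placeholder names and a lookup table for the values. The following is only a rough sketch under some assumptions: the names TemplateFiller, FirstValue and Fill are made up for illustration, the XML and the template string are inline stand-ins for your files, and FirstOrDefault is used instead of First so that a missing element yields null (written out as an empty string) instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper class, not part of any library.
public static class TemplateFiller
{
    // Returns the text of the first descendant with the given name,
    // or null when the document has no such element.
    public static string FirstValue(XDocument doc, string elementName)
    {
        return doc.Descendants(elementName).FirstOrDefault()?.Value;
    }

    // Replaces the content of every <tag>...</tag> placeholder whose tag
    // is a key in 'values', scanning the text only once.
    public static string Fill(string text, IDictionary<string, string> values)
    {
        // Build an alternation such as "a|b" from the placeholder names.
        string names = string.Join("|", values.Keys.Select(k => Regex.Escape(k)));
        var pattern = new Regex("<(" + names + ")>[^<]+</\\1>");

        // The evaluator inserts the value literally, so a '$' in the
        // XML data is not treated as a substitution token.
        return pattern.Replace(text, m =>
        {
            string tag = m.Groups[1].Value;
            return "<" + tag + ">" + values[tag] + "</" + tag + ">";
        });
    }

    public static void Main()
    {
        // Inline stand-in for the XML file.
        var doc = XDocument.Parse(
            "<article><name><surname>Cena</surname></name>" +
            "<sec><label>1.</label></sec></article>");

        var values = new Dictionary<string, string>
        {
            ["a"] = FirstValue(doc, "surname"),
            ["b"] = FirstValue(doc, "label")
        };

        // Inline stand-in for the text file content.
        string template = "Author: <a>x</a>, section <b>y</b>";
        Console.WriteLine(Fill(template, values));
        // prints: Author: <a>Cena</a>, section <b>1.</b>
    }
}
```

With real files you would pass File.ReadAllText(...) into Fill and write the result back with a single File.WriteAllText call. Whether this is worth it over a few chained Regex.Replace calls depends on how many placeholders you have; for two or three, the chained version is probably easier to read.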

I thought there was also a method called UpdateText for updating a file in place, but I can't find one on System.IO.File. Reading the text, replacing in memory, and writing it back with WriteAllText is the usual approach; maybe someone else knows a better one.


 