Extracting XML data, modifying it, and storing it in an Excel file

I am new to ASP.NET. I have an XML file as follows:

<?xml version="1.0" encoding="iso-8859-1" ?>
<newsitem itemid="10000" id="root" date="1996-08-22" xml:lang="en">
  <title>CHINA: China says hopeful on global nuclear test ban.</title>
  <headline>China says hopeful on global nuclear test ban.</headline>
  <dateline>BEIJING 1996-08-22</dateline>
  <text>
    <p>China said on Thursday it was hopeful a global nuclear test ban treaty could be approved by the U.N. </p>
    <p>&quot;China hopes that the treaty could be open for signature by the end of the year and that there .</p>
  </text>
.....continue

The XML file is huge. I have to process only the terms in the <title> and <text> fields of each news item. I also have to count the frequency of those words.

I tried to extract the text from the title and text fields. I get data for the title field but not for the text field. Moreover, the title values I get are not unique; the entries are repeated. Please help me.

The code I tried is:

string filename = Server.MapPath("demo1.xml");
XmlTextReader reader = new XmlTextReader(filename);
XmlNodeType type;

while (reader.Read())
{
    type = reader.NodeType;

    if (type == XmlNodeType.Element)
    {
        if (reader.Name == "text")
        {
            reader.Read();
            TextBox1.Text = reader.Value;
        }

        if (reader.Name == "title")
        {
            reader.Read();
            ListBox1.Items.Add(reader.Value);
        }
    }
}
reader.Close();

In the list box I am getting data, but in the text box I am not. Moreover, I need to store a huge amount of XML data and count the number of occurrences of each word, for example china-2, says-1, and store the result in Excel. Should I use a StringBuilder, and if so, how?

This should get you started:

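using System;
using System.Linq;
using System.Xml.Linq;

// XElement.Load returns the root element (<newsitem> in the snippet above).
var xml = XElement.Load(@"C:\TEMP\TEST.xml");

// <title> is a direct child of the root.
var titleElement = xml.Elements("title").SingleOrDefault();
var title = titleElement != null ? titleElement.Value : String.Empty;

// Join the child elements of <text> with a space so that words at
// paragraph boundaries do not run together.
var textElement = xml.Elements("text").SingleOrDefault();
var text = textElement != null
               ? String.Join(" ", textElement.Elements()
                                             .Select(t => t.Value))
               : String.Empty;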

I am using your XML snippet above as an example. You'll want to adapt the code to your final XML structure, but I think the pattern above should be adaptable to your needs.
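Since the question mentions that the real file is huge, here is a rough sketch of one way to adapt the pattern so that only one news item is held in memory at a time. It combines XmlReader with XNode.ReadFrom. The class and method names (NewsItemReader, ReadItems) are made up for illustration, and the sketch assumes the file contains one or more <newsitem> elements, possibly inside a wrapper element, which the snippet in the question does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
public static class NewsItemReader
{
    // Yields one (title, text) pair per <newsitem> element.
    // Assumption: <title> and <text> are direct children of <newsitem>,
    // and <text> contains <p> paragraphs, as in the question's snippet.
    public static IEnumerable<KeyValuePair<string, string>> ReadItems(XmlReader reader)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "newsitem")
            {
                // ReadFrom materializes just this element and leaves the
                // reader positioned on the node that follows it.
                var item = (XElement)XNode.ReadFrom(reader);

                var title = (string)item.Element("title") ?? String.Empty;
                var textElement = item.Element("text");
                var text = textElement != null
                    ? String.Join(" ", textElement.Elements("p").Select(p => p.Value.Trim()))
                    : String.Empty;

                yield return new KeyValuePair<string, string>(title, text);
            }
            else
            {
                reader.Read();
            }
        }
    }
}
```

A caller would create the reader with something like XmlReader.Create(Server.MapPath("demo1.xml")) inside a using block and loop over ReadItems(reader) with foreach. Opening the file by path lets the reader honor the encoding declared in the XML header.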

The variable title will hold the text of the <title> element, and the variable text will hold the concatenated text of all elements found within the <text> element. You end up with String variables on which you can perform standard text processing to get word counts and so on.
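As an illustration of that text processing step, the following is a minimal sketch of counting words with a dictionary and writing the tallies as CSV, a format Excel opens directly. The names WordFrequency, AddWords and WriteCsv are invented for this example, and treating a word as a run of letters, digits or apostrophes is an assumption; adjust the pattern to whatever definition of "term" you need. Because the dictionary holds the tallies, a StringBuilder is probably not needed for the counting itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class WordFrequency
{
    // Adds the words of one string to a running tally, ignoring case.
    public static void AddWords(IDictionary<string, int> counts, string input)
    {
        // Assumption: a word is a run of letters, digits or apostrophes.
        foreach (Match m in Regex.Matches(input ?? String.Empty, @"[\p{L}\p{N}']+"))
        {
            var word = m.Value.ToLowerInvariant();
            int current;
            counts.TryGetValue(word, out current); // current stays 0 for a new word
            counts[word] = current + 1;
        }
    }

    // Writes "word,count" rows, most frequent first, then alphabetically.
    // Words contain no commas or quotes, so no CSV escaping is required.
    public static void WriteCsv(IDictionary<string, int> counts, TextWriter output)
    {
        output.WriteLine("word,count");
        foreach (var pair in counts.OrderByDescending(p => p.Value)
                                   .ThenBy(p => p.Key, StringComparer.Ordinal))
        {
            output.WriteLine("{0},{1}", pair.Key, pair.Value);
        }
    }
}
```

To use it, create a Dictionary<string, int>, call AddWords once with title and once with text for each news item, then pass a StreamWriter for a .csv file of your choice to WriteCsv. For the title in the question's snippet this tallies china as 2 and says as 1.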

Hope this helps!
