How can I populate C# classes from an XML document that has some embedded data?

I have an API that returns this:

http://services.aonaware.com/DictService/DictService.asmx?op=DefineInDict

<?xml version="1.0" encoding="utf-8"?>
<WordDefinition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://services.aonaware.com/webservices/">
  <Word>abandon</Word>
  <Definitions>
    <Definition>
      <Word>abandon</Word>
      <Dictionary>
        <Id>wn</Id>
        <Name>WordNet (r) 2.0</Name>
      </Dictionary>
      <WordDefinition>abandon
     n 1: the trait of lacking restraint or control; freedom from
          inhibition or worry; "she danced with abandon" [syn: {wantonness},
           {unconstraint}]
     2: a feeling of extreme emotional intensity; "the wildness of
        his anger" [syn: {wildness}]
     v 1: forsake, leave behind; "We abandoned the old car in the
          empty parking lot"
     2: stop maintaining or insisting on; of ideas, claims, etc.;
        "He abandoned the thought of asking for her hand in
        marriage"; "Both sides have to give up some calims in
        these negociations" [syn: {give up}]
     3: give up with the intent of never claiming again; "Abandon
        your life to God"; "She gave up her children to her
        ex-husband when she moved to Tahiti"; "We gave the
        drowning victim up for dead" [syn: {give up}]
     4: leave behind empty; move out of; "You must vacate your
        office by tonight" [syn: {vacate}, {empty}]
     5: leave someone who needs or counts on you; leave in the
        lurch; "The mother deserted her children" [syn: {forsake},
         {desolate}, {desert}]
</WordDefinition>
    </Definition>
  </Definitions>
</WordDefinition>

Here is the code I use to retrieve the XML data:

        WebRequest request = WebRequest.Create("http://services.aonaware.com/DictService/DictService.asmx/DefineInDict");
        request.Method = "POST";
        string postData = "dictId=wn&word=abandon";
        byte[] byteArray = Encoding.UTF8.GetBytes(postData);
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = byteArray.Length;
        Stream dataStream = request.GetRequestStream();
        dataStream.Write(byteArray, 0, byteArray.Length);
        dataStream.Close();
        WebResponse response = request.GetResponse();
        Console.WriteLine(((HttpWebResponse)response).StatusDescription);
        dataStream = response.GetResponseStream();
        StreamReader reader = new StreamReader(dataStream);
        string responseFromServer = reader.ReadToEnd();
        Console.WriteLine(responseFromServer);
        reader.Close();
        dataStream.Close();
        response.Close();

I would like to extract the data from the XML into a List<Definition>, where the classes look like this:

public class Def
{
    public string text { get; set; }
    public List<string> synonym { get; set; }
}

public class Definition
{
    public string type { get; set; } // single character: n or v or a 
    public List<Def> Def { get; set; }
}

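Can someone give me some advice on how to do this, and show how I can pick out the class elements from the XML and put them into the classes?

I think this question will be helpful to many other people, so I will open a large bounty in the hope that someone can take the time to come up with a good example.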

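Update:

Sorry. I made a mistake with the synonyms, and I have changed it now. I hope it makes more sense: the synonyms are just a list. I have also put what I need in bold, because the two answers so far do not seem to answer the question at all. Thanks.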

I created a simple parser for the word definition (pretty sure there is room for improvement here):

Solution 1.0

class ParseyMcParseface
{
    /// <summary>
    /// Word definition lines
    /// </summary>
    private string[] _text;

    /// <summary>
    /// Constructor (takes the InnerText of the nested WordDefinition element as input)
    /// </summary>
    /// <param name="text">innerText of the WordDefinition</param>
    public ParseyMcParseface(string text)
    {
        _text = text.Split(new [] {'\n'}, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1) // Skip the first line where the word is mentioned
            .ToArray();
    }

    /// <summary>
    /// Convert from single letter type to full human readable type
    /// </summary>
    /// <param name="c"></param>
    /// <returns></returns>
    private string CharToType(char c)
    {
        switch (c)
        {
            case 'a':
                return "Adjective";
            case 'n':
                return "Noun";
            case 'v':
                return "Verb";
            default:
                return "Unknown";
        }
    }

    /// <summary>
    /// Reorganize the data for easier parsing
    /// </summary>
    /// <param name="text">Lines of text</param>
    /// <returns></returns>
    private static List<List<string>> MakeLists(IEnumerable<string> text)
    {
        List<List<string>> types = new List<List<string>>();
        int i = -1;
        int j = 0;
        foreach (var line in text)
        {
            // New type (Noun, Verb, Adj.)
            if (Regex.IsMatch(line.Trim(), "^[avn]{1}\\ \\d+"))
            {
                types.Add(new List<string> { line.Trim() });
                i++;
                j = 0;
            }
            // New definition in the previous type
            else if (Regex.IsMatch(line.Trim(), "^\\d+"))
            {
                j++;
                types[i].Add(line.Trim());
            }
            // New line of the same definition
            else
            {
                types[i][j] = types[i][j] + " " + line.Trim();
            }
        }

        return types;
    }

    public List<Definition> Parse()
    {
        var definitionsLines = MakeLists(_text);

        List<Definition> definitions = new List<Definition>();

        foreach (var type in definitionsLines)
        {

            var defs = new List<Def>();
            foreach (var def in type)
            {
                var match = Regex.Match(def.Trim(), "(?:\\:\\ )(\\w|\\ |;|\"|,|\\.|-)*[\\[]{0,1}");
                MatchCollection syns = Regex.Matches(def.Trim(), "\\{(\\w|\\ )+\\}");

                List<string> synonymes = new List<string>();
                foreach (Match syn in syns)
                {
                    synonymes.Add(syn.Value.Trim('{', '}'));
                }

                defs.Add(new Def()
                {
                    text = match.Value.Trim(':', '[', ' '),
                    synonym = synonymes
                });
            }


            definitions.Add(new Definition
            {
                type = CharToType(type[0][0]),
                Def = defs
            });
        }
        return definitions;
    }
}

Here is a usage example:

WebRequest request = WebRequest.Create("http://services.aonaware.com/DictService/DictService.asmx/DefineInDict");
request.Method = "POST";
string postData = "dictId=wn&word=abandon";
byte[] byteArray = Encoding.UTF8.GetBytes(postData);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
WebResponse response = request.GetResponse();
Console.WriteLine(((HttpWebResponse)response).StatusDescription);
dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();


var doc = new XmlDocument();
doc.LoadXml(responseFromServer );
var el = doc.GetElementsByTagName("WordDefinition");

ParseyMcParseface parseyMcParseface = new ParseyMcParseface(el[1].InnerText);
var parsingResult = parseyMcParseface.Parse();
// parsingResult will contain a list of Definitions
// per the format specified in the question.

Here is a live demo: https://dotnetfiddle.net/24IQ67
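The usage example picks the nested element by position (el[1]). As an alternative, here is a minimal sketch that selects it by name and XML namespace with LINQ to XML, so it does not depend on element order. The class and method names (WordDefinitionXml, ExtractDefinitionTexts) are made up for illustration; the namespace URI is the one declared on the root element of the response shown in the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class WordDefinitionXml
{
    // Default namespace declared on the root element of the service response.
    static readonly XNamespace Ns = "http://services.aonaware.com/webservices/";

    // Returns the free text of every <WordDefinition> that sits directly
    // inside a <Definition>, skipping the outer <WordDefinition> root.
    public static string[] ExtractDefinitionTexts(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants(Ns + "Definition")
                  .Elements(Ns + "WordDefinition")
                  .Select(e => e.Value)
                  .ToArray();
    }
}
```

Each returned string could then be passed to the ParseyMcParseface constructor in place of el[1].InnerText.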

You can also avoid manually retrieving and then parsing the XML by adding a reference to that web service.

Solution 2.0

I made a small application that does this and then parses the definitions. It is hosted on GitHub (it is too big to post in full on StackOverflow):

public enum WordTypes
{
    Noun,
    Verb,
    Adjective,
    Adverb,
    Unknown
}

public class Definition
{
    public Definition()
    {
        Synonyms = new List<string>();
        Anotnyms = new List<string>();
    }
    public WordTypes WordType { get; set; }
    public string DefinitionText { get; set; }
    public List<string> Synonyms { get; set; }
    public List<string> Anotnyms { get; set; }

}

static class DefinitionParser
{
    public static List<Definition> Parse(string wordDefinition)
    {
        var wordDefinitionLines = wordDefinition.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1)
            .Select(x => x.Trim())
            .ToList();

        var flatenedList = MakeLists(wordDefinitionLines).SelectMany(x => x).ToList();

        var result = new List<Definition>();
        foreach (var wd in flatenedList)
        {
            var foundMatch = Regex.Match(wd, @"^(?<matchType>adv|adj|v|n){0,1}\s*(\d*): (?<definition>[\w\s;""',\.\(\)\!\-]+)(?<extraInfoSyns>\[syn: ((?<wordSyn>\{[\w\s\-]+\})|(?:[,\ ]))*\]){0,1}\s*(?<extraInfoAnts>\[ant: ((?<wordAnt>\{[\w\s-]+\})|(?:[,\ ]))*\]){0,1}");

            var def = new Definition();

            if (foundMatch.Groups["matchType"].Success)
            {
                var matchType = foundMatch.Groups["matchType"];
                def.WordType = DefinitionTypeToEnum(matchType.Value);
            }

            if (foundMatch.Groups["definition"].Success)
            {
                var definition = foundMatch.Groups["definition"];
                def.DefinitionText = definition.Value;
            }

            if (foundMatch.Groups["extraInfoSyns"].Success && foundMatch.Groups["wordSyn"].Success)
            {
                foreach (Capture capture in foundMatch.Groups["wordSyn"].Captures)
                {
                    def.Synonyms.Add(capture.Value.Trim('{','}'));
                }
            }

            if (foundMatch.Groups["extraInfoAnts"].Success && foundMatch.Groups["wordAnt"].Success)
            {
                foreach (Capture capture in foundMatch.Groups["wordAnt"].Captures)
                {
                    def.Anotnyms.Add(capture.Value.Trim('{', '}'));
                }
            }

            result.Add(def);
        }
        return result;
    }

    private static List<List<string>> MakeLists(IEnumerable<string> text)
    {
        List<List<string>> types = new List<List<string>>();
        int i = -1;
        int j = 0;
        foreach (var line in text)
        {
            // New type (Noun, Verb, Adj.)
            if (Regex.IsMatch(line, "^(adj|v|n|adv){1}\\s\\d*"))
            {
                types.Add(new List<string> { line });
                i++;
                j = 0;
            }
            // New definition in the previous type
            else if (Regex.IsMatch(line, "^\\d+"))
            {
                j++;
                types[i].Add(line);
            }
            // New line of the same definition
            else
            {
                types[i][j] = types[i][j] + " " + line;
            }
        }

        return types;
    }

    private static WordTypes DefinitionTypeToEnum(string input)
    {
        switch (input)
        {
            case "adj":
                return WordTypes.Adjective;
            case "adv":
                return WordTypes.Adverb;
            case "n":
                return WordTypes.Noun;
            case "v":
                return WordTypes.Verb;
            default:
                return WordTypes.Unknown;
        }
    }
}

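Notes:

  • This should work as expected.
  • Parsing free text is unreliable.
  • You should import the service reference (as mentioned in the other answer) instead of parsing the XML manually.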
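To illustrate the second note: the MakeLists helper above indexes types[i] while i is still -1 if the text does not begin with a part-of-speech header, which throws. Below is a minimal, more defensive sketch of the same grouping step. The name SenseGrouper and the choice to skip text that precedes the first header are assumptions made for this example, not part of the original application.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SenseGrouper
{
    // A header line starts a new part of speech, e.g. "n 1: ..." or "adj 1: ...".
    static readonly Regex Header = new Regex(@"^(adj|adv|n|v)\s+\d+:");
    // A numbered line starts a new sense within the current part of speech, e.g. "2: ...".
    static readonly Regex Numbered = new Regex(@"^\d+:");

    // Groups lines into one list per part of speech, one string per sense.
    public static List<List<string>> Group(IEnumerable<string> lines)
    {
        var groups = new List<List<string>>();
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0) continue;

            if (Header.IsMatch(line))
            {
                groups.Add(new List<string> { line });
            }
            else if (groups.Count == 0)
            {
                // Assumption: text before the first header (such as the
                // headword line) is skipped instead of causing an exception.
                continue;
            }
            else if (Numbered.IsMatch(line))
            {
                groups[groups.Count - 1].Add(line);
            }
            else
            {
                // Continuation of the most recent sense.
                var current = groups[groups.Count - 1];
                current[current.Count - 1] += " " + line;
            }
        }
        return groups;
    }
}
```

This is still free-text parsing, so it only narrows one failure mode; unusual formatting in other dictionary entries can still produce wrong groupings.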

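Alexander Petrov's answer would be perfect for you, except that you are dealing with a wonky XML schema. If WordNet is a legitimate outfit, they should redesign the schema to remove the nested WordDefinition element and add new elements for the basic parts of a definition.

This quick-and-dirty solution works for the specific test case you provided, but it relies on many assumptions about the nature of the text. It also uses string manipulation and regular expressions, which are considered inefficient, so it may be too slow and error-prone for your requirements. If you reframed the question as a string-manipulation problem, you might receive a better solution for this task. But the right solution is to get a better XML schema.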

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;

namespace DefinitionTest
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Definition> definitions = new List<Definition>();

            // The starting point after your web service call.
            string responseFromServer = EmulateWebService();

            // Load the string into this object in order to parse the xml.
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(responseFromServer);

            XmlNodeList elemList = doc.GetElementsByTagName("WordDefinition");
            for (int i = 0; i < elemList.Count; i++)
            {
                XmlNode def = elemList[i];

                // We only want WordDefinition elements that have just one child which is the content we need.
                // Any WordDefinition that has zero children or more than one child is either empty or a parent element.
                if (def.ChildNodes.Count == 1)
                {
                    Console.WriteLine(string.Format("Content of WordDefinition {0}", i));
                    Console.WriteLine();
                    Console.WriteLine(def.InnerXml);
                    Console.WriteLine();

                    definitions.Add(ParseWordDefinition(def.InnerXml));

                    foreach (Definition dd in definitions)
                    {
                        Console.WriteLine(string.Format("Parsed Word Definition for \"{0}\"", dd.wordDefined));
                        Console.WriteLine();
                        foreach (Def d in dd.Defs)
                        {
                            string type = string.Empty;
                            switch (d.type)
                            {
                                case "a":
                                    type = "Adjective";
                                    break;
                                case "n":
                                    type = "Noun";
                                    break;
                                case "v":
                                    type = "Verb";
                                    break;
                                default:
                                    type = "";
                                    break;
                            }
                            Console.WriteLine(string.Format("Type \"{0}\"", type));
                            Console.WriteLine();
                            Console.WriteLine(string.Format("\tDefinition \"{0}\"", d.text));
                            Console.WriteLine();
                            if (d.Synonym != null && d.Synonym.Count > 0)
                            {
                                Console.WriteLine("\tSynonyms");
                                foreach (string syn in d.Synonym)
                                    Console.WriteLine("\t\t" + syn);
                            }
                        }
                    }
                }
            }
        }

        static string EmulateWebService()
        {
            string result = string.Empty;

            // The "definition.xml" file is a copy of the test data you provided.
            using (StreamReader reader = new StreamReader(@"c:\projects\definitiontest\definitiontest\definition.xml"))
            {
                result = reader.ReadToEnd();
            }
            return result;
        }

        static Definition ParseWordDefinition(string xmlDef)
        {
            // Replace any carriage return/line feed characters with spaces.
            string oneLine = xmlDef.Replace(System.Environment.NewLine, " ");

            // Squeeze internal white space.
            string squeezedLine = Regex.Replace(oneLine, @"\s{2,}", " ");

            // Assumption 1: The first word in the string is always the word being defined.
            string[] wordAndDefs = squeezedLine.Split(new char[] { ' ' }, StringSplitOptions.None);
            string wordDefined = wordAndDefs[0];
            string allDefinitions = string.Join(" ", wordAndDefs, 1, wordAndDefs.Length - 1);

            Definition parsedDefinition = new Definition();
            parsedDefinition.wordDefined = wordDefined;
            parsedDefinition.Defs = new List<Def>();

            string type = string.Empty;

            // Assumption 2: All definitions are delimited by a type letter, a number and a ':' character.
            string[] subDefinitions = Regex.Split(allDefinitions, @"(n|v|a){0,1}\s\d{1,}:");
            foreach (string definitionPart in subDefinitions)
            {
                if (string.IsNullOrEmpty(definitionPart))
                    continue;

                if (definitionPart == "n" || definitionPart == "v" || definitionPart == "a")
                {
                    type = definitionPart;
                }
                else
                {
                    Def def = new Def();
                    def.type = type;

                    // Assumption 3. Synonyms always use the [syn: {..},... ] pattern.
                    string realDef = (Regex.Split(definitionPart, @"\[\s*syn:"))[0];
                    def.text = realDef;

                    MatchCollection syns = Regex.Matches(definitionPart, @"\{([a-zA-Z\s]{1,})\}");
                    if (syns.Count > 0)
                        def.Synonym = new List<string>();

                    foreach (Match match in syns)
                    {
                        string s = match.Groups[0].Value;
                        // Group 0 is the whole match, which includes the braces,
                        // so remove them here.
                        def.Synonym.Add(s.Replace('{', ' ').Replace('}', ' ').Trim());
                    }
                    parsedDefinition.Defs.Add(def);
                }
            }
            return parsedDefinition;
        }
    }

    public class Def
    {
        // Moved your type from Definition to Def, since it made more sense to me.
        public string type { get; set; } // single character: n or v or a 
        public string text { get; set; }
        // Changed your synonym definition here.
        public List<string> Synonym { get; set; }
    }

    public class Definition
    {
        public string wordDefined { get; set; }
        // Changed Def to Defs.
        public List<Def> Defs { get; set; }
    }
}

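Why do it by hand? Let's do everything automatically, since we are programmers!

Right-click the project and select "Add Service Reference".
Put http://services.aonaware.com/DictService/DictService.asmx into the Address field.
Set the desired namespace.
You can also specify additional settings by clicking the "Advanced" button.
Click the OK button.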

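A set of classes for working with the service will be generated.
Then just use those classes.

Note that the settings required to use the service will appear in your application's App.config or Web.config. We use them next.

An example of using these classes (do not forget to specify the namespace to use):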

var client = new DictServiceSoapClient("DictServiceSoap");
var wordDefinition = client.DefineInDict("wn", "abandon");

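That's all!

In the DictServiceSoapClient constructor, we specify the name of the endpoint configuration that is used for the binding.

In wordDefinition we have the result of the request. Let's get the information from it: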

Console.WriteLine(wordDefinition.Word);
Console.WriteLine();

foreach (var definition in wordDefinition.Definitions)
{
    Console.WriteLine("Word: " + definition.Word);
    Console.WriteLine("Word Definition: " + definition.WordDefinition);

    Console.WriteLine("Id: " + definition.Dictionary.Id);
    Console.WriteLine("Name: " + definition.Dictionary.Name);
}
