
How to get an XPath from an XmlNode instance

Could someone supply some code that gets the XPath of a System.Xml.XmlNode instance?

Thanks!

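Okay, I couldn't resist having a go at it. It only works for attributes and elements, but hey... what can you expect in 15 minutes :) There may well be a cleaner way of doing it, too.

Including the index on every element (particularly the root one!) is superfluous, but it's easier than trying to work out whether there would be any ambiguity otherwise.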

using System;
using System.Text;
using System.Xml;

class Test
{
    static void Main()
    {
        string xml = @"
<root>
  <foo />
  <foo>
     <bar attr='value'/>
     <bar other='va' />
  </foo>
  <foo><bar /></foo>
</root>";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode("//@attr");
        Console.WriteLine(FindXPath(node));
        Console.WriteLine(doc.SelectSingleNode(FindXPath(node)) == node);
    }

    static string FindXPath(XmlNode node)
    {
        StringBuilder builder = new StringBuilder();
        while (node != null)
        {
            switch (node.NodeType)
            {
                case XmlNodeType.Attribute:
                    builder.Insert(0, "/@" + node.Name);
                    node = ((XmlAttribute) node).OwnerElement;
                    break;
                case XmlNodeType.Element:
                    int index = FindElementIndex((XmlElement) node);
                    builder.Insert(0, "/" + node.Name + "[" + index + "]");
                    node = node.ParentNode;
                    break;
                case XmlNodeType.Document:
                    return builder.ToString();
                default:
                    throw new ArgumentException("Only elements and attributes are supported");
            }
        }
        throw new ArgumentException("Node was not in a document");
    }

    static int FindElementIndex(XmlElement element)
    {
        XmlNode parentNode = element.ParentNode;
        if (parentNode is XmlDocument)
        {
            return 1;
        }
        XmlElement parent = (XmlElement) parentNode;
        int index = 1;
        foreach (XmlNode candidate in parent.ChildNodes)
        {
            if (candidate is XmlElement && candidate.Name == element.Name)
            {
                if (candidate == element)
                {
                    return index;
                }
                index++;
            }
        }
        throw new ArgumentException("Couldn't find element within parent");
    }
}

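Jon is correct that any number of XPath expressions will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields a specific node is a chain of node tests that use the node's position in the predicate, e.g.: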

/node()[1]/node()[2]/node()[6]/node()[1]/node()[2]

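Obviously, this expression doesn't use element names, but if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (because attributes are not child nodes and have no position; you can only find them by name), but it will find all other node types.

To build this expression, you need to write a method that returns a node's position among its parent's child nodes, because XmlNode doesn't expose that as a property: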

static int GetNodePosition(XmlNode child)
{
   for (int i=0; i<child.ParentNode.ChildNodes.Count; i++)
   {
       if (child.ParentNode.ChildNodes[i] == child)
       {
          // tricksy XPath, not starting its positions at 0 like a normal language
          return i + 1;
       }
   }
   throw new InvalidOperationException("Child node somehow not found in its parent's ChildNodes property.");
}

(There's probably a more elegant way of doing this with LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.)
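For what it's worth, here is one way the LINQ variant hinted at above might look. This is only a sketch, not part of the original answer: the class name NodePositionSketch is made up for illustration, and it assumes the node is attached to a parent.

```csharp
using System;
using System.Linq;
using System.Xml;

static class NodePositionSketch
{
    // Returns the 1-based position of a node among its parent's child nodes.
    public static int GetNodePosition(XmlNode child)
    {
        if (child.ParentNode == null)
            throw new ArgumentException("Node has no parent.", nameof(child));

        // XmlNodeList only implements the non-generic IEnumerable,
        // so Cast<XmlNode>() is needed before other LINQ operators.
        int preceding = child.ParentNode.ChildNodes
            .Cast<XmlNode>()
            .TakeWhile(n => !ReferenceEquals(n, child))
            .Count();

        // XPath positions start at 1, not 0.
        return preceding + 1;
    }
}
```

It counts the siblings that come before the node and adds one, which is the same result the for loop above produces.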

Then you can write a recursive method like this:

static string GetXPathToNode(XmlNode node)
{
    if (node.NodeType == XmlNodeType.Attribute)
    {
        // attributes have an OwnerElement, not a ParentNode; also they have
        // to be matched by name, not found by position
        return String.Format(
            "{0}/@{1}",
            GetXPathToNode(((XmlAttribute)node).OwnerElement),
            node.Name
            );            
    }
    if (node.ParentNode == null)
    {
        // the only node with no parent is the root node, which has no path
        return "";
    }
    // the path to a node is the path to its parent, plus "/node()[n]", where 
    // n is its position among its siblings.
    return String.Format(
        "{0}/node()[{1}]",
        GetXPathToNode(node.ParentNode),
        GetNodePosition(node)
        );
}

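As you can see, I hacked in a way for it to find attributes too.

Jon slipped in with his version while I was writing mine. Something about his code is going to make me rant a bit now, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm about to make is an important one for anyone who works with XML to think about.

I suspect Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to is structured this way. You can spot these developers because they use the terms "node" and "element" interchangeably. This leads them to solutions that treat all other node types as special cases. (I was one of these people myself for a very long time.)

This feels like a simplifying assumption while you're making it. But it isn't. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() test in XPath) that are specifically designed to treat all node types generically.

There's a red flag in Jon's code that would make me query it in a code review even if I didn't know what the requirements were, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"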
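To make the element-versus-node distinction concrete, here is a small sketch. It is not from the original answer; the class name and the sample XML are made up for illustration. It shows node() addressing a comment by position where an element-only test finds nothing.

```csharp
using System;
using System.Xml;

static class NodeTestSketch
{
    // Hypothetical sample: one element, one comment and one text node under <root>.
    const string Sample = "<root><a/><!-- note -->text</root>";

    // node() matches every child node type, so position 2 is the comment.
    public static XmlNodeType SecondChildType()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);
        return doc.SelectSingleNode("/root/node()[2]").NodeType;
    }

    // * matches elements only; <root> has a single element child,
    // so there is no second element and the query returns null.
    public static bool HasSecondElement()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);
        return doc.SelectSingleNode("/root/*[2]") != null;
    }
}
```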

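I know, old post, but the version I liked the most (the one with names) was flawed: when a parent node has child nodes with different names, it stopped counting the index as soon as it found the first non-matching node name.

Here is my fixed version: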

/// <summary>
/// Gets the X-Path to a given Node
/// </summary>
/// <param name="node">The Node to get the X-Path from</param>
/// <returns>The X-Path of the Node</returns>
public string GetXPathToNode(XmlNode node)
{
    if (node.NodeType == XmlNodeType.Attribute)
    {
        // attributes have an OwnerElement, not a ParentNode; also they have             
        // to be matched by name, not found by position             
        return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
    }
    if (node.ParentNode == null)
    {
        // the only node with no parent is the root node, which has no path
        return "";
    }

    // Get the Index
    int indexInParent = 1;
    XmlNode siblingNode = node.PreviousSibling;
    // Loop thru all Siblings
    while (siblingNode != null)
    {
        // Increase the Index if the Sibling has the same Name
        if (siblingNode.Name == node.Name)
        {
            indexInParent++;
        }
        siblingNode = siblingNode.PreviousSibling;
    }

    // the path to a node is the path to its parent, plus "/name[n]", where n is its position among its same-named siblings.
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent);
}

Here's a simple method that I've used; it worked for me.

    static string GetXpath(XmlNode node)
    {
        if (node.Name == "#document")
            return String.Empty;
        return GetXpath(node.SelectSingleNode("..")) + "/" +  (node.NodeType == XmlNodeType.Attribute ? "@":String.Empty) + node.Name;
    }

My 10p worth is a hybrid of Robert's and Corey's answers. I can only claim credit for the actual typing of the extra lines of code.

    private static string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // attributes have an OwnerElement, not a ParentNode; also they have
            // to be matched by name, not found by position
            return String.Format(
                "{0}/@{1}",
                GetXPathToNode(((XmlAttribute)node).OwnerElement),
                node.Name
                );
        }
        if (node.ParentNode == null)
        {
            // the only node with no parent is the root node, which has no path
            return "";
        }
        //get the index
        int iIndex = 1;
        XmlNode xnIndex = node;
        while (xnIndex.PreviousSibling != null) { iIndex++; xnIndex = xnIndex.PreviousSibling; }
        // the path to a node is the path to its parent, plus "/node()[n]", where 
        // n is its position among its siblings.
        return String.Format(
            "{0}/node()[{1}]",
            GetXPathToNode(node.ParentNode),
            iIndex
            );
    }

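There's no such thing as "the" xpath of a node. For any given node there may well be many xpath expressions that match it.

You can probably work up the tree to build an expression that will match it, taking into account the index of particular elements etc., but it's not going to be terribly nice code.

Why do you need this? There may be a better solution.

If you do this, you will get a path made of node names plus positions (useful when you have nodes with the same name), like this: "/Service[1]/System[1]/Group[1]/Folder[2]/File[2]"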

public string GetXPathToNode(XmlNode node)
{         
    if (node.NodeType == XmlNodeType.Attribute)
    {             
        // attributes have an OwnerElement, not a ParentNode; also they have             
        // to be matched by name, not found by position             
        return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
    }
    if (node.ParentNode == null)
    {             
        // the only node with no parent is the root node, which has no path
        return "";
    }

    //get the index
    int iIndex = 1;
    XmlNode xnIndex = node;
    while (xnIndex.PreviousSibling != null && xnIndex.PreviousSibling.Name == xnIndex.Name)
    {
         iIndex++;
         xnIndex = xnIndex.PreviousSibling; 
    }

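    // the path to a node is the path to its parent, plus "/name[n]", where n counts
    // the immediately preceding siblings with the same name (the loop above stops
    // at the first previous sibling with a different name).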
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, iIndex);
}

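How about using extension methods? ;) My version (building on others' work) uses the syntax name[index], with the index omitted if the element has no same-named siblings. The loop that finds the element's index lives outside, in an independent routine (also an extension method).

Just paste the following into any utility class (or the main Program class):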

static public int GetRank( this XmlNode node )
{
    // return 0 if unique, else return position 1...n in siblings with same name
    try
    {
        if( node is XmlElement ) 
        {
            int rank = 1;
            bool alone = true, found = false;

            foreach( XmlNode n in node.ParentNode.ChildNodes )
                if( n.Name == node.Name ) // sibling with same name
                {
                    if( n.Equals(node) )
                    {
                        if( ! alone ) return rank; // no need to continue
                        found = true;
                    }
                    else
                    {
                        if( found ) return rank; // no need to continue
                        alone = false;
                        rank++;
                    }
                }

        }
    }
    catch{}
    return 0;
}

static public string GetXPath( this XmlNode node )
{
    try
    {
        if( node is XmlAttribute )
            return String.Format( "{0}/@{1}", (node as XmlAttribute).OwnerElement.GetXPath(), node.Name );

        if( node is XmlText || node is XmlCDataSection )
            return node.ParentNode.GetXPath();

        if( node.ParentNode == null )   // the only node with no parent is the root node, which has no path
            return "";

        int rank = node.GetRank();
        if( rank == 0 ) return String.Format( "{0}/{1}",        node.ParentNode.GetXPath(), node.Name );
        else            return String.Format( "{0}/{1}[{2}]",   node.ParentNode.GetXPath(), node.Name, rank );
    }
    catch{}
    return "";
}   

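I wrote some VBA for Excel to do this for a work project. It outputs tuples of an XPath and the associated text from an element or attribute. The purpose was to let business analysts identify and map some XML. I appreciate that this is a C# forum, but I thought it might be of interest.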

Sub Parse2(oSh As Long, inode As IXMLDOMNode, Optional iXstring As String = "", Optional indexes)


Dim chnode As IXMLDOMNode
Dim attr As IXMLDOMAttribute
Dim oXString As String
Dim chld As Long
Dim idx As Variant
Dim addindex As Boolean
chld = 0
idx = 0
addindex = False


'determine the node type:
Select Case inode.NodeType

    Case NODE_ELEMENT
        If inode.ParentNode.NodeType = NODE_DOCUMENT Then 'This gets the root node name but ignores all the namespace attributes
            oXString = iXstring & "//" & fp(inode.nodename)
        Else

            'Need to deal with indexing. Where an element has siblings with the same nodeName,it needs to be indexed using [index], e.g swapstreams or schedules

            For Each chnode In inode.ParentNode.ChildNodes
                If chnode.NodeType = NODE_ELEMENT And chnode.nodename = inode.nodename Then chld = chld + 1
            Next chnode

            If chld > 1 Then '//inode has siblings of the same nodeName, so needs to be indexed
                'Lookup the index from the indexes array
                idx = getIndex(inode.nodename, indexes)
                addindex = True
            Else
            End If

            'build the XString
            oXString = iXstring & "/" & fp(inode.nodename)
            If addindex Then oXString = oXString & "[" & idx & "]"

            'If type is element then check for attributes
            For Each attr In inode.Attributes
                'If the element has attributes then extract the data pair XString + Element.Name, @Attribute.Name=Attribute.Value
                Call oSheet(oSh, oXString & "/@" & attr.Name, attr.Value)
            Next attr

        End If

    Case NODE_TEXT
        'build the XString
        oXString = iXstring
        Call oSheet(oSh, oXString, inode.NodeValue)

    Case NODE_ATTRIBUTE
    'Do nothing
    Case NODE_CDATA_SECTION
    'Do nothing
    Case NODE_COMMENT
    'Do nothing
    Case NODE_DOCUMENT
    'Do nothing
    Case NODE_DOCUMENT_FRAGMENT
    'Do nothing
    Case NODE_DOCUMENT_TYPE
    'Do nothing
    Case NODE_ENTITY
    'Do nothing
    Case NODE_ENTITY_REFERENCE
    'Do nothing
    Case NODE_INVALID
    'do nothing
    Case NODE_NOTATION
    'do nothing
    Case NODE_PROCESSING_INSTRUCTION
    'do nothing
End Select

'Now call Parser2 on each of inode's children.
If inode.HasChildNodes Then
    For Each chnode In inode.ChildNodes
        Call Parse2(oSh, chnode, oXString, indexes)
    Next chnode
Set chnode = Nothing
Else
End If

End Sub

The counting of elements is managed with:

Function getIndex(tag As Variant, indexes) As Variant
'Function to get the latest index for an xml tag from the indexes array
'indexes array is passed from one parser function to the next up and down the tree

Dim i As Integer
Dim n As Integer

If IsArrayEmpty(indexes) Then
    ReDim indexes(1, 0)
    indexes(0, 0) = "Tag"
    indexes(1, 0) = "Index"
Else
End If
For i = 0 To UBound(indexes, 2)
    If indexes(0, i) = tag Then
        'tag found, increment and return the index then exit
        'also destroy all recorded tag names BELOW that level
        indexes(1, i) = indexes(1, i) + 1
        getIndex = indexes(1, i)
        ReDim Preserve indexes(1, i) 'should keep all tags up to i but remove all below it
        Exit Function
    Else
    End If
Next i

'tag not found so add the tag with index 1 at the end of the array
n = UBound(indexes, 2)
ReDim Preserve indexes(1, n + 1)
indexes(0, n + 1) = tag
indexes(1, n + 1) = 1
getIndex = 1

End Function

Another solution to your problem could be to "mark" the XmlNodes you will want to identify later with a custom attribute:

var id = _currentNode.OwnerDocument.CreateAttribute("some_id");
id.Value = Guid.NewGuid().ToString();
_currentNode.Attributes.Append(id);

You can store the value in a Dictionary, for example. Later you can identify the node with an xpath query:

newOrOldDocument.SelectSingleNode(string.Format("//*[contains(@some_id,'{0}')]", id.Value));

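I know this is not a direct answer to your question, but it can help if the reason you want the xpath of a node is to have a way of "reaching" the node again after you have lost the reference to it in code.

This also overcomes the problems that arise when elements are added to or moved within the document, which can break the xpath (or the indexes suggested in other answers).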
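Putting the tagging idea together, a minimal sketch might look like the following. The helper names Tag and Find are made up for illustration, and the attribute name some_id is taken from the snippet above. It matches the attribute with = rather than contains(), which is a stricter comparison than the original query uses.

```csharp
using System;
using System.Xml;

static class NodeTagSketch
{
    // Marks an element with a fresh GUID and returns the GUID for later lookup.
    public static string Tag(XmlElement element)
    {
        string id = Guid.NewGuid().ToString();
        element.SetAttribute("some_id", id);
        return id;
    }

    // Finds the tagged element again; returns null if no element carries the id.
    // A GUID string contains no quote characters, so embedding it in the
    // XPath string literal is safe here.
    public static XmlNode Find(XmlDocument doc, string id)
    {
        return doc.SelectSingleNode(string.Format("//*[@some_id='{0}']", id));
    }
}
```

Because the lookup does not depend on position, it should keep working after siblings are inserted or the element is moved, as long as the attribute stays on the element.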

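I found that none of the above worked with XDocument, so I wrote my own code to support XDocument, using recursion. I think this code handles multiple identical nodes better than some of the other code here, because it first tries to go as deep into the XML path as it can and then backs up to build only what is needed. So if you have /home/white/bob and /home/white/mike and want to create /home/white/bob/garage, the code will know how to create it. However, I didn't want to mess with predicates or wildcards, so I explicitly disallowed those; adding support for them would be easy, though.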

Private Sub NodeItterate(XDoc As XElement, XPath As String)
    'get the deepest path
    Dim nodes As IEnumerable(Of XElement)

    nodes = XDoc.XPathSelectElements(XPath)

    'if it doesn't exist, try the next shallow path
    If nodes.Count = 0 Then
        NodeItterate(XDoc, XPath.Substring(0, XPath.LastIndexOf("/")))
        'by this time all the required parent elements will have been constructed
        Dim ParentPath As String = XPath.Substring(0, XPath.LastIndexOf("/"))
        Dim ParentNode As XElement = XDoc.XPathSelectElement(ParentPath)
        Dim NewElementName As String = XPath.Substring(XPath.LastIndexOf("/") + 1, XPath.Length - XPath.LastIndexOf("/") - 1)
        ParentNode.Add(New XElement(NewElementName))
    End If

    'if we find there are more than 1 elements at the deepest path we have access to, we can't proceed
    If nodes.Count > 1 Then
        Throw New ArgumentOutOfRangeException("There are too many paths that match your expression.")
    End If

    'if there is just one element, we can proceed
    If nodes.Count = 1 Then
        'just proceed
    End If

End Sub

Public Sub CreateXPath(ByVal XDoc As XElement, ByVal XPath As String)

    If XPath.Contains("//") Or XPath.Contains("*") Or XPath.Contains(".") Then
        Throw New ArgumentException("Can't create a path based on searches, wildcards, or relative paths.")
    End If

    If Regex.IsMatch(XPath, "[\[\]()@='<>|]") Then
        Throw New ArgumentException("Can't create a path based on predicates.")
    End If

    'we will process this recursively.
    NodeItterate(XDoc, XPath)

End Sub
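For the original question (getting a path from a node) on the LINQ to XML side, a sketch along the following lines might work. It is not from the answer above: the class and method names are hypothetical, and it assumes elements without namespaces, since it writes only local names into the path.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XElementPathSketch
{
    // Builds an absolute path such as /root[1]/foo[2]/bar[1] for an XElement.
    public static string GetXPath(XElement element)
    {
        var steps = element
            .AncestorsAndSelf()   // the element first, then its ancestors up to the root
            .Reverse()            // root first
            .Select(e => string.Format(
                "/{0}[{1}]",
                e.Name.LocalName,
                // 1-based index among preceding siblings with the same name
                e.ElementsBeforeSelf(e.Name).Count() + 1));

        return string.Concat(steps);
    }
}
```

As in the first answer, every step carries an index even where it is redundant, which avoids having to work out whether a step would be ambiguous.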

This is even easier:

 ''' <summary>
    ''' Gets the full XPath of a single node.
    ''' </summary>
    ''' <param name="node"></param>
    ''' <returns></returns>
    ''' <remarks></remarks>
    Private Function GetXPath(ByVal node As Xml.XmlNode) As String
        Dim temp As String
        Dim sibling As Xml.XmlNode
        Dim previousSiblings As Integer = 1

        'I don't want to know that it was a generic document
        If node.Name = "#document" Then Return ""

        'Prime it
        sibling = node.PreviousSibling
        'Percolate up, counting this node's preceding siblings.
        While sibling IsNot Nothing
            'Only count if the sibling has the same name as this node
            If sibling.Name = node.Name Then
                previousSiblings += 1
            End If
            sibling = sibling.PreviousSibling
        End While

        'Mark this node's index, if it has one
        ' Also mark the index to 1 or the default if it does have a sibling just no previous.
        temp = node.Name + IIf(previousSiblings > 0 OrElse node.NextSibling IsNot Nothing, "[" + previousSiblings.ToString() + "]", "").ToString()

        If node.ParentNode IsNot Nothing Then
            Return GetXPath(node.ParentNode) + "/" + temp
        End If

        Return temp
    End Function

 public static string GetFullPath(this XmlNode node)
        {
            if (node.ParentNode == null)
            {
                return "";
            }
            else
            {
                return $"{GetFullPath(node.ParentNode)}\\{node.ParentNode.Name}";
            }
        }

I had to do this recently. Only elements needed to be considered. This is what I came up with:

    private string GetPath(XmlElement el)
    {
        List<string> pathList = new List<string>();
        XmlNode node = el;
        while (node is XmlElement)
        {
            pathList.Add(node.Name);
            node = node.ParentNode;
        }
        pathList.Reverse();
        string[] nodeNames = pathList.ToArray();
        return String.Join("/", nodeNames);
    }

