简体   繁体   English

在 VB.net 中使用 XMLReader 读取大型 XML 文件

[英]Reading large XML file using XMLReader in VB.net

I have a XML file size 35 GB.我有一个 35 GB 的 XML 文件。 I tried to load this file using xmldocument and got out of memory exception.我尝试使用 xmldocument 加载此文件并出现内存不足异常。 So, using xmlreader to parse the xml data to load it to database.因此,使用 xmlreader 解析 xml 数据以将其加载到数据库。 But, I am not able to read the child nodes within a parent node.但是,我无法读取父节点中的子节点。

Example XML file content: XML 文件内容示例:

File name : wcproduction.xml文件名:wcproduction.xml

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsd:schema targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
    <xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd"/>
    <xsd:element name="wcproduction">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="api_st_cde" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="api_cnty_cde" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="api_well_idn" type="sqltypes:int" nillable="1"/>
                <xsd:element name="pool_idn" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prodn_mth" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="prodn_yr" type="sqltypes:int" nillable="1"/>
                <xsd:element name="ogrid_cde" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prd_knd_cde" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="2"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="eff_dte" type="sqltypes:datetime" nillable="1"/>
                <xsd:element name="amend_ind" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="1"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="c115_wc_stat_cde" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="1"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="prod_amt" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prodn_day_num" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="mod_dte" type="sqltypes:datetime" nillable="1"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
<wcproduction xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <api_st_cde>30</api_st_cde>
    <api_cnty_cde>5</api_cnty_cde>
    <api_well_idn>20178</api_well_idn>
    <pool_idn>10540</pool_idn>
    <prodn_mth>7</prodn_mth>
    <prodn_yr>1973</prodn_yr>
    <ogrid_cde>12437</ogrid_cde>
    <prd_knd_cde>G </prd_knd_cde>
    <eff_dte>1973-07-31T00:00:00</eff_dte>
    <amend_ind>N</amend_ind>
    <c115_wc_stat_cde>F</c115_wc_stat_cde>
    <prod_amt>53612</prod_amt>
    <prodn_day_num>99</prodn_day_num>
    <mod_dte>2015-04-07T07:31:00.173</mod_dte>
</wcproduction>
</root>

VB.net code that is try to read parent node wcproduction and its child nodes ( api_st_cde, api_cnty_cde, ...)试图读取父节点 wcproduction 及其子节点( api_st_cde, api_cnty_cde, ...)的 VB.net 代码

Dim settings As XmlReaderSettings = New XmlReaderSettings()
        settings.IgnoreWhitespace = True

        Using reader As XmlReader = XmlReader.Create("D:\\wcproduction.xml", settings)

            reader.ReadToFollowing("wcproduction")
            Do

                Dim inner As XmlReader = reader.ReadSubtree()
                Dim str As String = ""

                inner.ReadToDescendant("api_st_cde")
                str = inner.ReadInnerXml
                inner.ReadToDescendant("api_cnty_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("api_well_idn")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("pool_idn")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_mth")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_yr")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("ogrid_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prd_knd_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("eff_dte")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("amend_ind")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("c115_wc_stat_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prod_amt")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_day_num")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("mod_dte")
                str = str & ", " & inner.ReadInnerXml
                MsgBox(str)
                inner.Close()
            Loop While (reader.ReadToNextSibling("wcproduction"))

        End Using

I want to read and upload all nodes (wcproduction) and its child nodes to SQL server.我想读取所有节点(wcproduction)及其子节点并将其上传到 SQL 服务器。

Try following code which uses combination of XmlReader and xml linq尝试以下使用 XmlReader 和 xml linq 组合的代码

Imports System.Xml
Imports System.Xml.Linq
Module Module1
    Const FILENAME As String = "c:\temp\test.xml"
    Sub Main()
        Dim wcProdcutions As New List(Of WCProduction)
        Dim reader As XmlReader = XmlReader.Create(FILENAME)
        While (Not reader.EOF)
            If reader.Name <> "wcproduction" Then
                reader.ReadToFollowing("wcproduction")
            End If
            If Not reader.EOF Then
                Dim xWcproduction As XElement = XElement.ReadFrom(reader)
                Dim ns As XNamespace = xWcproduction.GetDefaultNamespace()
                Dim wcproduction As New WCProduction
                wcProdcutions.Add(wcproduction)

                wcproduction.api_st_cde = CType(xWcproduction.Element(ns + "api_st_cde"), Integer)
                wcproduction.api_well_idn = CType(xWcproduction.Element(ns + "api_well_idn"), Integer)
                wcproduction.pool_idn = CType(xWcproduction.Element(ns + "pool_idn"), Integer)
                wcproduction.prodn_mth = CType(xWcproduction.Element(ns + "prodn_mth"), Integer)
                wcproduction.prodn_yr = CType(xWcproduction.Element(ns + "prodn_yr"), Integer)
                wcproduction.ogrid_cde = CType(xWcproduction.Element(ns + "ogrid_cde"), Integer)
                wcproduction.prd_knd_cde = CType(xWcproduction.Element(ns + "prd_knd_cde"), String)
                wcproduction.eff_dte = CType(xWcproduction.Element(ns + "eff_dte"), DateTime)
                wcproduction.amend_ind = CType(xWcproduction.Element(ns + "amend_ind"), String)
                wcproduction.c115_wc_stat_cde = CType(xWcproduction.Element(ns + "c115_wc_stat_cde"), String)
                wcproduction.prod_amt = CType(xWcproduction.Element(ns + "prod_amt"), Integer)
                wcproduction.prodn_day_num = CType(xWcproduction.Element(ns + "prodn_day_num"), Integer)
                wcproduction.mod_dte = CType(xWcproduction.Element(ns + "mod_dte"), DateTime)

            End If

        End While
    End Sub

End Module
Public Class WCProduction
    Public api_st_cde As Integer
    Public api_cnty_cde As Integer
    Public api_well_idn As Integer
    Public pool_idn As Integer
    Public prodn_mth As Integer
    Public prodn_yr As Integer
    Public ogrid_cde As Integer
    Public prd_knd_cde As String
    Public eff_dte As DateTime
    Public amend_ind As String
    Public c115_wc_stat_cde As String
    Public prod_amt As Integer
    Public prodn_day_num As Integer
    Public mod_dte As DateTime
End Class

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM