
Nested loops to IDataReader

I have a program that writes a huge DataTable (2,000,000 to 70,000,000 rows, depending on the configuration) to a database using SqlBulkCopy.

I decided to replace the loop that populates this table with an IDataReader, because the number of rows often causes an OutOfMemoryException.

The table is populated like this:

// int[] firsts;
// string[] seconds;
// byte[] thirds;
var table = new DataTable();
foreach(var f in firsts)
{
    foreach(var s in seconds)
    {
        foreach(var t in thirds)
        {
            var row = table.NewRow();
            row[0] = f;
            row[1] = s;
            row[2] = t;
            table.Rows.Add(row);
        }
    }
    // here I also bulk load the table and clear it
}

So in my IDataReader class I will loop by index. This is my attempt:

class TableReader : IDataReader
{
    bool Eof = false;

    int FirstIndex;
    int SecondIndex;
    int ThirdIndex;

    //those are populated via constructor
    int[] firsts;
    string[] seconds;
    byte[] thirds;

    // this will be retrieved automatically via indexer
    object[] Values;

    public bool Read()
    {
        if(ThirdIndex != thirds.Length
            && SecondIndex < seconds.Length
            && FirstIndex < firsts.Length)
        {
            Values[0] = firsts[FirstIndex];
            Values[1] = seconds[SecondIndex];
            Values[2] = thirds[ThirdIndex++];
        }
        else if(SecondIndex != seconds.Length)
        {
            ThirdIndex  = 0;
            SecondIndex++;
        }
        else if(FirstIndex != firsts.Length)
        {
            SecondIndex = 0;
            FirstIndex++;
        }
        else
        {
            Eof = true;
        }
        return !Eof;
    }
}

I wrote this code using a while(true) loop with a break instead of the Eof flag, but I can't seem to figure out how to get it right.

Can anyone help?

This is actually possible if you implement IDataReader and use a "yield return" statement to provide the rows. IDataReader is a bit of a pain to implement, but it isn't complex at all. The TestDataReader code below can be adapted to load a terabyte's worth of data into the database without ever running out of memory.

  • I replaced the DataRow objects with a single object array that is reused for every row read.
  • Because there's no DataTable object to describe the columns, I had to do this myself by storing the data types and column names separately.
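The first point has a consequence worth spelling out: a yielded row is only valid until the enumerator advances. The following is a minimal sketch, using a hypothetical RowReuseDemo helper that is not part of the answer's code, of what would likely go wrong if the rows were buffered instead of consumed one at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RowReuseDemo
{
    // Hypothetical helper: yields the same object[] instance for every row,
    // mirroring the buffer-reuse idea described above.
    public static IEnumerable<object[]> SharedBufferRows(int[] firsts, string[] seconds)
    {
        var row = new object[2];
        foreach (var f in firsts)
        {
            foreach (var s in seconds)
            {
                row[0] = f;
                row[1] = s;
                yield return row; // the same array instance every time
            }
        }
    }

    // Safe: each row is read before the enumerator moves on.
    public static List<string> ConsumeImmediately(IEnumerable<object[]> rows)
    {
        var seen = new List<string>();
        foreach (var r in rows)
        {
            seen.Add(r[0] + ":" + r[1]);
        }
        return seen;
    }

    // Unsafe: ToList() stores many references to one array, so every
    // entry ends up showing the values of the last row produced.
    public static List<string> BufferThenConsume(IEnumerable<object[]> rows)
    {
        return rows.ToList().Select(r => r[0] + ":" + r[1]).ToList();
    }
}
```

A consumer that pulls one row at a time and reads its values before calling Read() again falls into the safe case; if rows ever need to be kept, cloning the array before yielding would be the usual workaround.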

class TestDataReader : IDataReader
{
    int[] firsts = { 1, 2, 3, 4 };
    string[] seconds = { "abc", "def", "ghi" };
    byte[] thirds = { 0x30, 0x31, 0x32 };

    // The data types of each column.
    Type[] dataTypes = { typeof(int), typeof(string), typeof(byte) };

    // The names of each column.
    string[] names = { "firsts", "seconds", "thirds" };

    // This function uses coroutines to turn the "push" approach into a "pull" approach.
    private IEnumerable<object[]> GetRows()
    {
        // Just re-use the same array.
        object[] row = new object[3];
        foreach (var f in firsts)
        {
            foreach (var s in seconds)
            {
                foreach (var t in thirds)
                {
                    row[0] = f;
                    row[1] = s;
                    row[2] = t;
                    yield return row;
                }
            }
            // here I also bulk load the table and clear it
        }
    }

    // Everything below basically wraps this.
    IEnumerator<object[]> rowProvider;

    public TestDataReader()
    {
        rowProvider = GetRows().GetEnumerator();
    }

    public object this[int i] { get { return GetValue(i); } }
    public object this[string name] { get { return GetValue(GetOrdinal(name)); } }

    public int Depth { get { return 0; } }
    public int FieldCount { get { return dataTypes.Length; } }
    public bool IsClosed { get { return false; } }
    public int RecordsAffected { get { return 0; } }

    // These don't really do anything.
    public void Close() { Dispose(); }
    public void Dispose() { rowProvider.Dispose(); }

    public string GetDataTypeName(int i) { return dataTypes[i].Name; }
    public Type GetFieldType(int i) { return dataTypes[i]; }

    // These functions get basic data types.
    public bool GetBoolean(int i) { return (bool) rowProvider.Current[i]; }
    public byte GetByte(int i) { return (byte) rowProvider.Current[i]; }
    public char GetChar(int i) { return (char) rowProvider.Current[i]; }
    public DateTime GetDateTime(int i) { return (DateTime) rowProvider.Current[i]; }
    public decimal GetDecimal(int i) { return (decimal) rowProvider.Current[i]; }
    public double GetDouble(int i) { return (double) rowProvider.Current[i]; }
    public float GetFloat(int i) { return (float) rowProvider.Current[i]; }
    public Guid GetGuid(int i) { return (Guid) rowProvider.Current[i]; }
    public short GetInt16(int i) { return (short) rowProvider.Current[i]; }
    public int GetInt32(int i) { return (int) rowProvider.Current[i]; }
    public long GetInt64(int i) { return (long) rowProvider.Current[i]; }
    public string GetString(int i) { return (string) rowProvider.Current[i]; }
    public object GetValue(int i) { return (object) rowProvider.Current[i]; }
    public string GetName(int i) { return names[i]; }

    public bool IsDBNull(int i)
    {
        object obj = rowProvider.Current[i];
        return obj == null || obj is DBNull;
    }

    // Looks up a field number given its name.
    public int GetOrdinal(string name)
    {
        return Array.FindIndex(names, x => x.Equals(name, StringComparison.OrdinalIgnoreCase));
    }

    // Populate "values" given the current row of data.
    public int GetValues(object[] values)
    {
        if (values == null)
        {
            return 0;
        }
        else
        {
            int len = Math.Min(values.Length, rowProvider.Current.Length);
            Array.Copy(rowProvider.Current, values, len);
            return len;
        }
    }

    // This reader only supports a single result set.
    public bool NextResult() { return false; }

    // Move to the next row.
    public bool Read() { return rowProvider.MoveNext(); }

    // Don't bother implementing these in any meaningful way.
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferoffset, int length) { throw new NotImplementedException(); }
    public long GetChars(int i, long fieldoffset, char[] buffer, int bufferoffset, int length) { throw new NotImplementedException(); }
    public IDataReader GetData(int i) { throw new NotImplementedException(); }
    public DataTable GetSchemaTable() { return null; }
}
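If you would rather keep the index-based approach from the question, the three indexes can be advanced like an odometer: bump the innermost index and carry into the outer ones when it wraps. The attempt in the question appears to return true from its carry branches without loading a new row, which is probably where it goes wrong. The sketch below is one possible fix, not the only one; IndexCursor is a hypothetical name and it covers only the Read() logic, not the full IDataReader interface.

```csharp
using System;

// Hypothetical cursor illustrating index-based iteration over three arrays.
// It produces rows in the same order as the original nested foreach loops.
class IndexCursor
{
    readonly int[] firsts;
    readonly string[] seconds;
    readonly byte[] thirds;

    int firstIndex;
    int secondIndex;
    int thirdIndex = -1; // positioned before the first row
    bool eof;

    // Reused for every row, like the shared array in TestDataReader.
    public object[] Values { get; } = new object[3];

    public IndexCursor(int[] firsts, string[] seconds, byte[] thirds)
    {
        this.firsts = firsts;
        this.seconds = seconds;
        this.thirds = thirds;
        // If any array is empty, the nested loops would produce no rows.
        eof = firsts.Length == 0 || seconds.Length == 0 || thirds.Length == 0;
    }

    public bool Read()
    {
        if (eof) return false;

        // Advance the innermost index; carry outward when an index wraps.
        if (++thirdIndex == thirds.Length)
        {
            thirdIndex = 0;
            if (++secondIndex == seconds.Length)
            {
                secondIndex = 0;
                if (++firstIndex == firsts.Length)
                {
                    eof = true;
                    return false;
                }
            }
        }

        // Every successful Read() loads a complete row.
        Values[0] = firsts[firstIndex];
        Values[1] = seconds[secondIndex];
        Values[2] = thirds[thirdIndex];
        return true;
    }
}
```

The same Read() body could stand in for rowProvider.MoveNext() in a full reader, with GetValue(i) returning Values[i]. Either way, the finished reader would then be handed to SqlBulkCopy.WriteToServer(IDataReader), which pulls rows one at a time instead of requiring a populated DataTable.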
