简体   繁体   中英

How can I use more than 2100 values in an IN clause using Dapper?

I have a List containing ids that I want to insert into a temp table using Dapper in order to avoid the SQL limit on parameters in the 'IN' clause.

So currently my code looks like this:

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
    using (var db = new SqlConnection(this.connectionString))
    {
        return db.Query<int>(
            @"SELECT a.animalID        
            FROM
            dbo.animalTypes [at]
            INNER JOIN animals [a] on a.animalTypeId = at.animalTypeId
            INNER JOIN edibleAnimals e on e.animalID = a.animalID
            WHERE
            at.animalId in @animalIds", new { animalIds }).ToList();
    }
}

The problem I need to solve is that when there are more than 2100 ids in the animalIds list then I get a SQL error "The incoming request has too many parameters. The server supports a maximum of 2100 parameters".

So now I would like to create a temp table populated with the animalIds passed into the method. Then I can join the animals table on the temp table and avoid having a huge "IN" clause.

I have tried various combinations of syntax but not got anywhere. This is where I am now:

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
    using (var db = new SqlConnection(this.connectionString))
    {
        db.Execute(@"SELECT INTO #tempAnmialIds @animalIds");

        return db.Query<int>(
            @"SELECT a.animalID        
            FROM
            dbo.animalTypes [at]
            INNER JOIN animals [a] on a.animalTypeId = at.animalTypeId
            INNER JOIN edibleAnimals e on e.animalID = a.animalID
            INNER JOIN #tempAnmialIds tmp on tmp.animalID = a.animalID).ToList();
    }
}

I can't get the SELECT INTO working with the list of IDs. Am I going about this the wrong way maybe there is a better way to avoid the "IN" clause limit.

I do have a backup solution in that I can split the incoming list of animalIDs into blocks of 1000 but I've read that the large "IN" clause sufferes a performance hit and joining a temp table will be more efficient and it also means I don;t need extra 'splitting' code to batch up the ids in to blocks of 1000.

Ok, here's the version you want. I'm adding this as a separate answer, as my first answer using SP/TVP utilizes a different concept.

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
  using (var db = new SqlConnection(this.connectionString))
  {
    // This Open() call is vital! If you don't open the connection, Dapper will
    // open/close it automagically, which means that you'll loose the created
    // temp table directly after the statement completes.
    db.Open();

    // This temp table is created having a primary key. So make sure you don't pass
    // any duplicate IDs
    db.Execute("CREATE TABLE #tempAnimalIds(animalId int not null primary key);");
    while (animalIds.Any())
    {
      // Build the statements to insert the Ids. For this, we need to split animalIDs
      // into chunks of 1000, as this flavour of INSERT INTO is limited to 1000 values
      // at a time.
      var ids2Insert = animalIds.Take(1000);
      animalIds = animalIds.Skip(1000).ToList();

      StringBuilder stmt = new StringBuilder("INSERT INTO #tempAnimalIds VALUES (");
      stmt.Append(string.Join("),(", ids2Insert));
      stmt.Append(");");

      db.Execute(stmt.ToString());
    }

    return db.Query<int>(@"SELECT animalID FROM #tempAnimalIds").ToList();
  }
}

To test:

var ids = LoadAnimalTypeIdsFromAnimalIds(Enumerable.Range(1, 2500).ToList());

You just need to amend your select statement to what it originally was. As I don't have all your tables in my environment, I just selected from the created temp table to prove it works the way it should.

Pitfalls, see comments:

  • Open the connection at the beginning, otherwise the temp table will be gone after dapper automatically closes the connection right after creating the table.
  • This particular flavour ofINSERT INTO is limited to 1000 values at a time, so the passed IDs need to be split into chunks accordingly.
  • Don't pass duplicate keys, as the primary key on the temp table will not allow that.

Edit

It seems Dapper supports a set-based operation which will make this work too:

public IList<int> LoadAnimalTypeIdsFromAnimalIdsV2(IList<int> animalIds)
{
  // This creates an IEnumerable of an anonymous type containing an Id property. This seems
  // to be necessary to be able to grab the Id by it's name via Dapper.
  var namedIDs = animalIds.Select(i => new {Id = i});
  using (var db = new SqlConnection(this.connectionString))
  {
    // This is vital! If you don't open the connection, Dapper will open/close it
    // automagically, which means that you'll loose the created temp table directly
    // after the statement completes.
    db.Open();

    // This temp table is created having a primary key. So make sure you don't pass
    // any duplicate IDs
    db.Execute("CREATE TABLE #tempAnimalIds(animalId int not null primary key);");

    // Using one of Dapper's convenient features, the INSERT becomes:
    db.Execute("INSERT INTO #tempAnimalIds VALUES(@Id);", namedIDs);

    return db.Query<int>(@"SELECT animalID FROM #tempAnimalIds").ToList();
  }
}

I don't know how well this will perform compared to the previous version (ie. 2500 single inserts instead of three inserts with 1000, 1000, 500 values each). But the doc suggests that it performs better if used together with async, MARS and Pipelining.

In your example, what I can't see is how your list of animalIds is actually passed to the query to be inserted into the #tempAnimalIDs table.

There is a way to do it without using a temp table, utilizing a stored procedure with a table value parameter.

SQL:

CREATE TYPE [dbo].[udtKeys] AS TABLE([i] [int] NOT NULL)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[myProc](@data as dbo.udtKeys readonly)AS
BEGIN
    select i from @data;
END
GO

This will create a user defined table type called udtKeys which contains just one int column named i , and a stored procedure that expects a parameter of that type. The proc does nothing else but to select the IDs you passed, but you can of course join other tables to it. For a hint regarding the syntax, see here .

C#:

var dataTable = new DataTable();
dataTable.Columns.Add("i", typeof(int));
foreach (var animalId in animalIds)
    dataTable.Rows.Add(animalId);
using(SqlConnection conn = new SqlConnection("connectionString goes here"))
{
    var r=conn.Query("myProc", new {data=dataTable},commandType: CommandType.StoredProcedure);
    // r contains your results
}

The parameter within the procedure gets populated by passing a DataTable, and that DataTable's structure must match the one of the table type you created.

If you really need to pass more that 2100 values, you may want to consider indexing your table type to increase performance. You can actually give it a primary key if you don't pass any duplicate keys, like this:

CREATE TYPE [dbo].[udtKeys] AS TABLE(
    [i] [int] NOT NULL,
    PRIMARY KEY CLUSTERED 
    (
        [i] ASC
    )WITH (IGNORE_DUP_KEY = OFF)
)
GO

You may also need to assign execute permissions for the type to the database user you execute this with, like so:

GRANT EXEC ON TYPE::[dbo].[udtKeys] TO [User]
GO

See also here and here .

For me, the best way I was able to come up with was turning the list into a comma separated list in C# then using string_split in SQL to insert the data into a temp table. There are probably upper limits to this, but in my case I was only dealing with 6,000 records and it worked really fast.

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
    using (var db = new SqlConnection(this.connectionString))
    {
        return db.Query<int>(
            @"  --Created a temp table to join to later. An index on this would probably be good too.
                CREATE TABLE #tempAnimals (Id INT)
                INSERT INTO #tempAnimals (ID)
                SELECT value FROM string_split(@animalIdStrings)

                SELECT at.animalTypeID        
                FROM dbo.animalTypes [at]
                JOIN animals [a] ON a.animalTypeId = at.animalTypeId
                JOIN #tempAnimals temp ON temp.ID = a.animalID -- <-- added this
                JOIN edibleAnimals e ON e.animalID = a.animalID", 
            new { animalIdStrings = string.Join(",", animalIds) }).ToList();
    }
}

It might be worth noting that string_split is only available in SQL Server 2016 or higher or if using Azure SQL then compatibility mode 130 or higher. https://docs.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM