
Powershell SQL Server Insert - Best Practice

I have a script that iterates through a few thousand files in a directory on a daily basis and would like to update a SQL Server table with details of each file, as they are processed within the foreach loop.

I have this working already using the following within the foreach loop:

Invoke-Sqlcmd -Query "INSERT INTO $dbTable (name, date, worknum, identifier) VALUES ('$name','$date','$worknum','$identifier')" `
              -ServerInstance $dbHost -Database $dbName -ErrorAction Stop

Although this works fine, I'd like to know if there would be any benefit to establishing a SQL Server connection once, before the processing of the files starts, and closing it at the end of the script. Something like this:

$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server=$dbHost;Database=$dbName;Integrated Security=True;"
$SqlConnection.Open()

<foreach loop>

$SqlConnection.Close()

I'm not concerned with the speed the script runs at, as it's already pretty fast; I'm more concerned with not affecting DB performance.

As stated in the comments, you will need to test against your instance configuration and existing workload to determine if a solution is performant or not.

I had a similar experience with a PowerShell "app" that took a list of account identifiers and INSERTed them into a table for us to process further. Originally the app iterated over the IDs and issued one INSERT per ID. This was OK for most users, but occasionally someone would submit 100k+ IDs and the app's performance was horrid (while the SQL Server itself kept performing as expected). Switching to SqlBulkCopy sped that process up immensely on the client side, with no discernible impact on the SQL Server either. (Only the folks with lots of records got the benefit, though; there was no real change for fewer than 100 records.)

The Write-DataTable and Out-DataTable functions are handy to have around to make this easier.
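As a rough sketch of the SqlBulkCopy approach (the table name `dbo.FileLog` and the loop variables are hypothetical stand-ins for your own schema and script):

```powershell
# Build a DataTable in memory, then push all rows to SQL Server in one bulk operation.
# Assumes a destination table dbo.FileLog(name, date, worknum, identifier) -- adjust to your schema.
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add('name',       [string])
[void]$table.Columns.Add('date',       [datetime])
[void]$table.Columns.Add('worknum',    [string])
[void]$table.Columns.Add('identifier', [string])

foreach ($file in Get-ChildItem $dir -File) {
    # $worknum / $identifier come from your per-file processing, as in your script
    [void]$table.Rows.Add($file.Name, $file.LastWriteTime, $worknum, $identifier)
}

$bulk = New-Object System.Data.SqlClient.SqlBulkCopy("Server=$dbHost;Database=$dbName;Integrated Security=True;")
$bulk.DestinationTableName = 'dbo.FileLog'
$bulk.WriteToServer($table)
$bulk.Close()
```

Note that SqlBulkCopy maps columns by ordinal position unless you populate `$bulk.ColumnMappings`, so keep the DataTable columns in the same order as the destination table (or map them explicitly).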

My anecdote out of the way, on to best practice...

Eugene Philipov has a good article on tests they ran comparing data-load performance between single-value inserts, multi-value inserts, and BulkCopy. They found that the number of columns you are inserting into has a large effect on the operation's speed: the more columns, the less benefit you get from putting multiple values in one insert or using bulk copy. However, a single insert per record was always the slowest (by execution time).
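For context, a multi-value insert batches several rows into a single statement, so you pay for one round trip instead of one per row. A minimal illustration (table name and values are placeholders):

```powershell
# Hypothetical multi-row INSERT: three rows in one statement, one round trip.
$query = @"
INSERT INTO dbo.FileLog (name, date, worknum, identifier) VALUES
  ('a.txt', '2020-01-01', '1001', 'A1'),
  ('b.txt', '2020-01-01', '1002', 'B2'),
  ('c.txt', '2020-01-01', '1003', 'C3');
"@
Invoke-Sqlcmd -Query $query -ServerInstance $dbHost -Database $dbName -ErrorAction Stop
```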

Faster execution == less chance you will block/consume resources that are needed for your other workflows.
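If you do stay with per-row inserts, your instinct to open one connection up front is reasonable, and reusing a single parameterized command avoids both per-row connection overhead and the quoting/injection pitfalls of string-built SQL. A sketch under the same hypothetical `dbo.FileLog` schema:

```powershell
# One connection and one parameterized command, reused for every row.
# Table/column names are illustrative; parameters avoid quoting bugs and SQL injection.
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=$dbHost;Database=$dbName;Integrated Security=True;")
$conn.Open()

$cmd = $conn.CreateCommand()
$cmd.CommandText = "INSERT INTO dbo.FileLog (name, date, worknum, identifier) VALUES (@name, @date, @worknum, @identifier)"
foreach ($p in 'name','date','worknum','identifier') {
    [void]$cmd.Parameters.Add("@$p", [System.Data.SqlDbType]::NVarChar, 255)
}

foreach ($file in Get-ChildItem $dir -File) {
    # $worknum / $identifier come from your per-file processing
    $cmd.Parameters['@name'].Value       = $file.Name
    $cmd.Parameters['@date'].Value       = $file.LastWriteTime.ToString('s')
    $cmd.Parameters['@worknum'].Value    = $worknum
    $cmd.Parameters['@identifier'].Value = $identifier
    [void]$cmd.ExecuteNonQuery()
}

$conn.Close()
```

Either way, a few thousand small inserts a day is a light workload; on a reasonably configured instance neither approach should affect other users noticeably.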
