
Better to query DB or to load table-data into object in CLR assembly?

Original 2014-03-25 17:53:13 5 1 c#/ sql-server-2008-r2/ coding-style/ .net-3.5/ clrstoredprocedure

I'm importing a flat file into a DB via a CLR assembly.
For each row in the flat file, the assembly does a few quality checks. I've noticed that storing DB tables in DataTables and querying these DataTables is much slower than querying the DB directly. A HashSet, on the other hand, seems just as fast as querying the DB.

At the moment, my code sometimes loads data into a HashSet and queries the HashSet for each row; at other times it queries the DB separately for each row. One example is checking whether the key in each source row exists in the database, with ~10 000 source rows and ~1000 possible correct keys.
HashSet:
+ I query the DB only once, and the assembly can perform its checks on the HashSet.
- Why replicate something that already exists in the DB?
Query the DB:
+ The DB holds the structure of the table, and is optimized for these kinds of queries.
- I have to manage the DB connection, which may include opening/closing the DB connection multiple times.
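For illustration, the HashSet variant of the key check described above might look roughly like the sketch below. The names `KeyCheck`, `BuildKeySet` and `FindMissingKeys` are hypothetical, and the case-insensitive comparer is an assumption; it should be chosen to match the collation of the key column.

```csharp
using System;
using System.Collections.Generic;

public static class KeyCheck
{
    // Builds the lookup set once from the valid keys
    // (for example, the ~1000 keys read from the DB in a single query).
    // Assumption: keys compare case-insensitively, as under many SQL Server collations.
    public static HashSet<string> BuildKeySet(IEnumerable<string> validKeys)
    {
        return new HashSet<string>(validKeys, StringComparer.OrdinalIgnoreCase);
    }

    // Returns the source keys that are not in the set, in input order.
    // Each Contains call is a hash lookup, so no DB round trip is needed per row.
    public static List<string> FindMissingKeys(IEnumerable<string> sourceKeys, HashSet<string> validKeys)
    {
        var missing = new List<string>();
        foreach (string key in sourceKeys)
        {
            if (!validKeys.Contains(key))
            {
                missing.Add(key);
            }
        }
        return missing;
    }
}
```

With ~10 000 source rows and ~1000 keys, this performs one set build and ~10 000 in-memory lookups, instead of ~10 000 separate queries.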

I want to standardize my code, and need help deciding which option to use. I don't see a difference in performance in my case. If opening a DB connection from a CLR assembly is a non-issue, then I would prefer to query the DB, since I can then just write SQL code into my CLR assembly and execute it, rather than having to code multiple objects.

Is there a technical reason to use one over the other?
A coding-style recommendation?

Note: I am working with static data, so I don't need to worry about data changes while the import assembly is running.

1 answer

I would vote for HashSet, since there is no significant performance difference between the DB and the HashSet. Once you have everything in memory, you don't need to worry about DB calls, connections, and so on.

- Why replicate something that already exists in the DB?

This replication frees you from hitting the DB again and again; you can work with the data entirely within the system.
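As a rough sketch of the "load once" step, the keys could be read from any `IDataReader` into a HashSet in a single pass. `KeyLoader` and `LoadKeys` are hypothetical names, the key is assumed to be a string in the first column, and the case-insensitive comparer is likewise an assumption to be matched to the column's collation.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class KeyLoader
{
    // Reads the first column of every row into a set; NULL values are skipped.
    // Assumption: the key column is a string and compares case-insensitively.
    public static HashSet<string> LoadKeys(IDataReader reader)
    {
        var keys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        while (reader.Read())
        {
            if (!reader.IsDBNull(0))
            {
                keys.Add(reader.GetString(0));
            }
        }
        return keys;
    }
}
```

Inside a CLR assembly, the reader would typically come from `SqlCommand.ExecuteReader()` on a connection opened with the `context connection=true` connection string, so the DB is queried once and the connection can be closed before the per-row checks begin.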