Data Processing In C# - Best Approach?

Original post: 2013-04-24 11:11:05 | Tags: c#, sql-server

I am still on a learning curve in C# and SQL Server, so please forgive my 'greenness'.

Here is my scenario:

I have an EMPLOYEE table with 10,000 rows. Each of these employees has transactions in a TRANSACTIONS table.

The transaction table has the salary elements like basic pay, acting allowance, overtime hours etc. It also has payroll deductions like advance deductions, some loans (with interest), and savings (pension, social security savings etc.).

I need to go through each employee's transactions and compute taxes, compute outstanding balances on loans, update balances on savings, convert hours into payments/deductions, and do some other processing.

This processing will give me a new set of rows for each employee, with a period marker (e.g. 2013-04 for April 2013). I need to store this in a HISTORY table for future reference.

What is the best approach for processing the entire 10,000-row employee table and its transactions?

I am told that pulling the entire table into memory via readers is not good practice, and I agree.

Do I keep pulling one employee from the database, process their transactions, and commit the history to the database? And then pull the next, and so forth?

Would that be too many calls to the back end?

(EF is not an option for me; I am still doing raw SQL in ADO.NET.)

I will appreciate any help on this.

2 Answers

10,000 rows is not much. Memory can easily handle that, as long as there are no enormous varchar or binary columns. Don't feel completely locked in by good practice "rules".

On the other hand, consider a stored procedure. Then all processing will be done locally on the server.

Edit: if neither of the above is an option, try to stream your results. For example, while reading your query, save each row in a ConcurrentQueue or something like that. Before you execute the query, start another thread or a BackgroundWorker which checks the queue for new items and simultaneously saves the results back on another SqlConnection. The work is finished when the query is done AND the queue has a Count of 0.
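
A minimal sketch of that reader/writer split, under stated assumptions. It swaps in BlockingCollection (which wraps a ConcurrentQueue by default) so the consumer does not have to poll Count; CompleteAdding() signals "query is done" instead. The TransactionRow and HistoryRow types, the PayrollPipeline class, and the trivial "processing" step are hypothetical placeholders. In real code the source would be a SqlDataReader loop, and the consumer would INSERT into HISTORY on a second SqlConnection rather than append to a list.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PayrollPipeline
{
    // Hypothetical row shapes; the question does not give the real schema.
    public sealed class TransactionRow
    {
        public int EmployeeId { get; set; }
        public decimal Amount { get; set; }
    }

    public sealed class HistoryRow
    {
        public int EmployeeId { get; set; }
        public string Period { get; set; }
        public decimal Total { get; set; }
    }

    // Streams rows from 'source' to a background consumer and returns
    // what the consumer "saved". The list stands in for the HISTORY table.
    public static List<HistoryRow> Run(IEnumerable<TransactionRow> source, string period)
    {
        var saved = new List<HistoryRow>();

        // Bounded so a fast reader cannot pile up unlimited rows in memory.
        using (var queue = new BlockingCollection<TransactionRow>(boundedCapacity: 1000))
        {
            // Consumer: started before reading begins, as the answer suggests.
            Task consumer = Task.Run(() =>
            {
                foreach (TransactionRow row in queue.GetConsumingEnumerable())
                {
                    // Placeholder for the real tax/loan/savings computation.
                    saved.Add(new HistoryRow
                    {
                        EmployeeId = row.EmployeeId,
                        Period = period,
                        Total = row.Amount
                    });
                }
            });

            try
            {
                // Producer: in real code this is the reader.Read() loop.
                foreach (TransactionRow row in source)
                {
                    queue.Add(row);
                }
            }
            finally
            {
                // Tells the consumer no more rows are coming.
                queue.CompleteAdding();
            }

            // Finished when the producer is done and the queue has drained.
            consumer.Wait();
        }

        return saved;
    }
}
```

Only the consumer thread writes to the list, and the caller reads it after Wait() returns, so no extra locking is needed in this sketch. With several consumers, the shared results would need their own synchronization.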

Look into using ROW_NUMBER(). Programs can use it to browse a large table 'x' rows at a time. You could conceivably use the same method to batch your job over, say, 1000 rows at a time.

See this link for more information.
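
As a rough illustration of the batching idea, the sketch below numbers the EMPLOYEE rows with ROW_NUMBER() and reads them one range at a time. The EmployeeId column (assumed to be an int), the EmployeeBatches class, and its method names are assumptions made for the example and do not come from the question. The code is written against the System.Data interfaces, so an open SqlConnection can be passed in as the IDbConnection.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class EmployeeBatches
{
    // EmployeeId is an assumed column name; the real schema is not given.
    public const string BatchSql = @"
WITH Numbered AS (
    SELECT EmployeeId,
           ROW_NUMBER() OVER (ORDER BY EmployeeId) AS RowNum
    FROM EMPLOYEE
)
SELECT EmployeeId
FROM Numbered
WHERE RowNum BETWEEN @first AND @last
ORDER BY RowNum;";

    // Yields inclusive (first, last) row-number ranges covering 1..totalRows.
    public static IEnumerable<(int First, int Last)> Ranges(int totalRows, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        for (int first = 1; first <= totalRows; first += batchSize)
        {
            yield return (first, Math.Min(first + batchSize - 1, totalRows));
        }
    }

    // Reads the employee ids for one range over an already-open connection.
    public static List<int> ReadBatch(IDbConnection connection, int first, int last)
    {
        var ids = new List<int>();
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = BatchSql;
            AddParameter(command, "@first", first);
            AddParameter(command, "@last", last);
            using (IDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    ids.Add(reader.GetInt32(0)); // assumes an int key
                }
            }
        }
        return ids;
    }

    private static void AddParameter(IDbCommand command, string name, int value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.DbType = DbType.Int32;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```

A caller would loop over Ranges(10000, 1000), call ReadBatch for each range, process that batch of employees, and write the resulting HISTORY rows before moving on. Row numbers are recomputed on every query, so this assumes the EMPLOYEE table does not gain or lose rows while the job runs.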