
Long running program and data processing

I'm trying to develop a long-running data feeding program in C#. The data processing will run on more than one server, and there will be a long-running queue from which each document is fed to the processing software once the previous document has finished.

How should I do this? It will probably process 10,000 documents spread across about 5 servers, and the documents will be distributed according to whether a server is busy and how many documents it has already processed: the usual load balancing criteria. I can write a Windows application that continually monitors the servers and sends the data accordingly. Should I use a WCF service? A Windows Forms application? Windows Message Queuing, or should I build my own message queue? The feeding will run more or less continuously, because each document might take 20-30 minutes and there are 10,000 of them. I don't want the feeding program to crash or stop under any circumstances.

Which software and approach would you use to develop this? Any pointers, guesses and ideas?

You should use Windows Services (a kind of Windows application project in Visual Studio):

Walkthrough: Creating a Windows Service Application in the Component Designer

If you set their startup type to Automatic (in the Services console under Control Panel > Administrative Tools), they will start whenever the operating system starts, even if no user logs in. The service must also run under an account that has the privileges to read and write the files it works with. You can create an account specifically for this.

Windows service (this explains how to do what the previous paragraph describes)

It is absolutely necessary that you catch all exceptions, as a single unhandled exception will stop the service abruptly.
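A minimal sketch of such a catch-all loop, assuming the per-document work is wrapped in a delegate. The names `SafeWorker`, `RunLoop`, `processNext` and `logError` are hypothetical and only for illustration. In a real service, the loop would typically be started on a background thread from `ServiceBase.OnStart` and cancelled from `OnStop`.

```csharp
using System;
using System.Threading;

public static class SafeWorker
{
    // Calls processNext repeatedly until cancellation is requested.
    // processNext (hypothetical) returns true if it handled a job,
    // false if no job was available.
    public static void RunLoop(Func<bool> processNext, Action<Exception> logError,
                               CancellationToken token, TimeSpan idleDelay)
    {
        while (!token.IsCancellationRequested)
        {
            try
            {
                bool didWork = processNext();
                if (!didWork)
                {
                    // Nothing pending: wait before polling again,
                    // waking up early if a stop is requested.
                    token.WaitHandle.WaitOne(idleDelay);
                }
            }
            catch (Exception ex)
            {
                // Log and continue, so one bad document does not stop the service.
                logError(ex);
                // Short pause so a persistent failure does not spin the CPU.
                token.WaitHandle.WaitOne(idleDelay);
            }
        }
    }
}
```

Note that `logError` itself must not throw, otherwise the exception escapes the loop.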

There are several ways to queue the jobs, but I recommend something as simple as a database table. This table can have the columns necessary to define the job, a status column to flag it as pending / started / finished successfully / finished with error, and any extra information you find relevant (start and end time, processing computer, and so on). The operations of getting the next pending job and marking it as started by a server should be protected with a transaction, to avoid conflicts between servers.
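A sketch of how a server might claim the next pending job, assuming SQL Server and a hypothetical table `Jobs` with columns `Id` (int), `Status`, `Server` and `StartedAt`. None of these names come from the answer above. The `UPDLOCK` and `READPAST` hints make concurrent servers skip rows that another server is currently claiming, instead of blocking on them or taking the same row.

```csharp
using System;
using System.Data;

public static class JobQueue
{
    // T-SQL (SQL Server assumed): atomically mark one pending row as started
    // and return its Id. Table and column names are hypothetical.
    public const string ClaimSql = @"
UPDATE TOP (1) Jobs WITH (UPDLOCK, READPAST, ROWLOCK)
SET Status = 'Started', Server = @server, StartedAt = SYSUTCDATETIME()
OUTPUT inserted.Id
WHERE Status = 'Pending';";

    // Returns the Id of the claimed job, or null if nothing is pending.
    public static int? ClaimNextJob(IDbConnection connection, string serverName)
    {
        using (IDbTransaction tx = connection.BeginTransaction(IsolationLevel.ReadCommitted))
        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.Transaction = tx;
            cmd.CommandText = ClaimSql;

            IDbDataParameter server = cmd.CreateParameter();
            server.ParameterName = "@server";
            server.Value = serverName;
            cmd.Parameters.Add(server);

            // ExecuteScalar returns the first column of the OUTPUT row,
            // or null when no row matched.
            object result = cmd.ExecuteScalar();
            tx.Commit();

            if (result == null || result == DBNull.Value)
            {
                return null;
            }
            return Convert.ToInt32(result);
        }
    }
}
```

The single `UPDATE` is already atomic on its own. The explicit transaction is kept so that further steps (for example, writing a log row) can be added inside the same unit of work. `TOP (1)` without an ordering picks an arbitrary pending row, which is usually acceptable for this kind of queue.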

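You shouldn't build a complicated load balancing procedure. Because each server takes the next pending job only when it is free, the work spreads itself across the servers. If you really need to, you can set the processing thread's priority to a low level so that it doesn't slow down other processes.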
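Lowering the priority can be as small as the following sketch. `LowPriorityRunner` and `Start` are hypothetical names. `Thread.Priority` and `ThreadPriority.BelowNormal` are standard .NET, though how strongly the operating system honours the setting varies by platform.

```csharp
using System;
using System.Threading;

public static class LowPriorityRunner
{
    // Starts the given work on a dedicated thread with below-normal priority
    // and returns the thread so the caller can Join it on shutdown.
    public static Thread Start(Action work)
    {
        var thread = new Thread(() => work())
        {
            // Must be set before or while the thread is alive.
            Priority = ThreadPriority.BelowNormal
        };
        thread.Start();
        return thread;
    }
}
```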
