
C# Multithreading Loop Datatable

I have a DataTable with 1000 records. Each row has a column containing a link. I loop over the DataTable and fetch a record from the website using the link in each row. The code works, but it takes too long to retrieve the records. So I need to fetch the records on multiple threads and add them all to a single DataTable. I am using C# with Visual Studio 2015.

How can this be done with threading in C#? Any help is appreciated.

The existing code is below.

for (int i = 0; i < dt.Rows.Count; i++)
{
    String years = String.Empty;

    dt.Rows[i]["Details"] = GetWebText(dt.Rows[i]["link"].ToString());
}



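private String GetWebText(String url)
{
    var html = new HtmlAgilityPack.HtmlDocument();
    // LoadHtml returns void, so parse the page first and read the text afterwards
    html.LoadHtml(new WebClient().DownloadString(url));
    // Assumption: the page's plain text content is what should be stored
    return html.DocumentNode.InnerText;
}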

You are going to run into issues here with thread safety, because a DataTable is not safe for concurrent write operations. So you need to make sure the operations you perform are cleanly separated.

The good thing is that you are actually doing three distinct steps and you can easily break them apart and parallelize the slow part while keeping it thread-safe.

Here's what your code is doing:

var url = dt.Rows[i]["link"].ToString();
var webText = GetWebText(url);
dt.Rows[i]["Details"] = webText;

Let's process the data in these three steps, but only parallelize the GetWebText part.

This is how:

var data =
    dt
        .AsEnumerable()
        .Select(r => new { Row = r, Url = r["link"].ToString() })
        .AsParallel()
        // This `Select` is the only part run in parallel
        .Select(x => new { x.Row, WebText = GetWebText(x.Url) })
        .ToArray();

foreach (var datum in data)
{
    datum.Row["Details"] = datum.WebText;
}
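Downloading is I/O-bound rather than CPU-bound, so PLINQ's default degree of parallelism may be lower than you want. The following is a minimal, self-contained sketch of the same three steps with an explicit limit. FetchText is a hypothetical stand-in for GetWebText so that it runs without a network, and the limit of 16 is an arbitrary assumption that would need tuning for the real site.

```csharp
using System;
using System.Data;
using System.Linq;
using System.Threading;

class Program
{
    // Hypothetical stand-in for GetWebText: simulates a slow download.
    static string FetchText(string url)
    {
        Thread.Sleep(100);
        return "text of " + url;
    }

    static void Main()
    {
        // Build a small table shaped like the one in the question.
        var dt = new DataTable();
        dt.Columns.Add("link", typeof(string));
        dt.Columns.Add("Details", typeof(string));
        for (int i = 0; i < 20; i++)
        {
            DataRow row = dt.NewRow();
            row["link"] = "http://example.com/" + i;
            dt.Rows.Add(row);
        }

        // Step 1 (sequential): read the URLs out of the table.
        var work = dt.AsEnumerable()
            .Select(r => new { Row = r, Url = r["link"].ToString() })
            .ToArray();

        // Step 2 (parallel): only the slow fetch runs on multiple threads.
        var data = work
            .AsParallel()
            .WithDegreeOfParallelism(16) // assumed value, not a recommendation
            .Select(x => new { x.Row, WebText = FetchText(x.Url) })
            .ToArray();

        // Step 3 (sequential): write back to the table on a single thread.
        foreach (var datum in data)
        {
            datum.Row["Details"] = datum.WebText;
        }

        Console.WriteLine(dt.Rows[3]["Details"]); // prints "text of http://example.com/3"
    }
}
```

Each result carries its own Row reference, so it does not matter that the parallel step may finish the items in a different order.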

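A BlockingCollection can solve the problem:

// requires: using System.Collections.Concurrent; using System.Threading;
static readonly BlockingCollection<Tuple<int, string>> links = new BlockingCollection<Tuple<int, string>>();
static readonly BlockingCollection<Tuple<int, string>> results = new BlockingCollection<Tuple<int, string>>();

public static void Main()
{
    // get your datatable (dt)
    const int workerCount = 8; // arbitrary choice: a few workers instead of one thread per row
    for (int i = 0; i < workerCount; i++)
    {
        Thread th = new Thread(Worker);
        th.Start();
    }
    for (int i = 0; i < dt.Rows.Count; i++)
    {
        // queue the row index with the link so each result can be matched to its row
        links.Add(Tuple.Create(i, dt.Rows[i]["link"].ToString()));
    }
    links.CompleteAdding(); // lets the workers exit once the queue is drained
    for (int i = 0; i < dt.Rows.Count; i++)
    {
        // results arrive in completion order, so write by index; only this thread touches dt
        Tuple<int, string> result = results.Take();
        dt.Rows[result.Item1]["Details"] = result.Item2;
    }
}

public static void Worker()
{
    // blocks while links is empty, ends after CompleteAdding
    foreach (Tuple<int, string> item in links.GetConsumingEnumerable())
    {
        var html = new HtmlAgilityPack.HtmlDocument();
        html.LoadHtml(new WebClient().DownloadString(item.Item2)); // LoadHtml returns void
        // add the result to the other queue, tagged with its row index
        results.Add(Tuple.Create(item.Item1, html.DocumentNode.InnerText));
    }
}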
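The producer/consumer approach can be tried without a DataTable or network access. This is a minimal sketch under stated assumptions: FetchText is a hypothetical stand-in for the download-and-parse step, the worker count of 4 is arbitrary, and each queued item carries its index because results come back in completion order rather than in row order.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class Program
{
    static readonly BlockingCollection<Tuple<int, string>> links =
        new BlockingCollection<Tuple<int, string>>();
    static readonly BlockingCollection<Tuple<int, string>> results =
        new BlockingCollection<Tuple<int, string>>();

    // Hypothetical stand-in for the real download-and-parse step.
    static string FetchText(string url)
    {
        Thread.Sleep(50);
        return "text of " + url;
    }

    static void Worker()
    {
        // Blocks while the queue is empty; the loop ends after
        // CompleteAdding() has been called and the queue is drained.
        foreach (var item in links.GetConsumingEnumerable())
        {
            results.Add(Tuple.Create(item.Item1, FetchText(item.Item2)));
        }
    }

    static void Main()
    {
        string[] urls =
        {
            "http://example.com/a", "http://example.com/b", "http://example.com/c"
        };
        string[] details = new string[urls.Length];

        var workers = new Thread[4]; // assumed worker count
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(Worker);
            workers[i].Start();
        }

        // Producer: queue every URL together with its index.
        for (int i = 0; i < urls.Length; i++)
        {
            links.Add(Tuple.Create(i, urls[i]));
        }
        links.CompleteAdding();

        // Collect exactly one result per URL and place it by index.
        for (int i = 0; i < urls.Length; i++)
        {
            var result = results.Take();
            details[result.Item1] = result.Item2;
        }

        foreach (var t in workers) t.Join();
        Console.WriteLine(string.Join("\n", details));
    }
}
```

With a real DataTable, the final loop would assign to dt.Rows[result.Item1]["Details"] instead of the details array, which keeps all table writes on the main thread.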


 