I have the following algorithm:
private void writetodb()
{
    using (var reader = File.OpenRead(@"C:\Data.csv"))
    using (var parser = new TextFieldParser(reader))
    {
        // Do some operations
        while (!parser.EndOfData)
        {
            // Do operations
            // Take 500 rows of data and put them in a dataset
            Thread thread = new Thread(() => WriteTodb(tablename, set));
            thread.Start();
            Thread.Sleep(5000);
        }
    }
}
public void WriteTodb(string table, CellSet set)
{
    // WriteToDB
    // Edit: This statement will write to the HBase DB in HDInsight
    hbase.StoreCells(table, set);
}
This method works fine for up to about 500 MB of data, but beyond that it fails with an OutOfMemoryException.
I am fairly sure this is caused by the threads, but using threads is mandatory and I can't change the architecture.
Can anybody tell me what changes I have to make to the threading in the above program to avoid the out-of-memory exception?
First of all, I don't understand what you mean here about threading:
I have to make in thread programming in the above program to avoid memory exception.
You are still doing thread programming if you use the TPL, as has already been suggested. You really don't have to use the Thread class directly if you are not comfortable with it. You say that your code is C# 4.0, so the TPL is an option for you. You can do your work with something like this (a very simple way):
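List<Task> tasks = new List<Task>();
while (!parser.EndOfData)
{
    // Read the next 500 rows into set here
    // Task.Run requires .NET 4.5; on .NET 4.0 use Task.Factory.StartNew instead
    tasks.Add(Task.Run(() => WriteTodb(tablename, set)));
}
// Block until every queued write has finished
Task.WaitAll(tasks.ToArray());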
The TPL engine uses the default TaskScheduler class, which uses the internal ThreadPool and can balance the work against the resources you have on your server.
Also, I see that you're using the HBase client from Microsoft, and it has an async method:
public async Task StoreCellsAsync(string table, CellSet cells)
{
}
So you can use the asynchronous approach in your code and the TPL at the same time:
List<Task> tasks = new List<Task>();
while (!parser.EndOfData)
{
    // Read the next 500 rows into set here
    tasks.Add(WriteTodb(tablename, set));
}
// Asynchronously await all the writes (the enclosing method must be marked async)
await Task.WhenAll(tasks.ToArray());
public async Task WriteTodb(string table, CellSet set)
{
    // WriteToDB
    // Edit: This statement will write to the HBase DB in HDInsight asynchronously
    await hbase.StoreCellsAsync(table, set);
}
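Both snippets above queue a write for every batch before waiting on any of them, so with a very large file many batches can still be held in memory at once. One possible way to cap that, sketched here under assumptions, is to keep only a fixed number of writes pending and to pause reading until one of them completes. The names BoundedWriter, WriteAllAsync, FakeStoreAsync and ReadBatches are hypothetical and are not part of the HBase client; FakeStoreAsync merely stands in for a call such as hbase.StoreCellsAsync, and the sketch assumes .NET 4.5 or later for async/await.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedWriter
{
    // Starts one write per batch but keeps at most maxInFlight writes pending.
    // 'store' is a hypothetical stand-in for a call such as hbase.StoreCellsAsync.
    public static async Task WriteAllAsync<T>(
        IEnumerable<T> batches, Func<T, Task> store, int maxInFlight)
    {
        var inFlight = new List<Task>();
        foreach (T batch in batches)
        {
            if (inFlight.Count >= maxInFlight)
            {
                // Pause here until one pending write finishes.
                Task done = await Task.WhenAny(inFlight);
                inFlight.Remove(done);
                await done; // rethrows if that write failed
            }
            inFlight.Add(store(batch));
        }
        // Wait for the remaining writes.
        await Task.WhenAll(inFlight);
    }
}

public static class Demo
{
    static readonly object Gate = new object();
    static int active;
    public static int Peak;   // highest number of overlapping writes seen
    public static int Stored; // number of batches written

    // Simulated slow write that records how many writes overlap.
    static async Task FakeStoreAsync(int[] batch)
    {
        lock (Gate) { active++; if (active > Peak) Peak = active; }
        await Task.Delay(20);
        lock (Gate) { active--; Stored++; }
    }

    // Lazily yields batches; each array stands in for 500 parsed rows.
    static IEnumerable<int[]> ReadBatches(int count, int size)
    {
        for (int i = 0; i < count; i++)
            yield return new int[size];
    }

    public static void Main()
    {
        BoundedWriter.WriteAllAsync(ReadBatches(20, 500), FakeStoreAsync, 4).Wait();
        Console.WriteLine("stored=" + Stored + ", peak in flight=" + Peak);
    }
}
```

Because the batches are produced lazily and the loop stops pulling new ones while the limit is reached, only about maxInFlight batches should be alive at any moment. The limit of 4 is an arbitrary value for the demo and would need tuning against the real cluster.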
If, for some strange reason, you can't use the TPL, you have to refactor your code and write your own thread scheduler:
Instead of creating a new Thread every time, use ThreadPool.QueueUserWorkItem. For reference, see: https://msdn.microsoft.com/en-us/library/kbf0f1ct(v=vs.110).aspx
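As a rough illustration of that suggestion, the sketch below queues each batch to the thread pool and uses a CountdownEvent to wait until every work item has finished. PoolDemo and StoreBatch are hypothetical names; StoreBatch only simulates the blocking hbase.StoreCells call, and the batch count of 8 is arbitrary.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    public static int Processed; // total rows handled by all work items

    // Hypothetical stand-in for the blocking hbase.StoreCells call.
    static void StoreBatch(object state)
    {
        int[] batch = (int[])state;
        Thread.Sleep(10); // simulate I/O
        Interlocked.Add(ref Processed, batch.Length);
    }

    public static void Main()
    {
        const int batchCount = 8;
        using (var pending = new CountdownEvent(batchCount))
        {
            for (int i = 0; i < batchCount; i++)
            {
                int[] batch = new int[500]; // stands in for 500 parsed rows
                // Reuse pool threads instead of creating a new Thread per batch.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    try { StoreBatch(state); }
                    finally { pending.Signal(); } // always count the item as done
                }, batch);
            }
            pending.Wait(); // block until all queued batches are stored
        }
        Console.WriteLine("rows processed: " + Processed); // prints "rows processed: 4000"
    }
}
```

The pool reuses a limited set of threads, but its work queue is not bounded, so for very large inputs the reader would probably still need to pause once too many batches are waiting.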