I tried to split a file of about 32 GB using the code below, but I got an out-of-memory exception.
Please suggest how to split the file using C#.
string[] splitFile = File.ReadAllLines(@"E:\\JKS\\ImportGenius\\0.txt");
int cycle = 1;
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
var chunk = splitFile.Take(splitSize);
var rem = splitFile.Skip(splitSize);
while (chunk.Take(1).Count() > 0)
{
    string filename = "file" + cycle.ToString() + ".txt";
    using (StreamWriter sw = new StreamWriter(filename))
    {
        foreach (string line in chunk)
        {
            sw.WriteLine(line);
        }
    }
    chunk = rem.Take(splitSize);
    rem = rem.Skip(splitSize);
    cycle++;
}
Well, to start with you need to use File.ReadLines (assuming you're using .NET 4) so that it doesn't try to read the whole thing into memory. Then I'd just keep calling a method that writes the "next" splitSize lines to a new file:
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
using (var lineIterator = File.ReadLines(...).GetEnumerator())
{
    bool stillGoing = true;
    for (int chunk = 0; stillGoing; chunk++)
    {
        // WriteChunk returns false once the input is exhausted.
        stillGoing = WriteChunk(lineIterator, splitSize, chunk);
    }
}
...
// Writes up to splitSize lines to "file <chunk>.txt".
// Returns false when the iterator runs out of lines.
// Note: if the line count is an exact multiple of splitSize,
// the last file created is empty.
private static bool WriteChunk(IEnumerator<string> lineIterator,
                               int splitSize, int chunk)
{
    using (var writer = File.CreateText("file " + chunk + ".txt"))
    {
        for (int i = 0; i < splitSize; i++)
        {
            if (!lineIterator.MoveNext())
            {
                return false;
            }
            writer.WriteLine(lineIterator.Current);
        }
    }
    return true;
}
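As a variation on the code above, here is a minimal sketch that advances the iterator before creating each output file, so no empty trailing file is left behind when the line count is an exact multiple of the chunk size. The class name LineSplitter, the Split method, and the outputPrefix naming scheme are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineSplitter
{
    // Splits inputPath into files of at most linesPerFile lines each.
    // Output files are named outputPrefix + index + ".txt" (hypothetical scheme).
    // Returns the number of files written.
    public static int Split(string inputPath, int linesPerFile, string outputPrefix)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException("linesPerFile");

        int fileCount = 0;
        using (IEnumerator<string> lines = File.ReadLines(inputPath).GetEnumerator())
        {
            // Advance first; a file is only created once a line is known to exist.
            bool hasLine = lines.MoveNext();
            while (hasLine)
            {
                using (StreamWriter writer = File.CreateText(outputPrefix + fileCount + ".txt"))
                {
                    int written = 0;
                    do
                    {
                        writer.WriteLine(lines.Current);
                        written++;
                        hasLine = lines.MoveNext();
                    } while (hasLine && written < linesPerFile);
                }
                fileCount++;
            }
        }
        return fileCount;
    }
}
```

Only the current line and the writer's buffer are held in memory, so the input size should not matter.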
Do not read all the lines into an array up front; use the StreamReader.ReadLine method instead, like this:
using (StreamReader sr = new StreamReader(@"E:\\JKS\\ImportGenius\\0.txt"))
{
    while (sr.Peek() >= 0)
    {
        var fileLine = sr.ReadLine();
        // do something with the line
    }
}
Instead of reading the whole file at once with File.ReadAllLines, use File.ReadLines in a foreach loop to read the lines as needed.
foreach (var line in File.ReadLines(@"E:\\JKS\\ImportGenius\\0.txt"))
{
    // Do something
}
Edit: On an unrelated note, you don't have to escape your backslashes when prefixing the string with '@'. So write either "E:\\JKS\\ImportGenius\\0.txt" or @"E:\JKS\ImportGenius\0.txt"; @"E:\\JKS\\ImportGenius\\0.txt" is redundant.
File.ReadAllLines will read the whole file into memory.
To work with large files you need to only read what you need now into memory, and then throw that away as soon as you have finished with it.
A better option would be File.ReadLines, which returns a lazy enumerator: data is only read into memory as you get the next line from the enumerator. Provided you avoid multiple enumerations (e.g. don't use Count()), only a small part of the file is held in memory at any time.
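To illustrate the lazy behaviour described above, here is a small sketch. The class LazyReadDemo and its two methods are hypothetical names chosen for this example. The first method stops enumerating after the requested number of lines; the second walks the whole file in a single pass while holding one line at a time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyReadDemo
{
    // Returns the first `count` lines. Because File.ReadLines is lazy,
    // enumeration stops after `count` lines instead of loading the whole file.
    public static List<string> FirstLines(string path, int count)
    {
        return File.ReadLines(path).Take(count).ToList();
    }

    // Counts lines in a single pass; only the current line is kept.
    public static long CountLines(string path)
    {
        long total = 0;
        foreach (string line in File.ReadLines(path))
        {
            total++;
        }
        return total;
    }
}
```

As far as I know, each fresh enumeration of the sequence returned by File.ReadLines starts again from the beginning of the file, which is why repeated Skip/Take calls on the same sequence get slower as the split progresses.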
The problem here is that you are reading the entire file's content into memory at once with File.ReadAllLines(). What you need to do is open a FileStream with File.OpenRead() and read/write smaller chunks.
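A rough sketch of the FileStream approach, assuming you want to split by byte count rather than by line count. The class name ByteSplitter, the 80 KB buffer size, and the ".bin" naming scheme are all assumptions for illustration. Note that this cuts at arbitrary byte offsets, so a line may be split across two output files.

```csharp
using System;
using System.IO;

public static class ByteSplitter
{
    // Splits inputPath into pieces of at most bytesPerFile bytes each.
    // Output files are named outputPrefix + index + ".bin" (hypothetical scheme).
    // Returns the number of pieces written.
    public static int Split(string inputPath, long bytesPerFile, string outputPrefix)
    {
        if (bytesPerFile <= 0)
            throw new ArgumentOutOfRangeException("bytesPerFile");

        byte[] buffer = new byte[81920]; // fixed-size buffer, reused for every read
        int fileCount = 0;
        using (FileStream input = File.OpenRead(inputPath))
        {
            while (input.Position < input.Length)
            {
                using (FileStream output = File.Create(outputPrefix + fileCount + ".bin"))
                {
                    long remaining = bytesPerFile;
                    while (remaining > 0)
                    {
                        int toRead = (int)Math.Min(buffer.Length, remaining);
                        int read = input.Read(buffer, 0, toRead);
                        if (read == 0) break; // end of input
                        output.Write(buffer, 0, read);
                        remaining -= read;
                    }
                }
                fileCount++;
            }
        }
        return fileCount;
    }
}
```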
Edit: Actually for your case ReadLine is obviously better. See other answers. :)
Use StreamReader to read the file and StreamWriter to write it.
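A minimal sketch of that reader/writer pairing, assuming the file should be split every linesPerFile lines. ReaderWriterSplitter, Split, and the outputPrefix naming scheme are hypothetical names introduced here. A new writer is opened only when a line is actually available, and closed as soon as the current file is full.

```csharp
using System;
using System.IO;

public static class ReaderWriterSplitter
{
    // Reads inputPath line by line and starts a new output file
    // every linesPerFile lines. Returns the number of files written.
    public static int Split(string inputPath, int linesPerFile, string outputPrefix)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException("linesPerFile");

        int fileCount = 0;
        StreamWriter writer = null;
        try
        {
            using (StreamReader reader = new StreamReader(inputPath))
            {
                string line;
                int linesInCurrent = 0;
                while ((line = reader.ReadLine()) != null)
                {
                    if (writer == null)
                    {
                        // Open the next output file lazily, on the first line it receives.
                        writer = new StreamWriter(outputPrefix + fileCount + ".txt");
                        fileCount++;
                        linesInCurrent = 0;
                    }
                    writer.WriteLine(line);
                    linesInCurrent++;
                    if (linesInCurrent == linesPerFile)
                    {
                        writer.Dispose(); // flush and close the full file
                        writer = null;
                    }
                }
            }
        }
        finally
        {
            // Close a partially filled last file, if any.
            if (writer != null) writer.Dispose();
        }
        return fileCount;
    }
}
```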