I have a very large comma-delimited text file. Each field is, as stated, delimited by a comma and surrounded by quotes (all strings). The problem is that some of the fields contain a bare CR that separates multiple lines within that field, so when I call ReadLine it stops at that CR. It would be nice if I could tell it to stop ONLY at CRLF combinations.
Does anyone have any snappy method to do this? The files can be very very large.
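A minimal reproduction of the problem, run against an in-memory string instead of a file. The sample data and the names ReadLineRepro and ReadAllLines are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReadLineRepro {
    // Collects every string that TextReader.ReadLine returns for the given text.
    public static List<string> ReadAllLines(string text) {
        var lines = new List<string>();
        using (var reader = new StringReader(text)) {
            string line;
            while ((line = reader.ReadLine()) != null)
                lines.Add(line);
        }
        return lines;
    }

    public static void Main() {
        // Two CRLF-terminated records; the first has a lone CR inside a quoted field.
        string sample = "\"a\",\"multi\rline\"\r\n\"b\",\"c\"\r\n";
        foreach (string line in ReadAllLines(sample))
            Console.WriteLine("[" + line + "]");
    }
}
```

ReadLine treats CR, LF and CRLF all as line endings, so the first record comes back in two pieces.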
If you want a specific ReadLine, why not implement it?
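using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class MyFileReader {
    // Yields one record per CRLF; a lone CR or LF is kept as part of the record.
    public static IEnumerable<String> ReadLineCRLF(String path) {
        StringBuilder sb = new StringBuilder();
        bool pendingCR = false; // a '\r' has been read but not yet written to sb
        using (StreamReader reader = new StreamReader(path)) {
            int v;
            // Read character by character until the end of the file
            while ((v = reader.Read()) >= 0) {
                Char current = (Char) v;
                if (pendingCR) {
                    pendingCR = false;
                    if (current == '\n') {
                        // CRLF: the record is complete
                        yield return sb.ToString();
                        sb.Clear();
                        continue;
                    }
                    sb.Append('\r'); // the CR was not followed by LF, so it is data
                }
                if (current == '\r')
                    pendingCR = true;
                else
                    sb.Append(current);
            }
        }
        // Flush whatever is left after the last CRLF
        if (pendingCR)
            sb.Append('\r');
        if (sb.Length > 0)
            yield return sb.ToString();
    }
}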
Then use it:
var lines = MyFileReader
.ReadLineCRLF(@"C:\MyData.txt");
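How about reading the whole file into a single string:
string text = File.ReadAllText("input.txt"); // read the entire file into one string
Then split it on the carriage return/line feed pair only, like this:
var records = text.Split(new[] { "\r\n" }, StringSplitOptions.None); // a lone CR stays inside its field
and then process the result record by record in a loop:
foreach (var record in records) { ... }
Note that this holds the entire file in memory, so it may not suit very large files.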
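A small self-contained sketch of what splitting on CRLF only does with an embedded CR. The sample data and the helper name SplitOnCrLf are made up for illustration:

```csharp
using System;

public static class SplitDemo {
    // Hypothetical helper: splits on the CRLF pair only, so a lone CR stays in its record.
    public static string[] SplitOnCrLf(string text) {
        return text.Split(new[] { "\r\n" }, StringSplitOptions.None);
    }

    public static void Main() {
        // Made-up sample: the second field of the first record contains a lone CR.
        string sample = "\"a\",\"multi\rline\"\r\n\"b\",\"c\"";
        string[] records = SplitOnCrLf(sample);
        Console.WriteLine(records.Length);            // 2
        Console.WriteLine(records[0].Contains("\r")); // True
    }
}
```

If the text ends with a CRLF, StringSplitOptions.None leaves an empty string as the last element; StringSplitOptions.RemoveEmptyEntries drops it, along with any other empty records.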