Reading all lines in a file and splitting on multiple strings in C#
I am attempting to read all files in a directory and write text to an external file, depending on a specific string in the files contained in the directory.
foreach (string line in File.ReadAllLines(pendingFile).Where(line => line.Split(';').Last().Contains("Test1")))
{
File.AppendAllText(path, line + Environment.NewLine);
}
How do I specify multiple strings here, like so: "Test1", "Test2", "Test3"?
foreach (string line in File.ReadAllLines(pendingFile).Where(line => line.Split(';').Last().Contains("Test1", "Test2", "Test3")))
You "do it the other way round": you don't ask "does this last bit of the line contain any of these strings"; you ask "are any of these strings contained in the last bit of the line":
var interestrings = new[] { "Test1", "Test2", "Test3" };
var matchingLines = File.ReadAllLines(pendingFile)
    .Where(line =>
        interestrings.Any(interestring =>
            line.Split(';').Last().Contains(interestring)
        )
    );
It's probably worth pointing out that your code would be a lot more readable if you didn't try to do it all in the foreach header:
var interestrings = new[] { "Test1", "Test2", "Test3" };
foreach (string line in File.ReadAllLines(pendingFile))
{
    var lastOne = line.Split(';').Last();
    // Skip lines whose last field contains none of the wanted strings
    if (!interestrings.Any(interestring => lastOne.Contains(interestring)))
        continue;
    File.AppendAllText(path, line + Environment.NewLine);
}
It won't perform significantly differently, because LINQ will (behind the scenes) enumerate all the lines, skipping those where the condition doesn't match and giving you only those that do. This loop does essentially the same thing without the chained enumeration.
You could get a useful performance boost by not using Split (take the substring after the last index of ';' instead), and also consider collecting your strings into a StringBuilder rather than repeatedly appending them to a file. Also, if you use File.ReadLines rather than ReadAllLines, you'll read the file incrementally rather than buffering it all into memory:
var sb = new StringBuilder(10000); // initial capacity; it grows as needed
var interestrings = new[] { "Test1", "Test2", "Test3" };
foreach (string line in File.ReadLines(pendingFile))
{
    // Take the text after the last ';', or the whole line if there is none
    var lastOne = line;
    var idx = line.LastIndexOf(';');
    if (idx != -1)
        lastOne = line.Substring(idx + 1);
    if (!interestrings.Any(interestring => lastOne.Contains(interestring)))
        continue;
    sb.AppendLine(line);
}
File.AppendAllText(path, sb.ToString());
If the file is huge, consider opening a stream and writing the output line by line too, rather than buffering much of it into a StringBuilder.
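A minimal sketch of that streaming variant, in case it helps. The class and method names (LineFilter, CopyMatchingLines) and the return value are made up for illustration; the append flag is there to mirror what File.AppendAllText does:

```csharp
using System;
using System.IO;
using System.Linq;

static class LineFilter
{
    // Hypothetical helper: appends to outputPath every line of inputPath whose
    // last ';'-separated field contains one of the wanted strings.
    // Returns the number of lines written.
    public static int CopyMatchingLines(string inputPath, string outputPath, string[] wanted)
    {
        var written = 0;
        // append: true keeps existing content; 'using' flushes and closes the writer.
        using (var writer = new StreamWriter(outputPath, append: true))
        {
            // File.ReadLines streams the input instead of loading it all at once.
            foreach (var line in File.ReadLines(inputPath))
            {
                var idx = line.LastIndexOf(';');
                var lastOne = idx == -1 ? line : line.Substring(idx + 1);
                if (!wanted.Any(w => lastOne.Contains(w)))
                    continue;
                writer.WriteLine(line);
                written++;
            }
        }
        return written;
    }
}
```

You would call it as LineFilter.CopyMatchingLines(pendingFile, path, interestrings). Only one line is held in memory at a time, and the output file is opened once rather than once per matching line.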
Use a regular expression instead:
.Where(line => Regex.IsMatch(line, @"Test\d+$"))
(I haven't tested this exact piece of code; it's just to give an idea.)
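Note that Test\d+$ only matches lines that end in "Test" followed by digits, which is not quite the same as "the last field contains one of these strings". A sketch of a pattern that stays closer to the original check, assuming ';' is the only separator; the LastFieldRegex class and its Build method are hypothetical names for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class LastFieldRegex
{
    // Builds a pattern such as (?:Test1|Test2|Test3)[^;]*$ which means:
    // one of the wanted strings, followed only by non-';' characters up to
    // the end of the line, so the match has to sit in the last field.
    public static Regex Build(params string[] wanted)
    {
        // Regex.Escape guards against wanted strings that contain regex metacharacters.
        var alternatives = string.Join("|", wanted.Select(w => Regex.Escape(w)));
        return new Regex("(?:" + alternatives + ")[^;]*$");
    }
}
```

It would then be used as var regex = LastFieldRegex.Build("Test1", "Test2", "Test3"); followed by .Where(line => regex.IsMatch(line)). Like Contains, it treats "Test10" as a match for "Test1".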