
Is there any way to ignore reading in certain lines in a text file?

I'm trying to read a text file in a C# application, but I don't want to read the first two lines or the last line. There are 8 lines in the file, so effectively I just want to read lines 3, 4, 5, 6 and 7. Is there any way to do this?

Example file:

USE [Shelley's Other Database]
CREATE TABLE db.exmpcustomers(
fName varchar(100) NULL,
lName varchar(100) NULL,
dateOfBirth date NULL,
houseNumber int NULL,
streetName varchar(100) NULL
) ON [PRIMARY]

EDIT

Okay, so I've implemented Callum Rogers' answer in my code. It works with my edited text file (a copy of the file with the lines I didn't want removed) and does exactly what it should, but whenever I try it with the original text file (above) it throws an exception. I display this information in a DataGrid, and I think that's where the exception is being thrown.

Any ideas?

Rogers' answer is good; I am just providing another way of doing this. Try this:

List<string> list = new List<string>();
using (StreamReader reader = new StreamReader(FilePath))
{
    string text;
    while ((text = reader.ReadLine()) != null)
    {
        list.Add(text);
    }
}
// Drop the two header lines and the trailing line (assumes at least three lines were read).
list.RemoveRange(0, 2);
list.RemoveAt(list.Count - 1);

Hope this helps.

Why not just use File.ReadAllLines() and then remove the first 2 lines and the last line? With such a small file, speed differences will not be noticeable.

string[] allLines = File.ReadAllLines("file.ext");
string[] linesWanted = new string[allLines.Length-3];
Array.Copy(allLines, 2, linesWanted, 0, allLines.Length-3);

Why do you want to ignore exactly the first two lines and the last line?

Depending on what your file looks like, you might want to analyze each line instead, e.g. check whether its first character is a comment sign, or ignore everything until you find the first empty line.
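A minimal sketch of those two content-based checks, written with LINQ. The class name, the method names and the "--" comment marker in the usage note are assumptions made for illustration; none of them come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineFilters
{
    // Hypothetical helper: drops every line whose first non-blank
    // characters are the given comment marker.
    public static IEnumerable<string> WithoutComments(IEnumerable<string> lines, string marker)
    {
        return lines.Where(line => !line.TrimStart().StartsWith(marker, StringComparison.Ordinal));
    }

    // Hypothetical helper: drops everything up to and including the first
    // empty (or whitespace-only) line. If there is no empty line, nothing is returned.
    public static IEnumerable<string> AfterFirstEmptyLine(IEnumerable<string> lines)
    {
        return lines.SkipWhile(line => line.Trim().Length > 0).Skip(1);
    }
}
```

Either helper can be applied to the result of File.ReadAllLines, for example LineFilters.WithoutComments(File.ReadAllLines(path), "--"), so the code keeps working when the number of header lines changes.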

Sometimes, hardcoding "magic" numbers isn't such a good idea. What if the file format needs to change to contain 3 header lines?

As the other answers demonstrate, nothing keeps you from doing whatever you want with a line you have read, so of course you can ignore it, too.

Edit, now that you've provided an example of your file: for your case I'd definitely not use the hardcoded-numbers approach. What if some day the SQL statement contains another field, or appears on one line instead of 8?

My suggestion: read in the whole string at once, then analyze it. The safest way would be to use a grammar, but if you assume the SQL statement is never going to get more complicated, you can use a regular expression (still much better than using line numbers etc.):

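string content = File.ReadAllText(filename);
// Singleline lets '.' match newlines, since the column list spans several lines.
Regex r = new Regex(@"CREATE TABLE [^\(]+\((.*)\) ON", RegexOptions.Singleline);
// Groups[0] is the whole match; Groups[1] is the text captured between the parentheses.
string whatYouWant = r.Match(content).Groups[1].Value;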

If you have a TextReader object wrapping the file stream, you could just call ReadLine() two times.

StreamReader inherits from TextReader, which is abstract.

A non-foolproof example:

using (var fs = new FileStream("blah", FileMode.Open))
using (var reader = new StreamReader(fs))
{
    reader.ReadLine();
    reader.ReadLine();

    // Do stuff.
}

You can do this:

var valid = new int[] { 3, 4, 5, 6, 7 };
var lines = File.ReadAllLines("file.txt").
    Where((line, index) => valid.Contains(index + 1));

Or the opposite:

var invalid = new int[] { 1, 2, 8 };
var lines = File.ReadAllLines("file.txt").
    Where((line, index) => !invalid.Contains(index + 1));

If you're looking for a general way to remove the last line and the first 2 lines, you can use this:

var allLines = File.ReadAllLines("file.txt");
var lines = allLines
  .Take(allLines.Length - 1)
  .Skip(2);

But from your example, it seems you're better off looking for the string pattern that you want to read from the file. Try using regexes.
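As one possible reading of that suggestion, the sketch below tests each line against a pattern and keeps only those that look like column definitions. The pattern is an assumption based solely on the example file in the question (a name, a type with an optional size, then NULL and an optional comma), and the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ColumnLineMatcher
{
    // Assumed shape of a column line: "<name> <type>[(<size>)] NULL" with an optional trailing comma.
    private static readonly Regex ColumnLine =
        new Regex(@"^\s*(\w+)\s+\w+(?:\(\d+\))?\s+NULL\s*,?\s*$");

    // Hypothetical helper: returns the column names from the lines that
    // match the assumed column-definition pattern; all other lines are ignored.
    public static List<string> ColumnNames(IEnumerable<string> lines)
    {
        var names = new List<string>();
        foreach (string line in lines)
        {
            Match m = ColumnLine.Match(line);
            if (m.Success)
            {
                names.Add(m.Groups[1].Value);
            }
        }
        return names;
    }
}
```

Called as ColumnLineMatcher.ColumnNames(File.ReadAllLines("file.txt")), this skips the USE, CREATE TABLE and ON [PRIMARY] lines because they do not match, without relying on their positions.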

string filepath = @"C:\whatever.txt";
using (StreamReader rdr = new StreamReader(filepath))
{
    rdr.ReadLine();  // ignore 1st line
    rdr.ReadLine();  // ignore 2nd line
    string fileContents = "";
    while (true)
    {
        string line = rdr.ReadLine();
        if (rdr.EndOfStream)
            break;  // finish without processing last line
        fileContents += line + "\r\n";
    }
    Console.WriteLine(fileContents);
}

How about a general solution?

To me, the first step is to enumerate over the lines of a file. ReadAllLines already provides this, but it has a performance cost because it populates an entire string[] array; there's also ReadLines, but that's only available as of .NET 4.0.

Implementing this is pretty trivial:

public static IEnumerable<string> EnumerateLines(this FileInfo file)
{
    using (var reader = file.OpenText())
    {
        while (!reader.EndOfStream)
        {
            yield return reader.ReadLine();
        }
    }
}

The next step is to simply skip the first two lines of this enumerable sequence. This is straightforward using the Skip extension method.
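As a small sketch of this step on its own, assuming .NET 4.0 or later so that the built-in File.ReadLines can stand in for a hand-written enumerator. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SkipExample
{
    // Hypothetical helper: lazily yields the lines of the file at 'path',
    // leaving out the first 'headerCount' lines.
    public static IEnumerable<string> LinesAfterHeader(string path, int headerCount)
    {
        // File.ReadLines reads on demand, so the file is not loaded into an array first.
        return File.ReadLines(path).Skip(headerCount);
    }
}
```

Because the sequence is lazy, the file is only opened once the result is enumerated, for example in a foreach loop or by calling ToList().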

The last step is to ignore the last line of the enumerable sequence. Here's one way you could implement this:

public static IEnumerable<T> IgnoreLast<T>(this IEnumerable<T> source, int ignoreCount)
{
    if (ignoreCount < 0)
    {
        throw new ArgumentOutOfRangeException("ignoreCount");
    }

    var buffer = new Queue<T>();
    foreach (T value in source)
    {
        if (buffer.Count < ignoreCount)
        {
            buffer.Enqueue(value);
            continue;
        }

        T buffered = buffer.Dequeue();

        buffer.Enqueue(value);

        yield return buffered;
    }
}

OK, then. Putting it all together, we have:

var file = new FileInfo(@"path\to\file.txt");
var lines = file.EnumerateLines().Skip(2).IgnoreLast(1);

Test input (contents of file):

This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 5.
This is line number 6.
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 10.

Output (of Skip(2).IgnoreLast(1)):

This is line number 3.
This is line number 4.
This is line number 5.
This is line number 6.
This is line number 7.
This is line number 8.
This is line number 9.
