
Reading lines after a specific string in a text file, then storing data in lists

I have a program that reads text files, and I want it to collect the data that follows a certain title in the file, in this case [HRData]. Once the StreamReader reaches [HRData], I want it to read every line after that and store each line in a list, while still letting me access the separate numbers.

The text file is like so:

[HRZones]
190
175
162
152
143
133
0
0
0
0
0

[SwapTimes]

[Trip]
250
0
3978
309
313
229
504
651
//n header 
[HRData]
91  154 70  309 83  6451
91  154 70  309 83  6451
92  160 75  309 87  5687
94  173 80  309 87  5687
96  187 87  309 95  4662
100 190 93  309 123 4407
101 192 97  309 141 4915
103 191 98  309 145 5429

So, referring to the text file, I want it to store the first line after [HRData] and let me access each value, for example 91 being [0].

I have code that already stores a line in a list if it matches a regex, but I do not know how to make it read the lines after a specific string like [HRData].

if (squareBrackets.Match(line).Success) {
 titles.Add(line);
 if (textAfterTitles.Match(line).Success) {
  textaftertitles.Add(line);

 }
}

This is my attempt so far:

if (line.Contains("[HRData]")) {
 inttimes = true;
 MessageBox.Show("HRDATA Found");
 if (inttimes == true) {
  while (null != (line = streamReader.ReadLine())) {
   //ADD LINE AND BREAK UP INTO PARTS S
  }
 }
}

You can call the LINQ-friendly method File.ReadLines, then use LINQ to get the part you want:

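// A null separator splits on any whitespace; RemoveEmptyEntries drops the
// empty strings that runs of spaces would otherwise produce.
List<string> numbers = File.ReadLines("data.txt")
                           .SkipWhile(line => line != "[HRData]")
                           .Skip(1)
                           .SelectMany(line => line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
                           .ToList();

Console.WriteLine(numbers[0]); // 91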

Edit: this gives you all the numbers in one List<string>. If you want to keep them grouped by line, use Select instead of SelectMany:

List<List<string>> listsOfNums = File.ReadLines("data.txt")
                                     .SkipWhile(line => line != "[HRData]")
                                     .Skip(1)
                                     .Select(line => line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).ToList())
                                     .ToList();

Note that this requires an additional index to get a single number:

Console.WriteLine(listsOfNums[0][0]); // 91
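The two queries above return strings and keep reading to the end of the file. As a rough sketch of one way to get integers instead and stop at the next section header, the same pipeline can be extended with TakeWhile and int.Parse. The names HrDataReader and ReadRows are hypothetical, and the sketch assumes every token in the [HRData] block is a valid integer:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class HrDataReader
{
    // Hypothetical helper: returns one int[] per non-blank line that follows
    // the "[HRData]" header, stopping at the next "[...]" header if there is one.
    // Assumes every token in the block is a valid integer (int.Parse throws otherwise).
    public static List<int[]> ReadRows(IEnumerable<string> lines)
    {
        return lines
            .SkipWhile(line => line.Trim() != "[HRData]")
            .Skip(1) // drop the header line itself
            .TakeWhile(line => !line.TrimStart().StartsWith("[", StringComparison.Ordinal))
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line
                .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                .Select(token => int.Parse(token))
                .ToArray())
            .ToList();
    }
}
```

It could be called as HrDataReader.ReadRows(File.ReadLines("data.txt")), after which rows[0][0] would be the first number on the first data line.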

You could use a variable to track the current section:

var list = new List<int[]>();
using (StreamReader streamReader = ...)
{
    string line;
    string sectionName = null;
    while (null != (line = streamReader.ReadLine()))
    {
        var sectionMatch = Regex.Match(line, @"\s*\[\s*(?<NAME>[^\]]+)\s*\]\s*");
        if (sectionMatch.Success)
        {
            sectionName = sectionMatch.Groups["NAME"].Value;
        }
        else if (sectionName == "HRData")
        {
            // You can process lines inside the `HRData` section here.

            // Getting the numbers in the line, and adding to the list, one array for each line.
            var nums = Regex.Matches(line, @"\d+")
                .Cast<Match>()
                .Select(m => m.Value)
                .Select(int.Parse)
                .ToArray();

            list.Add(nums);
        }
    }
}

Presuming your current code attempt works, which I have not verified...

You could simply do the following:

List<int> elements = new List<int>();
while (null != (line = streamReader.ReadLine())) 
{
    if(line.Contains("["))
    {
        //Prevent reading in the next section
        break;
    }
    string[] split = line.Split(Convert.ToChar(" "));
    //Each element in split will be each number on each line.
    for(int i=0;i<split.Length;i++)
    {
        elements.Add(Convert.ToInt32(split[i]));
    }

}

Alternatively, if you want a two-dimensional list, so that you can reference the numbers by line, you could use a nested list. For each run of the outer loop, create a new list and add it to elements (elements would then be a List<List<int>>).

Edit

Just a note: be careful with the Convert.ToInt32() function. It should really be in a try/catch statement in case some text is read in that isn't numeric.
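As an alternative to try/catch (not part of the original answer), int.TryParse reports failure through its return value instead of throwing. The sketch below uses hypothetical names SafeParsing and ParseLine, and it assumes that silently skipping non-numeric tokens is acceptable for your data:

```csharp
using System;
using System.Collections.Generic;

public static class SafeParsing
{
    // Hypothetical helper: converts the whitespace-separated tokens on one line
    // to ints, skipping any token that is not a valid integer.
    public static List<int> ParseLine(string line)
    {
        var values = new List<int>();

        // A null separator splits on any whitespace (spaces, tabs, line endings).
        foreach (string token in line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            if (int.TryParse(token, out int value))
            {
                values.Add(value);
            }
        }

        return values;
    }
}
```

If a bad token should be reported rather than skipped, an else branch on the TryParse check is the place to log it.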

Edit

OK, to make the routine more robust (per my comment below):

First, make sure the routine doesn't read beyond your block of numbers. I'm not sure what comes after the block you listed, so that will be up to you, but the check should take the following form:

if (line.Contains("[") || line.Contains("]")) // add any other end-of-block checks here
{
    break;
}

The next thing is to pre-format your split values. Inside the for statement:

for(int i=0;i<split.Length;i++)
{
    string val = split[i].Trim(); //Get rid of white space
    val = val.Replace("\r\n","");  //Use one of these to trim every character.
    val = val.Replace("\n","");
    try
    {
        elements.Add(Convert.ToInt32(val));
    }
    catch (Exception ex)
    {
        string err = ex.Message;
        //You might try formatting the split value even more here and retry convert
    }

}

To access the individual numbers (presuming you are using a one-dimensional list), there are a couple of ways. If you want to access by index value:

elements.ElementAt(index)

If you want to iterate through the list of values:

foreach(int val in elements)
{
}

If you need to know exactly which line a value came from, I suggest a 2D list. It would be implemented as follows (I'm copying from my original code snippet, so assume all of the error checking has been added!):

List<List<int>> elements = new List<List<int>>();
while (null != (line = streamReader.ReadLine())) 
{
    if(line.Contains("["))
    {
        //Prevent reading in the next section
        break;
    }
    List<int> newLine = new List<int>();
    string[] split = line.Split(Convert.ToChar(" "));
    //Each element in split will be each number on each line.
    for(int i=0;i<split.Length;i++)
    {
        newLine.Add(Convert.ToInt32(split[i]));
    }
    elements.Add(newLine);
}

Now, to access each element by line:

foreach(var line in elements)
{
    //line is a List<int>
    int value = line.ElementAt(index); //grab element at index for the given line.
}

Alternatively, if you need to reference a value directly by line index and column index:

int value = elements.ElementAt(lineIndex).ElementAt(columnIndex);

Be careful with all of these direct index references. You could pretty easily get an index-out-of-bounds error.
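One way to guard against that, offered only as a sketch, is to check both indexes before reading. The names SafeIndexing and TryGetValue are hypothetical, and the helper assumes the List<List<int>> shape used above:

```csharp
using System.Collections.Generic;

public static class SafeIndexing
{
    // Hypothetical helper: returns true and sets value when both indexes are
    // in range; returns false (with value set to 0) otherwise.
    public static bool TryGetValue(List<List<int>> rows, int lineIndex, int columnIndex, out int value)
    {
        value = 0;

        if (lineIndex < 0 || lineIndex >= rows.Count)
        {
            return false;
        }

        List<int> row = rows[lineIndex];
        if (columnIndex < 0 || columnIndex >= row.Count)
        {
            return false;
        }

        value = row[columnIndex];
        return true;
    }
}
```

A caller would then write something like if (SafeIndexing.TryGetValue(elements, lineIndex, columnIndex, out int value)) { ... } and handle the false case however suits the program.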

One other thing: you should probably put a breakpoint on your Convert.ToInt32 call and find out which string it is breaking on. If you can assume that the input data will be consistent, then finding exactly which string breaks the conversion will help you write a routine that handles the particular characters that are slipping through. My guess is that the method broke when it tried to convert the last split value on a line to an integer, because we had not removed the line endings.
