
Create sub lists from list of strings based on string starts with

First time posting, so forgive me on formatting... I have a text file that I've read in using File.ReadLines() and stored sequentially in a List. I then want to find the first instance of a string that starts with "Student". Then I want to get all of the strings in the list up to the next instance of a string that starts with "Student" and stop just before it. Copy those strings to a sub list, and then rinse and repeat until the end of the file is reached.

Example of text file:

something on line 1

something on line 2

Student ...: Joe Smith

Id...: 12345

Major...: Math

unknown number of more lines

Student ...: Jane Smith

Id...: 54321

Major...: Nursing

more lines

Student ...: John Doe

Id...: 11223

Major: Anatomy

even more lines.

I'd like the list of lines per student to look like this:

Student 1

Student...: Joe Smith

Id...: 12345

Major...: Math

unknown number of more lines

Student 2

Student...: Jane Smith

Id...: 54321

Major...: Nursing

more lines

I've used a foreach to iterate the lines. Each line is added to a new list. When I find a string that starts with "Student", I create a new student object and store the lines from the sub list in it. Then I clear the sub list and continue with the foreach, creating new student objects.

Current issues: I miss the last student. I know I could extend the if statement that checks whether the current line starts with "Student" so that it also checks whether the current line is the last line in the list, but I feel there has to be a better/faster way to do this.
I have to add the && lines.Count > 3 because there are a few lines before the first instance of "Student" which I want to skip.

Linq examples would be greatly appreciated.

List<Student> students = new List<Student>();
List<string> lines = File.ReadLines(args[0]).ToList();  
List<string> student_lines = new List<string>();
foreach(string line in lines) 
{ 
    if(line.StartsWith("Student...", StringComparison.OrdinalIgnoreCase) && lines.Count > 3) 
    {
        students.Add(new Student(student_lines)); 
        student_lines.Clear(); 
    } 
    student_lines.Add(line);
}

Try this:

var results =
    lines
        .Aggregate(new[] { new List<string>() }.ToList(), (a, x) =>
        {
            if (x.StartsWith("Student"))
            {
                a.Add(new List<string>());
            }
            a.Last().Add(x);
            return a;
        })
        .Skip(1)
        .Select(x => new Student(x))
        .ToList();

From your sample data I get this:

[image: results]
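A self-contained version of the query above might look like the sketch below. `StudentGrouping`, `FromLines`, and this minimal `Student` class are hypothetical names introduced for illustration; the question does not show its own `Student` type, so this one is assumed to simply copy the lines it receives.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Student type: it copies the lines it is given.
public class Student
{
    public List<string> Lines { get; }
    public Student(IEnumerable<string> lines) => Lines = lines.ToList();
}

public static class StudentGrouping
{
    // Starts a new bucket at every line that begins with "Student".
    // The seed bucket collects anything before the first student and is skipped.
    public static List<Student> FromLines(IEnumerable<string> lines) =>
        lines
            .Aggregate(new List<List<string>> { new List<string>() }, (groups, line) =>
            {
                if (line.StartsWith("Student", StringComparison.OrdinalIgnoreCase))
                    groups.Add(new List<string>());
                groups[groups.Count - 1].Add(line);
                return groups;
            })
            .Skip(1)
            .Select(g => new Student(g))
            .ToList();
}
```

Calling `StudentGrouping.FromLines(File.ReadLines(path))` on the question's sample file should give three students, the last one included, because each bucket is closed only by the start of the next one or by the end of the input.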

Something like this:

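if (args.Length < 1) return;                // optional check if any args

string text = File.ReadAllText(args[0]);

// Splitting on "Student" removes that word from the start of each part.
string[] parts = text.Split(new[] { "Student" }, StringSplitOptions.None);

// RemoveEmptyEntries drops the empty entries produced by "\r\n" pairs and blank lines.
string[][] lines = Array.ConvertAll(parts, part => part.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries));

Debug.Print(lines[1][1]);        // "Id...: 12345"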
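A plain split on "Student" drops the marker itself and would also split on the word if it appeared in the middle of a line. As a variation (not the answer's own method), the sketch below uses `Regex.Split` with a zero-width lookahead anchored at line starts, so each block keeps its "Student" line. `StudentSplitter` and `SplitKeepingMarker` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StudentSplitter
{
    // Splits the text at every line start that is followed by "Student".
    // The lookahead (?=...) consumes nothing, so the marker stays in each block.
    public static string[][] SplitKeepingMarker(string text)
    {
        string[] parts = Regex.Split(text, @"(?=^Student)", RegexOptions.Multiline);

        return parts
            // Drop the preamble (or an empty leading entry) that is not a student block.
            .Where(p => p.StartsWith("Student", StringComparison.Ordinal))
            // Break each block into its non-empty lines.
            .Select(p => p.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
            .ToArray();
    }
}
```

With this shape, `SplitKeepingMarker(File.ReadAllText(path))[0][0]` would be the first student's header line rather than the text that follows the word "Student".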

It is really vague what you are trying to do, but here is a very basic example of what I think you are doing.

Now the only thing you need to do is change the fileName param.

static void Main(string[] args)
{
    string fileName = args[0]; // assumption: the file path is passed as the first argument
    string line = null;
    Student student = null;
    IList<Student> students = new List<Student>();
    using (var fileReader = new StreamReader(fileName))
    {
        while ((line = fileReader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line))
                continue; //continue execution on extra lines
            if(line.StartsWith("Student...", StringComparison.CurrentCultureIgnoreCase))
            {
                student = new Student();
                students.Add(student);
            }

            if (student != null)
                student.Lines.Add(line);


        }
        // No explicit Close() is needed: the using block disposes the reader.
    }
}

class Student
{
    public IList<string> Lines { get; } = new List<string>();
}

In a nutshell, all this little program does is read a file line by line. Note that while ((line = fileReader.ReadLine()) != null) sets line to the next line in the file, or to null once the file has been completely read.

Next it checks for empty lines (we don't care about empty lines, so we move on).

We then check whether the line starts with Student..., using StringComparison.CurrentCultureIgnoreCase so that the comparison is case-insensitive.

If this is a Student line, we create a new student and Add() it to our students collection.

Finally, as long as student is not null, we add the contents of line to the student's Lines property.
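The same steps can be folded into a small reusable helper. The sketch below is one possible shape, assuming groups of plain lists rather than Student objects; `SequenceExtensions` and `SplitBefore` are hypothetical names, not part of LINQ. The `yield return` after the loop is what keeps the last group from being lost.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Yields one list per group; a new group starts at each item for which
    // startsGroup returns true. Items before the first such item are dropped.
    public static IEnumerable<List<T>> SplitBefore<T>(
        this IEnumerable<T> source, Func<T, bool> startsGroup)
    {
        List<T> current = null;
        foreach (T item in source)
        {
            if (startsGroup(item))
            {
                if (current != null)
                    yield return current;   // the previous group is complete
                current = new List<T>();
            }
            current?.Add(item);             // skipped while no group has started
        }
        if (current != null)
            yield return current;           // flush the last group after the loop
    }
}
```

A call such as `File.ReadLines(path).SplitBefore(l => l.StartsWith("Student", StringComparison.OrdinalIgnoreCase))` would then produce one list of lines per student, which can be passed to whatever constructor the Student type offers.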

