
Splitting a string by a string and inserting into a list C#

I'm using C# and Visual Studio. I am reading a file of students and their information. The number of students varies, but I want to grab their information. For now I just want to segment each student's information on the string "Student ID", because each student's section starts with Student ID. I read the file with ReadAllText into a string and then pass that string to my function splittingStrings. The file will look like this:

student ID 1
//bunch of info

student ID 2 
//bunch of info

student ID 3 
//bunch of info
.
.
.

I want to split each segment into a list, since the number of students is unknown and the information for each student will vary. So I looked into both regular string splitting and Regex string splitting. For regular strings I tried this:

        public static List<string> StartParse = new List<string>(); 

        public static void splittingStrings(string v)
        {
            string[] DiagDelimiters = new string[] {"Student ID "};

            StartParse.Add(v.Split(DiagDelimiters, StringSplitOptions.None);   
        }

And this is what I tried with Regex:

StartParse.Add(Regex.Split("Student ID ");

I haven't used Lists before, but from what I've read they are dynamic and easy to use. The only trouble I'm having is that all the examples I see with split use an array, so syntactically I'm not sure how to split a string and insert the result into a list. For output, my goal is to have the student segments divided so that I can call a particular segment later if I need to.


Let me clarify that I'm after that batch of information, not the IDs alone. A lot of the questions seem to be focused on that, so I felt I needed to clarify.

To those suggesting other storage structures:

Example of what the list will hold:

position 0 will hold [<id> //bunch of info] 
position 1 will hold [<anotherID> //bunch of info]
.
.
.

So I'm just using the List to run multiple operations on the information I need. The information will be FAR more manageable if I can segment it into the list as shown above. I'm aware of dictionaries, but I have to store this information either in SQL tables or in text files, depending on the contents of the segments. For example, if one segment is really funky, I would send an error report saying that one student's information is bad. Otherwise I would insert the necessary information into a SQL table. I have to work with multiple things from each segment, so I felt the List was the best way to go, since I'll also have to go back and forth within a segment to cross-check bits of information against things I found earlier in that segment.

There is no need to use RegEx here and I would recommend against it. Simply splitting on white space will do the trick. Let's pretend you have a list which contains each of those lines (student ID 1, student ID 2, etc.); you can get a list of the IDs very simply, like so:

  List<string> ids = students.Select(x => x.Split(' ')[2]).ToList();

The statement above essentially says: for each string in students, split the string and return the third token (index 2, because it's 0-indexed). I then call ToList because Select by default returns an IEnumerable<T>, but I wouldn't worry about those details just yet. If you don't have a list with each of the lines you showed, the idea stays much the same, only you would add the items to your ids list one by one as you split the string. For a given string in the form student ID x, I would get x on its own with myString.Split(' ')[2], which is the basis of the expression I pass into Select.
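For the case where you only have the raw text and no list of header lines, the "one by one" variant could look roughly like the sketch below. The class name IdCollector, the method name CollectIds and the sample text are made up for illustration; it assumes every header line has the form student ID x.

```csharp
using System;
using System.Collections.Generic;

public static class IdCollector
{
    // Walks the text line by line and collects the third token of every
    // line that starts with "student ID " (hypothetical helper, not from the question).
    public static List<string> CollectIds(string text)
    {
        var ids = new List<string>();
        foreach (string line in text.Split('\n'))
        {
            // Trim removes a trailing '\r' or stray spaces around the header.
            string trimmed = line.Trim();
            if (!trimmed.StartsWith("student ID ", StringComparison.OrdinalIgnoreCase))
                continue;

            string[] tokens = trimmed.Split(' ');
            if (tokens.Length >= 3)
                ids.Add(tokens[2]);
        }
        return ids;
    }

    public static void Main()
    {
        // Made-up sample in the layout described in the question.
        string text = "student ID 1\ninfo A\n\nstudent ID 2 \ninfo B\n";
        foreach (string id in CollectIds(text))
            Console.WriteLine(id);
    }
}
```

The Length check just skips a malformed header instead of throwing an IndexOutOfRangeException.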

Based on the OP's comment, here is a way to get all of the data without the Student Id part of each batch.

string[] batches = input.Split(new string[] { "student ID " }, StringSplitOptions.RemoveEmptyEntries);

If you really need a list, you can just call ToList() on the result and change the type of batches to List<string>, but that would probably just be a waste of CPU cycles.
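If you do want the result in a List<string>, a minimal sketch could look like this. StudentSplitter, SplitSegments and the sample text are names I made up for illustration. Note that this overload of Split compares case-sensitively, so the delimiter has to match the casing in the file ("student ID " in the sample, not "Student ID ").

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StudentSplitter
{
    // Splits the raw text on a literal delimiter and returns the non-empty
    // segments as a list. The delimiter itself is removed from each segment.
    public static List<string> SplitSegments(string text, string delimiter)
    {
        return text
            .Split(new[] { delimiter }, StringSplitOptions.RemoveEmptyEntries)
            .Select(segment => segment.Trim())
            .Where(segment => segment.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample in the layout described in the question.
        string text = "student ID 1\ninfo A\n\nstudent ID 2\ninfo B\n";
        List<string> segments = SplitSegments(text, "student ID ");

        Console.WriteLine(segments.Count); // 2
        Console.WriteLine(segments[1]);    // the segment for ID 2, starting with "2"
    }
}
```

With an existing list, such as StartParse from the question, AddRange takes the array returned by Split directly, whereas Add expects a single string.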

Here's some pseudo-code, and what I'd do:

static List<int> ids = new List<int>();

static void ParseStudentId(string str) {
  var spl = str.Split(' ');
  ids.Add(int.Parse(spl[spl.Length - 1])); // this will fetch 1 from "Student Id 1"
}

static void Main() {
  ParseStudentId("Student Id 1");
  ParseStudentId("Student Id 2");
  ParseStudentId("Student Id 3");

  foreach (int id in ids)
    Console.WriteLine(id); // will result in:
                           // 1
                           // 2
                           // 3
}

Forgive me, I'm a Java programmer, so I'm mixing Pascal with camel casing :)

Try this one:

StartParse = new List<string>(Regex.Split(v, @"(?<!^)(?=student ID \d+)"));

The pattern (?<!^)(?=student ID \d+) splits the string at every position that is followed by student ID and a number, except at the very beginning of the string. Because the lookahead consumes nothing, each segment keeps its student ID header.
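A small self-contained sketch of how that split behaves, with a made-up two-student sample. LookaheadSplit and SplitKeepingHeaders are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaheadSplit
{
    // Splits in front of every "student ID <number>" except at position 0.
    // The match is zero-width, so no text is removed by the split.
    public static List<string> SplitKeepingHeaders(string text)
    {
        return new List<string>(Regex.Split(text, @"(?<!^)(?=student ID \d+)"));
    }

    public static void Main()
    {
        // Made-up sample in the layout described in the question.
        string text = "student ID 1\ninfo A\nstudent ID 2\ninfo B\n";
        List<string> segments = SplitKeepingHeaders(text);

        Console.WriteLine(segments.Count); // 2
        foreach (string segment in segments)
            Console.WriteLine("[" + segment.Trim() + "]");
    }
}
```

The pattern is case-sensitive as written; passing RegexOptions.IgnoreCase as a third argument to Regex.Split would also match headers spelled Student ID.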

Check this code:

    // Requires: using System.Collections.Generic; using System.IO; using System.Text;
    public List<string> GetStudents(string filename)
    {
        List<string> students = new List<string>();
        StringBuilder builder = new StringBuilder();
        using (StreamReader reader = new StreamReader(filename))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // A new "student ID" header closes the segment collected so far.
                if (line.StartsWith("student ID") && builder.Length > 0)
                {
                    students.Add(builder.ToString());
                    builder.Clear();
                }

                // AppendLine keeps the line breaks that ReadLine strips.
                builder.AppendLine(line);
            }

            // Add the last segment, which has no header after it.
            if (builder.Length > 0)
                students.Add(builder.ToString());
        }

        return students;
    }
