
Splitting a string by a string and inserting into a list C#

So I'm using C# and Visual Studio. I am reading a file of students and their information. The number of students is variable, but I want to grab their information. At the moment I just want to segment the students' information based on the string "Student ID", because each student's section starts with Student ID. I'm using ReadAllText to read the file into a string and then feeding that string to my function splittingStrings. The file will look like this:

student ID 1
//bunch of info

student ID 2 
//bunch of info

student ID 3 
//bunch of info
.
.
.

I want to split the file into segments and store them in a list, since the number of students will be unknown and the information for each student will vary. So I looked into both the regular string Split and Regex.Split. With the regular string Split I tried this:

        public static List<string> StartParse = new List<string>(); 

        public static void splittingStrings(string v)
        {
            string[] DiagDelimiters = new string[] {"Student ID "};

            StartParse.Add(v.Split(DiagDelimiters, StringSplitOptions.None);   
        }

And this is what I tried with Regex:

StartParse.Add(Regex.Split("Student ID ");

I haven't used Lists before, but from what I've read they are dynamic and easy to use. The only trouble I'm having is that all the examples I see with Split store the result in an array, so syntactically I'm not sure how to split a string and insert the pieces into a list. For output, my goal is to have the student segments divided so that I can access a particular segment later if I need to.


Let me clarify that I'm after the whole batch of information, not the IDs alone. A lot of the answers seem to be focused on the IDs, so I felt I needed to clarify that.

To those suggesting other storage bodies:

example of what list will hold:

position 0 will hold [<id> //bunch of info] 
position 1 will hold [<anotherID> //bunch of info]
.
.
.

So I'm just using the List so I can run multiple operations on the information I need. The information will be FAR more manageable if I can segment it into the list as shown above. I'm aware of dictionaries, but I have to store this information either in SQL tables or in text files, depending on the contents of the segments. For example, if one segment was really funky, I would send an error report saying that one student's information is bad. Otherwise I would insert the necessary information into a SQL table. But I have to work with multiple things from each segment, so I felt the List was the best way to go, since I'll also have to go back and forth within a segment to cross-check bits of information against earlier things I found in that segment.

There is no need to use RegEx here and I would recommend against it. Simply splitting on white space will do the trick. Let's pretend you have a list which contains each of those lines (student ID 1, student ID 2, etc.). You can get a list of the IDs very simply like so:

  List<string> ids = students.Select(x => x.Split(' ')[2]).ToList();

The statement above essentially says: for each string in students, split the string and return the third token (index 2, because it's 0-indexed). I then call ToList because Select by default returns an IEnumerable<T>, but I wouldn't worry about those details just yet. If you don't have a list with each of the lines you showed, the idea stays much the same, only you would add the items to your ids list one by one as you split the string. For a given string in the form student id x, I would get x on its own with myString.Split(' ')[2]; that is the basis of the expression I pass into Select.
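As a quick illustration of the Select and Split combination described above, here is a minimal sketch. The students list, the StudentIdExample class, and the ExtractIds helper are made up for the example. It also assumes every header line has the form student ID x, and it drops empty tokens so a trailing space does no harm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StudentIdExample
{
    // Hypothetical helper: returns the third space-separated token of each header line.
    public static List<string> ExtractIds(IEnumerable<string> headerLines)
    {
        return headerLines
            .Select(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)[2])
            .ToList();
    }

    public static void Main()
    {
        // Assumed input: one header line per student, as in the question's sample.
        var students = new List<string> { "student ID 1", "student ID 2 ", "student ID 3 " };

        foreach (string id in ExtractIds(students))
            Console.WriteLine(id); // prints 1, 2 and 3 on separate lines
    }
}
```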

Based on the OP's comment here is a way to get all of the data without the Student Id part of each batch.

string[] batches = input.Split(new string[] { "student ID " }, StringSplitOptions.RemoveEmptyEntries);

If you really need a list, you can call ToList() on the result (this needs using System.Linq) and change the type of batches to List<string>, but that would probably just be a waste of CPU cycles.
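To address the original question of getting the Split result into a List<string>, one option is AddRange, which takes the whole array at once (Add only takes a single string). The sketch below assumes the sample layout from the question; the SplitIntoListExample class and SplitSegments helper are hypothetical names. Note that Split matches the delimiter case-sensitively and removes it from the output, so each segment starts with the ID number rather than with student ID.

```csharp
using System;
using System.Collections.Generic;

public static class SplitIntoListExample
{
    // Hypothetical helper: splits the text on "student ID " and collects the pieces in a list.
    public static List<string> SplitSegments(string text)
    {
        string[] delimiters = { "student ID " };
        var segments = new List<string>();

        // AddRange accepts the whole string[]; Add would only accept one string.
        segments.AddRange(text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries));
        return segments;
    }

    public static void Main()
    {
        // Assumed sample input in the shape shown in the question.
        string text = "student ID 1\ninfo A\n\nstudent ID 2\ninfo B\n";
        List<string> segments = SplitSegments(text);

        Console.WriteLine(segments.Count); // number of student segments found
        Console.WriteLine(segments[1]);    // second segment, starting with its ID number
    }
}
```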

Here's a rough sketch of what I'd do:

using System;
using System.Collections.Generic;

class Program
{
    static List<int> ids = new List<int>();

    static void ParseStudentId(string str)
    {
        string[] spl = str.Split(' ');
        ids.Add(int.Parse(spl[spl.Length - 1])); // this will fetch 1 from "Student Id 1"
    }

    static void Main()
    {
        ParseStudentId("Student Id 1");
        ParseStudentId("Student Id 2");
        ParseStudentId("Student Id 3");

        foreach (int id in ids)
            Console.WriteLine(id); // prints 1, 2 and 3 on separate lines
    }
}

Forgive me if the naming looks off. I'm a Java programmer, so I tend to mix Pascal and camel casing :)

Try this one:

StartParse = new List<string>(Regex.Split(v, @"(?<!^)(?=student ID \d+)"));

The pattern (?<!^)(?=student ID \d+) splits the string at every position that is followed by student ID and a number, except at the very beginning of the string.
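A small sketch of what this split produces, assuming the sample layout from the question (the RegexSplitExample class and SplitKeepingHeaders helper are hypothetical names). Because the pattern only uses lookarounds, it matches a position rather than any characters, so each segment should keep its student ID header.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexSplitExample
{
    // Hypothetical helper: splits before each "student ID <number>" header without consuming it.
    public static List<string> SplitKeepingHeaders(string text)
    {
        return new List<string>(Regex.Split(text, @"(?<!^)(?=student ID \d+)"));
    }

    public static void Main()
    {
        // Assumed sample input in the shape shown in the question.
        string text = "student ID 1\ninfo A\nstudent ID 2\ninfo B\n";

        foreach (string segment in SplitKeepingHeaders(text))
            Console.WriteLine("[" + segment.Trim() + "]");
    }
}
```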

Check this code

    // Needs: using System.Collections.Generic; using System.IO; using System.Text;
    public List<string> GetStudents(string filename)
    {
        List<string> students = new List<string>();
        StringBuilder builder = new StringBuilder();
        using (StreamReader reader = new StreamReader(filename))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // A new header closes the segment collected so far.
                if (line.StartsWith("student ID") && builder.Length > 0)
                {
                    students.Add(builder.ToString());
                    builder.Clear();
                }

                // AppendLine keeps the line break that ReadLine strips.
                builder.AppendLine(line);
            }

            // Add the last segment, which has no header after it.
            if (builder.Length > 0)
                students.Add(builder.ToString());
        }

        return students;
    }
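
The same grouping idea can be tried without a file on disk by working on any sequence of lines. This is only a sketch under the assumption that every segment begins with a line starting with student ID; the LineGroupingExample class and GroupByHeader helper are hypothetical names, and the sketch uses AppendLine so the line breaks inside each segment are kept.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LineGroupingExample
{
    // Hypothetical helper: starts a new segment whenever a line begins with "student ID".
    public static List<string> GroupByHeader(IEnumerable<string> lines)
    {
        var students = new List<string>();
        var current = new StringBuilder();

        foreach (string line in lines)
        {
            if (line.StartsWith("student ID", StringComparison.Ordinal) && current.Length > 0)
            {
                students.Add(current.ToString());
                current.Clear();
            }
            current.AppendLine(line);
        }

        // Flush the final segment, which has no header after it.
        if (current.Length > 0)
            students.Add(current.ToString());

        return students;
    }

    public static void Main()
    {
        // Assumed sample lines in the shape shown in the question.
        string[] lines = { "student ID 1", "info A", "", "student ID 2", "info B" };
        List<string> students = GroupByHeader(lines);

        Console.WriteLine(students.Count); // number of student segments found
        Console.Write(students[0]);        // first segment, header included
    }
}
```

For a real file, the lines could come from File.ReadLines(path) in System.IO instead of the in-memory array.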


 