How to split a string containing multiple delimiters on every line of a text file?

This is the input my file contains:

50|Hallogen|Mercury|M:4;C:40;A:1
90|Oxygen|Mars|M:10;C:20;A:00
5|Hydrogen|Saturn|M:33;C:00;A:3

Now I want to split every line of my text file and store the values in my class, like this:

Expected output:

Planets[0]:
{
    Number: 50
    name: Hallogen
    object: Mercury
    proportion[0]:
    {
        Number: 4
    },
    proportion[1]:
    {
        Number: 40
    },
    proportion[2]:
    {
        Number: 1
    }
}

etc.

My classes to store all these values:

public class Planets
{
    public int Number { get; set; }    // First cell of every row: 50, 90, 5
    public string name { get; set; }   // Second cell of every row: Hallogen, Oxygen, Hydrogen
    public string object { get; set; } // Third cell of every row: Mercury, Mars, Saturn
    public List<proportion> proportion { get; set; } // All proportions for this planet object.
    // For Hallogen it stores 4, 40, 1. Only the numbers are stored; the M, C, A initials are ignored.
    // For Oxygen it stores 10, 20, 00. Only the numbers are stored; the M, C, A initials are ignored.
}

public class proportion
{
    public int Number { get; set; }
}

This is what I have tried:

List<Planets> Planets = new List<Planets>();
using (StreamReader sr = new StreamReader(args[0]))
{
    String line;
    while ((line = sr.ReadLine()) != null)
    {
        string[] parts = Regex.Split(line, @"(?<=[|;-])");
        foreach (var item in parts)
        {
            var Obj = new Planets(); // I don't know how to store the values here; parts does not contain the output I expect
        }

        Console.WriteLine(line);
    }
}

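As I understand it, the different delimiters express a nested structure: pipes separate the top-level fields, semicolons separate the proportion entries inside the last field, and colons separate each entry's initial from its number.

So you need to split each line on the pipe first, then split the last field on the semicolon, and finally split each entry on the colon.

The order of splitting is important here. Splitting on all 3 delimiters at once only gives you one flat list of tokens, without telling you which level each token came from.

Try the following code on this kind of data: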

var values = new List<string>
{
    "50|Hallogen|Mercury|M:4;C:40;A:1",
    "90|Oxygen|Mars|M:10;C:20;A:00",
    "5|Hydrogen|Saturn|M:33;C:00;A:3"
};
foreach (var value in values)
{
    // Level 1: split the line into its four pipe-separated fields.
    var pipeSplit = value.Split('|');
    var firstNumber = pipeSplit[0];
    var name = pipeSplit[1];
    var objectName = pipeSplit[2];
    // Level 2: split only the last field into "initial:number" entries.
    var semicolonSplit = pipeSplit[3].Split(';');
    // Level 3: split each entry on the colon and keep the number.
    var secondNumber = semicolonSplit[0].Split(':')[1];
    var thirdNumber = semicolonSplit[1].Split(':')[1];
    var lastNumber = semicolonSplit[2].Split(':')[1];
}
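
The three levels of splitting described above can also be carried straight into typed objects. The following is only a minimal sketch: the names `Planet`, `Proportion` and `PlanetParser.ParseLine` are hypothetical (chosen here so that every identifier is valid C#), and it assumes every line has exactly four pipe-separated fields with numeric values where numbers are expected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; they mirror the question's classes
// but use names that are valid C# identifiers.
public class Proportion
{
    public int Number { get; set; }
}

public class Planet
{
    public int Number { get; set; }
    public string Name { get; set; }
    public string ObjectName { get; set; }
    public List<Proportion> Proportions { get; set; }
}

public static class PlanetParser
{
    // Parses one line such as "50|Hallogen|Mercury|M:4;C:40;A:1".
    public static Planet ParseLine(string line)
    {
        // Level 1: the pipe separates the four top-level fields.
        string[] fields = line.Split('|');
        if (fields.Length != 4)
            throw new FormatException("Expected 4 pipe-separated fields: " + line);

        return new Planet
        {
            Number = int.Parse(fields[0]),
            Name = fields[1],
            ObjectName = fields[2],
            // Level 2: the semicolon separates the "initial:number" entries.
            // Level 3: the colon separates the initial (ignored) from the number.
            Proportions = fields[3]
                .Split(';')
                .Select(entry => new Proportion { Number = int.Parse(entry.Split(':')[1]) })
                .ToList()
        };
    }
}
```

Calling `PlanetParser.ParseLine` once per line read from the file would fill a `List<Planet>`. Note that `int.Parse` throws on malformed numbers, so this sketch is only suitable for input that is known to be well formed.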


If I understand correctly, your input is well formed. In this case you could use something like this:

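string[] parts = Regex.Split(line, @"[|;]");
var planet = new Planets(parts);


...

public Planets(string[] parts) {
    // int.TryParse needs an out variable; a property cannot be passed as out.
    int number;
    int.TryParse(parts[0], out number);
    this.Number = number;
    this.name = parts[1];
    // "object" is a C# keyword, so the property has to be declared and used as @object (or renamed).
    this.@object = parts[2];
    this.proportion = new List<proportion>();
    // Verbatim string so that \d reaches the regex engine unchanged.
    Regex propRegex = new Regex(@"\d+");
    for (int i = 3; i < parts.Length; i++) {
        // parts[i] looks like "M:4"; the regex picks out the number.
        Match propMatch = propRegex.Match(parts[i]);
        if (propMatch.Success) {
            this.proportion.Add(new proportion { Number = int.Parse(propMatch.Value) });
        }
    }
}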
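
To see why the loop in the constructor starts at index 3, it helps to look at what the split returns. This is a small sketch with a hypothetical helper `SplitDemo.SplitLine`; it splits on pipes and semicolons only (the sample data contains no `-`), so each proportion stays together as one `initial:number` token.

```csharp
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits a line on pipes and semicolons only,
    // so each proportion stays together as an "initial:number" token.
    public static string[] SplitLine(string line)
    {
        return Regex.Split(line, @"[|;]");
    }
}
```

For the first sample line, `SplitDemo.SplitLine("50|Hallogen|Mercury|M:4;C:40;A:1")` returns six tokens: `50`, `Hallogen`, `Mercury`, `M:4`, `C:40` and `A:1`. Indexes 0 to 2 are the fixed fields, and everything from index 3 onwards is a proportion.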

Without changing any of the logic in your "Planets" class, my quick solution to your problem would look like this:

List<Planets> Planets = new List<Planets>();
using (StreamReader sr = new StreamReader(args[0]))
{
    String line;
    while ((line = sr.ReadLine()) != null)
    {
        Planets planet = new Planets();
        String[] parts = line.Split('|');
        planet.Number = Convert.ToInt32(parts[0]);
        planet.name = parts[1];
        planet.obj = parts[2];

        String[] smallerParts = parts[3].Split(';');
        planet.proportion = new List<proportion>();
        foreach (var item in smallerParts)
        {
            proportion prop = new proportion();
            prop.Number = Convert.ToInt32(item.Split(':')[1]);
            planet.proportion.Add(prop);
        }
        Planets.Add(planet);
    }
}
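
`Convert.ToInt32` throws if a value is not a valid number, which is fine for well-formed files. If the file may contain broken lines, a more tolerant variant could look like the sketch below. The helper name `SafeProportionParser.TryParseProportions` is hypothetical, and it returns plain integers rather than `proportion` objects to keep the example short.

```csharp
using System.Collections.Generic;

public static class SafeProportionParser
{
    // Parses a field such as "M:4;C:40;A:1" into the numbers 4, 40, 1.
    // Returns false instead of throwing when any entry is malformed.
    public static bool TryParseProportions(string field, out List<int> numbers)
    {
        numbers = new List<int>();
        foreach (string entry in field.Split(';'))
        {
            string[] keyValue = entry.Split(':');
            int number;
            // Each entry must be exactly "initial:number".
            if (keyValue.Length != 2 || !int.TryParse(keyValue[1], out number))
                return false;
            numbers.Add(number);
        }
        return true;
    }
}
```

Inside the read loop, a line whose last field fails this check could then be skipped or logged instead of aborting the whole import.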

Oh, before I forget: you should not name the property of your Planets class "object", because "object" is a C# keyword (the alias for the base class of everything). Use something like "obj", "myObject" or "planetObject" instead, just not "object"; your compiler will tell you the same ;)
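
For completeness, C# offers two ways around the keyword clash: escape the identifier with `@`, or rename the property. The two classes below are a hypothetical sketch to show both options side by side; they are not taken from the question.

```csharp
public class PlanetWithEscapedName
{
    // The @ prefix lets a keyword be used as an identifier.
    // The property's actual name is still "object"; the @ is not part of it.
    public string @object { get; set; }
}

public class PlanetWithRenamedProperty
{
    // Renaming avoids the escape at every use site.
    public string ObjectName { get; set; }
}
```

With the escaped version every access has to be written as `planet.@object`, which is why renaming is usually the more readable choice.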

The most straightforward solution is to use a regex in which every (sub)field is matched by its own group:

var subjectString = @"50|Hallogen|Mercury|M:4;C:40;A:1
90|Oxygen|Mars|M:10;C:20;A:00
5|Hydrogen|Saturn|M:33;C:00;A:3";

Regex regexObj = new Regex(@"^(.*?)\|(.*?)\|(.*?)\|M:(.*?);C:(.*?);A:(.*?)$", RegexOptions.Multiline);
Match match = regexObj.Match(subjectString);
while (match.Success) {
    // Groups[1] to Groups[6] hold the six captured (sub)fields of one line.
    // Dump() is a LINQPad extension method; outside LINQPad use Console.WriteLine instead.
    match.Groups[1].Value.Dump();
    match.Groups[2].Value.Dump();
    match.Groups[3].Value.Dump();
    match.Groups[4].Value.Dump();
    match.Groups[5].Value.Dump();
    match.Groups[6].Value.Dump();

    // Move on to the next line that matches the pattern.
    match = match.NextMatch();
}
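
The same idea can be written with named groups, which makes it easier to see which capture feeds which field. This is a sketch under assumptions: `ParsedLine` and `RegexPlanetReader.Read` are hypothetical names, the pattern is stricter than the one above (it requires digits where numbers are expected), and `\r?` is added so that text with Windows line endings also matches.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed line (not part of the original post).
public class ParsedLine
{
    public int Number { get; set; }
    public string Name { get; set; }
    public string ObjectName { get; set; }
    public int[] Proportions { get; set; }
}

public static class RegexPlanetReader
{
    // With RegexOptions.Multiline, ^ and $ match at every line break (\n).
    // \r?$ tolerates Windows line endings, because $ only recognises \n.
    private static readonly Regex LineRegex = new Regex(
        @"^(?<number>\d+)\|(?<name>[^|\r\n]*)\|(?<obj>[^|\r\n]*)\|M:(?<m>\d+);C:(?<c>\d+);A:(?<a>\d+)\r?$",
        RegexOptions.Multiline);

    // Returns one ParsedLine per line of text that matches the pattern;
    // lines that do not match are silently skipped.
    public static List<ParsedLine> Read(string text)
    {
        var result = new List<ParsedLine>();
        foreach (Match match in LineRegex.Matches(text))
        {
            result.Add(new ParsedLine
            {
                Number = int.Parse(match.Groups["number"].Value),
                Name = match.Groups["name"].Value,
                ObjectName = match.Groups["obj"].Value,
                Proportions = new[]
                {
                    int.Parse(match.Groups["m"].Value),
                    int.Parse(match.Groups["c"].Value),
                    int.Parse(match.Groups["a"].Value)
                }
            });
        }
        return result;
    }
}
```

The trade-off against the split-based answers is that the regex fixes the number and order of the `M`, `C` and `A` entries, so a line with a fourth proportion would not match at all.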

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please credit the site URL or the original address. For any questions, please contact: yoyou2525@163.com.