
What's the most efficient way to parse a potentially bad delimited string into a class?

I'm building a custom parser that should read a delimited list of data, and store the results in a class. My problem is, the program that generates the data doesn't always include all the delimiters.

For example, if the last 3 properties have no value, it will skip the last 3 delimiters.

I was using something like this until I noticed this quirk:

var data = message.Split(delimiter);

if (data.Length < 5)
    throw new Exception("Invalid message");

Id = data[0];
Property1 = data[1];
Property2 = data[2];
Property3 = data[3];
Property4 = data[4];

Of course, if the delimited string contains fewer than 5 elements, that creates a problem.

What's the best way to parse a potentially bad delimited string into a class?

I don't want to use an if statement for each property because some delimited strings contain over 50 properties.

I thought of creating an array of all the properties and running a for-each loop over the data array, but I'm not sure of the performance implications of this and would like to see if there's a better way first.

Assuming that the properties are nullable:

Property1 = data.Length > 1 ? data[1] : null;
Property2 = data.Length > 2 ? data[2] : null;
Property3 = data.Length > 3 ? data[3] : null;
Property4 = data.Length > 4 ? data[4] : null;

Instead of null you can use any default value that makes sense for the properties.


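EDIT: Alternatively, copy the data into an array of the expected length. Any trailing elements that are missing stay null:

var dataEx = new string[expectedLength];
// Copy at most expectedLength elements so that extra fields do not throw.
Array.Copy(data, dataEx, Math.Min(data.Length, expectedLength));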

Property1 = dataEx[1];
Property2 = dataEx[2];
Property3 = dataEx[3];
Property4 = dataEx[4];

How about an extension method?

public static T GetByIndexOrDefault<T>(this Array array, int index)
{
    if (array == null)
    {
        return default(T);
    }

    // index is 1-based; anything outside 1..Length falls through to the default.
    if (index >= 1 && index <= array.Length)
    {
        return (T)array.GetValue(index - 1);
    }

    return default(T);
}

Then:

string data = "foo1;foo2;foo3;foo4";

string[] splittedData = data.Split(';');

string e1 = splittedData.GetByIndexOrDefault<string>(1);    // foo1
string e2 = splittedData.GetByIndexOrDefault<string>(2);    // foo2
string e3 = splittedData.GetByIndexOrDefault<string>(3);    // foo3
string e4 = splittedData.GetByIndexOrDefault<string>(4);    // foo4
string e5 = splittedData.GetByIndexOrDefault<string>(5);    // null
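
For comparison, LINQ already ships `Enumerable.ElementAtOrDefault`, which returns the default value when the index is out of range. Note that it is zero-based, unlike the extension method above. A minimal sketch, where `FieldReader` and `Field` are hypothetical names used only for illustration:

```csharp
using System.Linq;

public static class FieldReader
{
    // Hypothetical helper: splits the message and returns the field at a
    // zero-based index, or null when the message has too few fields.
    public static string Field(string message, char delimiter, int index)
    {
        string[] parts = message.Split(delimiter);
        return parts.ElementAtOrDefault(index);
    }
}
```

With this, `FieldReader.Field("foo1;foo2", ';', 4)` yields null instead of throwing.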

Assuming your property naming scheme actually resembles your example, you could do this with reflection:

var data = message.Split(delimiter);
if (data.Length < 1) throw new Exception("Invalid message");
Id = data[0];
for (var i = 1; i < data.Length; i++)
{
    var property = GetType().GetProperty("Property" + i);
    // Skip extra fields that have no matching PropertyN.
    if (property != null)
        property.SetValue(this, data[i], null);
}

Just make sure all of your properties have an acceptable default state, in case they don't get set by message.

Creating an array of properties would work (i.e., Xander's answer), but it still doesn't solve the problem of bad data. If you have a badly delimited field in the middle of the file, a property in the middle of your array will be faulty as well.

I don't think anything is wrong with failing when you encounter a problem though. If the message is badly formatted, that data will be bad. If you don't need to create missing fields, you can always just parse the message manually and fix the badly delimited parts.

If missing fields need to be there, some applications use algorithms to try and fix the bad data. If you think the data can be fixed (either by creating new data or massaging old data), you can create an algorithm to "guess" the missing fields.
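
As a rough illustration of the array-of-properties idea combined with failing on clearly bad input, here is a sketch that uses an array of setter delegates instead of reflection. It assumes the only tolerated defect is missing trailing fields; a message with too many fields is rejected. The names `Message`, `Setters`, and `Parse` are hypothetical and not taken from the question:

```csharp
using System;

public class Message
{
    public string Id { get; set; }
    public string Property1 { get; set; }
    public string Property2 { get; set; }
    public string Property3 { get; set; }
    public string Property4 { get; set; }

    // One setter per field, in the order the fields appear in the message.
    private static readonly Action<Message, string>[] Setters =
    {
        (m, v) => m.Id = v,
        (m, v) => m.Property1 = v,
        (m, v) => m.Property2 = v,
        (m, v) => m.Property3 = v,
        (m, v) => m.Property4 = v,
    };

    public static Message Parse(string text, char delimiter)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        string[] data = text.Split(delimiter);

        // Too many fields means the message is malformed; fail rather than guess.
        if (data.Length > Setters.Length)
            throw new FormatException(
                $"Expected at most {Setters.Length} fields but found {data.Length}.");

        var result = new Message();
        // Missing trailing fields are simply left at their default (null).
        for (int i = 0; i < data.Length; i++)
            Setters[i](result, data[i]);
        return result;
    }
}
```

The loop does one delegate call per field, so even with 50 or more properties the cost should be small compared to the `Split` call itself.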

I would consider making a lookup table of your property names, mapping the expected index of the property to its property name. Then set the properties by reflection.

        string[] propertyLookup = { "Property1", "Property2", "Property3", "Property4", "Property5" };  // etc.
        string[] parsedValues = message.Split(delimiter);
        Foo newFoo = new Foo();
        Type fooType = newFoo.GetType();
        // Stop at whichever runs out first: the parsed values or the known properties.
        int count = Math.Min(parsedValues.Length, propertyLookup.Length);
        for (int i = 0; i < count; i++)
        {
            PropertyInfo prop = fooType.GetProperty(propertyLookup[i]);
            prop.SetValue(newFoo, parsedValues[i], null);
        }

using System;
using System.Windows.Forms;
using System.Reflection;
namespace DynamicProp
{
    public partial class Form1 : Form
    {
        class Messagage 
        {
            public string ID { get; set; }
            public string Property1 { get; set; }
            public string Property2 { get; set; }
            public string Property3 { get; set; }
            public string Property4 { get; set; }
        }

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            string[] data = { "hasan", "osman", "ali", "veli", "deli" };

            Messagage message = new Messagage();
            // Note: the order returned by GetProperties() is not guaranteed.
            PropertyInfo[] ozellikler = message.GetType().GetProperties();
            int I = 0;
            foreach (PropertyInfo ozellik in ozellikler)
            {
                if (I >= data.Length) break;   // fewer values than properties
                ozellik.SetValue(message, data[I], null);
                listBox1.Items.Add("property: " + ozellik.Name + "  value: " + ozellik.GetValue(message, null));
                I++;
            }
        }
    }
}

To provide yet another way...

If you have a default value, such as an empty string, you can create a List and use AddRange to add the values from the data string. Then, if the maximum number of fields for that particular data have not been used, use AddRange and Enumerable.Repeat to fill in the remaining values with a default.

        List<string> Results = new List<string>();
        int MaxFields = 5;
        Results.AddRange(message.Split(delimiter));
        if (Results.Count < MaxFields)
            Results.AddRange(Enumerable.Repeat(String.Empty, MaxFields - Results.Count));
        Id = Results[0];
        Property1 = Results[1];
        Property2 = Results[2];
        Property3 = Results[3];
        Property4 = Results[4];
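
The same padding idea can probably be written as a single LINQ chain. This is only a sketch: `FieldPadding` and `SplitPadded` are made-up names, and note that, unlike the list version above, it silently drops any fields beyond `maxFields`:

```csharp
using System.Linq;

public static class FieldPadding
{
    // Splits the message and returns exactly maxFields entries, padding
    // missing trailing fields with an empty string. Extra fields are dropped.
    public static string[] SplitPadded(string message, char delimiter, int maxFields)
    {
        return message.Split(delimiter)
                      .Concat(Enumerable.Repeat(string.Empty, maxFields))
                      .Take(maxFields)
                      .ToArray();
    }
}
```

The result can then be indexed from 0 to `maxFields - 1` without any length checks.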

