C#: Mapping a Data Type

I've got to write a horrible interface to import data into a new database from hundreds of data files from our old application that has everything hard coded (the data displayed resemble Excel spreadsheets, and it allows us to export the data to Comma Delimited Values).

I can read it all in, with the header names.

From that, I can generate a name for each column to be used in a SQL CE database.

The data currently consists of float, int, DateTime, bit, char and string.

I've come up with a way to do this (untested on all of our data), but any help from someone that knows how to code this better would be greatly appreciated.

The code below is not necessary to read through unless someone just doesn't understand what I'm asking.

public enum MyParameterType { NA, Float, Bool, Char, Date, Int, String }

class MyParameter {

public MyParameter(string name, string value) {
  if (String.IsNullOrEmpty(name) || String.IsNullOrEmpty(value)) {
    throw new NotSupportedException("NULL values are not allowed.");
  }
  Name = name.Trim();
  Value = value.Trim();
  Type = MyParameterType.NA;
  if (-1 < Value.IndexOf('.')) { // try float
    float f;
    if (float.TryParse(Value, out f)) {
      string s = f.ToString();
      if (s == Value) {
        Parameter = new SqlCeParameter(AtName, SqlDbType.Float) { Value = f };
        Type = MyParameterType.Float;
      }
    }
  }
  if (Type == MyParameterType.NA) {
    bool b;
    if (bool.TryParse(Value, out b)) {
      Parameter = new SqlCeParameter(AtName, SqlDbType.Bit) { Value = b };
      Type = MyParameterType.Bool;
    }
  }
  if (Type == MyParameterType.NA) {
    if (Value.Length == 1) {
      char c = Value[0];
      Parameter = new SqlCeParameter(AtName, SqlDbType.Char) { Value = c };
      Type = MyParameterType.Char;
    }
  }
  if (Type == MyParameterType.NA) {
    DateTime date;
    if (DateTime.TryParse(Value, out date)) {
      Parameter = new SqlCeParameter(AtName, SqlDbType.DateTime) { Value = date };
      Type = MyParameterType.Date;
    }
  }
  if (Type == MyParameterType.NA) {
    if (50 < Value.Length) {
      // Substring(0, 50) keeps all 50 characters the NVarChar(50) column allows
      Value = Value.Substring(0, 50);
    }
    Parameter = new SqlCeParameter(AtName, SqlDbType.NVarChar, 50) { Value = this.Value };
    Type = MyParameterType.String;
  }
}

public string AtName { get { return "@" + Name; } }

public string Name { get; set; }

public MyParameterType Type { get; set; }

public SqlCeParameter Parameter { get; set; }

public string Value { get; set; }

}

My biggest concern is that I don't want to mistakenly interpret one of the inputs (like a boolean value to be a char).

I am also looking for a way to compare new instances of MyParameter (i.e., if a value does not fit one type, try another type).

Bonus points for seeing some cool new expressions to generate this!

I don't think the problem is in that code. Since you always receive the values as strings, some guessing is unavoidable, but I think it would work better in the data extraction step.

Maybe you can use the header columns to map each column to a type up front,

and then try

Convert.ChangeType(yourValue, typeof(double)) // or typeof(string), typeof(int), etc.
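As a minimal sketch of that idea, assuming the target type for each column is already known: the helper name ConvertCell and the class CellConverter are hypothetical, and passing CultureInfo.InvariantCulture is an assumption so that values such as "3.14" parse the same way on every machine.

```csharp
using System;
using System.Globalization;

static class CellConverter
{
    // Hypothetical helper: converts one CSV cell to the type chosen for its column.
    // Empty cells come back as null instead of throwing a FormatException.
    public static object ConvertCell(string raw, Type target)
    {
        if (String.IsNullOrWhiteSpace(raw)) return null;

        // Convert.ChangeType(object, Type, IFormatProvider) delegates to the
        // IConvertible implementation of string, i.e. to the Parse method of the target type.
        return Convert.ChangeType(raw.Trim(), target, CultureInfo.InvariantCulture);
    }
}
```

Convert.ChangeType throws (for example FormatException) when the text does not match the type, so this only helps once the column types have been decided.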

Given some abstract CsvReader:

using (var reader = new CsvReader(file))
{
    TableGuess table = new TableGuess { Name = file };

    // given: IEnumerable<string> CsvReader.Header { get; }
    table.AddColumns(reader.Header);

    string[] parts;
    while (null != (parts = reader.ReadLine()))
    {
        table.AddRow(parts);
    }
}

Your ColumnGuess:

class ColumnGuess
{
    public string Name { get; set; }
    public Type Type { get; set; }
    public int Samples { get; private set; }

    public void ImproveType(string value)
    {
        if (this.Samples > 10) return;
        this.Samples++;

        float f; bool b; DateTime d; int i;
        // Try int before float: Single.TryParse also accepts "42", so testing
        // float first would never let a column be typed as int. A column that
        // has already been widened to float stays float.
        if (this.Type != typeof(float) && Int32.TryParse(value, out i))
        {
            this.Type = typeof(int);
        }
        else if (Single.TryParse(value, out f))
        {
            this.Type = typeof(float);
        }
        else if (Boolean.TryParse(value, out b))
        {
            this.Type = typeof(bool);
        }
        else if (DateTime.TryParse(value, out d))
        {
            this.Type = typeof(DateTime);
        }
        else if (value.Length == 1 && this.Type == null && !Char.IsDigit(value[0]))
        {
            this.Type = typeof(char);
        }
    }
}

TableGuess would contain the guessed columns and the rows:

class TableGuess
{
    private List<string[]> rows = new List<string[]>();
    private List<ColumnGuess> columns;

    public string Name { get; set; }

    public void AddColumns(IEnumerable<string> columns)
    {
        this.columns = columns.Select(cc => new ColumnGuess { Name = cc })
                              .ToList();
    }

    public void AddRow(string[] parts)
    {
        for (int ii = 0; ii < parts.Length; ++ii)
        {
            if (String.IsNullOrEmpty(parts[ii])) continue;
            columns[ii].ImproveType(parts[ii]);
        }

        this.rows.Add(parts);
    }
}

You could add to TableGuess an AsDataTable() method:

public DataTable AsDataTable()
{
    var dataTable = new DataTable(this.Name);
    foreach (var column in this.columns)
    {
        dataTable.Columns.Add(new DataColumn(
            column.Name,
            column.Type ?? typeof(string)));
    }

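    foreach (var row in this.rows)
    {
        object[] values = new object[dataTable.Columns.Count];
        for (int cc = 0; cc < row.Length; ++cc)
        {
            // Empty cells were skipped while guessing, so store them as NULL
            // instead of letting Convert.ChangeType throw on "".
            values[cc] = String.IsNullOrEmpty(row[cc])
                ? (object)DBNull.Value
                : Convert.ChangeType(row[cc], dataTable.Columns[cc].DataType);
        }

        // DataTable has no LoadRow method; LoadDataRow is the one that takes object[]
        dataTable.LoadDataRow(values, false);
    }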

    return dataTable;
}

You could use a SqlCeDataAdapter to move the data from the DataTable into the database (after creating the table itself in the database).

How about this pseudo code - I reckon this should be fast enough for you. This is very pseudo - so "string", "char" etc. are just placeholders for an enum value or whatever else you fancy.

For first data row in data file
  For each column in row
    TypeOfCol(column) = <best first guess>
  Next
Next

For each data row in data file
  For each column in row
    If TypeOfCol(column) = "string"
      Continue For
    If TypeOfCol(column) = "char"
      If columnValue has more than one character
        TypeOfCol(column) = "string"
        Continue For
    If TypeOfCol(column) = "bit"
      If columnValue isn't 1 or 0
        TypeOfCol(column) = "int" // Might not be an int - next If will pick up on that...
    If TypeOfCol(column) = "int"
      If columnValue isn't integer
        TypeOfCol(column) = "float"
    If TypeOfCol(column) = "float"
      If columnValue isn't a float
        TypeOfCol(column) = If(columnValue has more than one character then "string" else "char")
    If TypeOfCol(column) = "datetime"
      If columnValue isn't a date/time
        TypeOfCol(column) = "string"
  Next
Next
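The pseudo code above could be sketched in C# roughly as follows. The names ColType, ColumnTyper, FirstGuess, Widen and Infer are hypothetical, and parsing with the invariant culture is an assumption about the export format. One deliberate deviation: a numeric column that meets a non-numeric value always becomes String here, because earlier values may have been longer than one character.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

enum ColType { Bit, Int, Float, DateTime, Char, String }

static class ColumnTyper
{
    static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

    static bool IsInt(string v) { int i; return Int32.TryParse(v, NumberStyles.Integer, Inv, out i); }
    static bool IsFloat(string v) { double d; return Double.TryParse(v, NumberStyles.Float, Inv, out d); }
    static bool IsDate(string v) { DateTime d; return DateTime.TryParse(v, Inv, DateTimeStyles.None, out d); }

    // Best first guess for a column, tried from the narrowest type to the widest.
    public static ColType FirstGuess(string v)
    {
        if (v == "0" || v == "1") return ColType.Bit;
        if (IsInt(v)) return ColType.Int;
        if (IsFloat(v)) return ColType.Float;
        if (v.Length == 1) return ColType.Char;   // a single non-digit character
        if (IsDate(v)) return ColType.DateTime;
        return ColType.String;
    }

    // Widens the current guess just enough to admit one more value.
    public static ColType Widen(ColType current, string v)
    {
        if (current == ColType.String) return current;
        if (current == ColType.Char) return v.Length > 1 ? ColType.String : ColType.Char;
        if (current == ColType.Bit && v != "0" && v != "1") current = ColType.Int;
        if (current == ColType.Int && !IsInt(v)) current = ColType.Float;
        if (current == ColType.Float && !IsFloat(v)) current = ColType.String; // deviation, see above
        if (current == ColType.DateTime && !IsDate(v)) current = ColType.String;
        return current;
    }

    // Runs the guess over every value of one column; empty cells are ignored.
    public static ColType Infer(IEnumerable<string> values)
    {
        ColType? type = null;
        foreach (var v in values)
        {
            if (String.IsNullOrEmpty(v)) continue;
            type = type == null ? FirstGuess(v) : Widen(type.Value, v);
        }
        return type ?? ColType.String;
    }
}
```

For instance, ColumnTyper.Infer(new[] { "1", "0", "7" }) starts at Bit and widens to Int once it sees "7". Because a column only ever widens, a boolean-looking column cannot be mistaken for char unless a value actually forces it.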
