
Importing CSV data into C# classes

I know how to read and display a line of a .csv file. Now I would like to parse that file, store its contents in arrays, and use those arrays as values for some classes I created.

I'd like to learn how to do that, though.

Here is an example:

basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3

As you can see, the fields are separated by commas. I've created the Basketball and Baseball classes, which extend the Sport class (Sport.cs), which has these fields:

private string sport;
private string date;
private string team1;
private string team2;
private string score;

I understand that this is simplistic and that there are better ways of storing this information (creating classes for each team, making the date a DateTime, and so on), but I'd like to know how to get this information into the classes.

I'm assuming this has something to do with getters and setters... I've also read about dictionaries and collections, but I'd like to start simple by storing everything in arrays... (If that makes sense. Feel free to correct me.)

Here is what I have so far. All it does is read the CSV and echo its contents to the console:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace Assign01
{
    class Program
    {
        static void Main(string[] args)
        {
            string line;
            FileStream aFile = new FileStream("../../sportsResults.csv", FileMode.Open);
            StreamReader sr = new StreamReader(aFile);

            // read data in line by line
            while ((line = sr.ReadLine()) != null)
            {
                // the while condition already reads the next line,
                // so a second ReadLine() here would skip every other row
                Console.WriteLine(line);
            }
            sr.Close();
        }
    }
}

Help would be much appreciated.

For a resilient, fast, and low-effort solution, you can use CsvHelper, which handles much of the parsing code and many edge cases, and has pretty good documentation.

First, install the CsvHelper package from NuGet.

a) CSV with Headers

If your CSV has headers like this:

sport,date,team 1,team 2,score 1,score 2
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3

You can add attributes to your class to map the CSV field names to your property names (the attributes live in the CsvHelper.Configuration.Attributes namespace), like this:

public class SportStats
{
    [Name("sport")]
    public string Sport { get; set; }
    [Name("date")]
    public DateTime Date { get; set; }
    [Name("team 1")]
    public string TeamOne { get; set; }
    [Name("team 2")]
    public string TeamTwo { get; set; }
    [Name("score 1")]
    public int ScoreOne { get; set; }
    [Name("score 2")]
    public int ScoreTwo { get; set; }
}

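And then invoke it like this:

// Requires: using System.Globalization; using CsvHelper;
List<SportStats> records;

using (var reader = new StreamReader(@".\stats.csv"))
// recent CsvHelper versions require a culture argument here
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    records = csv.GetRecords<SportStats>().ToList();
}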

b) CSV without Headers

If your CSV doesn't have headers, like this:

basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3

You can add attributes to your class and map the CSV columns by position, like this:

public class SportStats
{
    [Index(0)]
    public string Sport { get; set; }
    [Index(1)]
    public DateTime Date { get; set; }
    [Index(2)]
    public string TeamOne { get; set; }
    [Index(3)]
    public string TeamTwo { get; set; }
    [Index(4)]
    public int ScoreOne { get; set; }
    [Index(5)]
    public int ScoreTwo { get; set; }
}

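And then invoke it like this:

List<SportStats> records;

// Recent CsvHelper versions take settings through a CsvConfiguration object
// (namespace CsvHelper.Configuration); older versions set
// csv.Configuration.HasHeaderRecord = false on the reader instead.
var config = new CsvConfiguration(CultureInfo.InvariantCulture) { HasHeaderRecord = false };

using (var reader = new StreamReader(@".\stats.csv"))
using (var csv = new CsvReader(reader, config))
{
    records = csv.GetRecords<SportStats>().ToList();
}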


Creating an array to hold the information is not a very good idea, because you don't know how many lines the input file will contain. What would the initial size of your array be? I would advise you to use a generic list (List<>) instead.
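To illustrate the point about not knowing the size up front, here is a minimal sketch that collects the split rows in a List<string[]>, which grows as lines arrive. The RowCollector class and ReadRows method are hypothetical names used only for this example, and the code assumes plain comma-separated fields with no quoting.

```csharp
using System.Collections.Generic;
using System.IO;

public static class RowCollector
{
    // Reads all lines from the reader and returns one string[] per line.
    // The list grows on demand, so no row count is needed in advance.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(line.Split(','));
        }
        return rows;
    }
}
```

If an array is still needed afterwards, rows.ToArray() produces one once the final count is known.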

You can also add a constructor to your Sport class that accepts an array (the result of the split operation described in the answer above).

Additionally, you can perform some conversions while assigning the fields:

public class Sport
{
    private string sport;
    private DateTime date;
    private string team1;
    private string team2;
    private string score;

    public Sport(string[] csvArray)
    {
        this.sport = csvArray[0];
        this.team1 = csvArray[2];
        this.team2 = csvArray[3];
        this.date = Convert.ToDateTime(csvArray[1]);
        this.score = String.Format("{0}-{1}", csvArray[4], csvArray[5]);
    }
}

For simplicity I used the Convert method, but keep in mind that this is not very safe unless you are sure that the date field always contains valid dates and the score fields always contain numeric values. You can try safer approaches such as TryParse or exception handling.

In all honesty, I must add that although the above solution is simple (as requested), on a conceptual level I would advise against it. Putting the mapping logic between the fields and the CSV file inside the class makes the Sport class too dependent on the file itself, and thus less reusable. Any later change to the file structure then has to be reflected in your class, which is easy to overlook. It would therefore be wiser to put your "mapping & conversion" logic in the main program and keep your class as clean as possible.

(I handled your "score" issue by formatting it as two strings joined with a hyphen.)
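The two suggestions in this answer (safer parsing with TryParse and keeping the mapping out of the class) could be combined roughly as follows. This is only a sketch: SportRecord and SportCsvMapper are hypothetical names, and it assumes six fields per row with dates written as yyyy/MM/dd, as in the question's sample data.

```csharp
using System;
using System.Globalization;

// Plain data class: it holds typed values and knows nothing about CSV.
public class SportRecord
{
    public string Sport { get; set; }
    public DateTime Date { get; set; }
    public string Team1 { get; set; }
    public string Team2 { get; set; }
    public string Score { get; set; }
}

// Hypothetical mapper: the only place that knows the column order of the file.
public static class SportCsvMapper
{
    // Returns false instead of throwing when a row is malformed.
    public static bool TryMap(string[] csvArray, out SportRecord record)
    {
        record = null;
        if (csvArray == null || csvArray.Length != 6) return false;

        DateTime date;
        int score1, score2;
        if (!DateTime.TryParseExact(csvArray[1], "yyyy/MM/dd",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date)) return false;
        if (!int.TryParse(csvArray[4], NumberStyles.Integer,
                CultureInfo.InvariantCulture, out score1)) return false;
        if (!int.TryParse(csvArray[5], NumberStyles.Integer,
                CultureInfo.InvariantCulture, out score2)) return false;

        record = new SportRecord
        {
            Sport = csvArray[0],
            Date = date,
            Team1 = csvArray[2],
            Team2 = csvArray[3],
            // Same hyphenated format as in the constructor above.
            Score = string.Format("{0}-{1}", score1, score2)
        };
        return true;
    }
}
```

If the column order of the file changes, only SportCsvMapper needs to be touched; the caller decides whether to skip, log, or reject rows for which TryMap returns false.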

Splitting the string into arrays to get the data can be error-prone and slow. Try using an OLE DB data provider to read the CSV as if it were a table in a SQL database. This way you can use a WHERE clause to filter the results.

App.Config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <connectionStrings>
    <add name="csv" providerName="System.Data.OleDb" connectionString="Provider=Microsoft.Jet.OLEDB.4.0;Data Source='C:\CsvFolder\';Extended Properties='text;HDR=Yes;FMT=Delimited';" />
  </connectionStrings>
</configuration>

program.cs:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.OleDb;
using System.Configuration;
using System.Data;
using System.Data.Common;

namespace CsvImport
{
    class Stat
    {
        public string Sport { get; set; }
        public DateTime Date { get; set; }
        public string TeamOne { get; set; }
        public string TeamTwo { get; set; }
        public int Score { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            ConnectionStringSettings csv = ConfigurationManager.ConnectionStrings["csv"];
            List<Stat> stats = new List<Stat>();

            using (OleDbConnection cn = new OleDbConnection(csv.ConnectionString))
            {
                cn.Open();
                using (OleDbCommand cmd = cn.CreateCommand())
                {
                    cmd.CommandText = "SELECT * FROM [Stats.csv]";
                    cmd.CommandType = CommandType.Text;
                    using (OleDbDataReader reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
                    {
                        int fieldSport = reader.GetOrdinal("sport");
                        int fieldDate = reader.GetOrdinal("date");
                        int fieldTeamOne = reader.GetOrdinal("teamone");
                        int fieldTeamTwo = reader.GetOrdinal("teamtwo");
                        int fieldScore = reader.GetOrdinal("score");

                        foreach (DbDataRecord record in reader)
                        {
                            stats.Add(new Stat
                            {
                                Sport = record.GetString(fieldSport),
                                Date = record.GetDateTime(fieldDate),
                                TeamOne = record.GetString(fieldTeamOne),
                                TeamTwo = record.GetString(fieldTeamTwo),
                                Score = record.GetInt32(fieldScore)
                            });
                        }
                    }
                }
            }

            foreach (Stat stat in stats)
            {
                Console.WriteLine("Sport: {0}", stat.Sport);
            }
        }
    }
}

Here's how the CSV should look:

stats.csv:

sport,date,teamone,teamtwo,score
basketball,28/01/2011,Rockets,Blazers,98
baseball,22/08/2011,Yankees,Redsox,4

While there are a lot of libraries that make CSV reading easy, all you need to do now that you have the line is split it.

// Split(',') takes a char separator, so it compiles on both .NET Framework and .NET Core
string[] csvFields = line.Split(',');

Now assign each field to the appropriate member:

sport = csvFields[0];
date = csvFields[1];
//and so on

However, this will overwrite the values each time you read a new line, so you need to pack the values into a class and save the instances of that class to a list.
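That last step might look something like the sketch below. SportResult and SportResultLoader are hypothetical names chosen for illustration (they are not the asker's Sport classes), and the code assumes every line has exactly six comma-separated fields with numeric scores and no quoted commas.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical container for one CSV row.
public class SportResult
{
    public string Sport { get; set; }
    public string Date { get; set; }
    public string Team1 { get; set; }
    public string Team2 { get; set; }
    public int Score1 { get; set; }
    public int Score2 { get; set; }
}

public static class SportResultLoader
{
    // Creates a new SportResult per line, so earlier rows are not overwritten.
    public static List<SportResult> Load(TextReader reader)
    {
        var results = new List<SportResult>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines

            string[] csvFields = line.Split(',');
            results.Add(new SportResult
            {
                Sport = csvFields[0],
                Date = csvFields[1],
                Team1 = csvFields[2],
                Team2 = csvFields[3],
                Score1 = int.Parse(csvFields[4]),
                Score2 = int.Parse(csvFields[5])
            });
        }
        return results;
    }
}
```

With the file from the question, this could be called as SportResultLoader.Load(new StreamReader("../../sportsResults.csv")), ideally inside a using block so the file is closed afterwards.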

LINQ also offers a solution for this, and you can produce the output as either a List or an array. The example below uses a class (TestModel) that serves as the definition of the data and its data types.

var modelData = File.ReadAllLines(dataFile)
                   .Skip(1)
                   .Select(x => x.Split(','))
                   .Select(dataRow => new TestModel
                   {
                       Column1 = dataRow[0],
                       Column2 = dataRow[1],
                       Column3 = dataRow[2],
                       Column4 = dataRow[3]
                   }).ToList(); // Or you can use .ToArray()

// Add a reference to Microsoft.VisualBasic.dll

using System;
using Microsoft.VisualBasic.FileIO;

class Program {
    static void Main(string[] args){
        using(var csvReader = new TextFieldParser(@"sportsResults.csv")){
            csvReader.SetDelimiters(new string[] {","});
            string [] fields;
            while(!csvReader.EndOfData){
                fields = csvReader.ReadFields();
                Console.WriteLine(String.Join(",",fields)); // replace this with code that creates an instance
            }
        }
    }
}
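One reason to reach for TextFieldParser rather than String.Split is that it understands quoted fields. The sketch below parses a single line; QuotedCsvDemo and ParseLine are hypothetical names, and the team name containing a comma is made-up sample data, not part of the question's file.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsvDemo
{
    // Parses one CSV line with TextFieldParser, which keeps a quoted
    // field together even when it contains the delimiter.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

For a line such as soccer,2011/05/01,"Athletic, Bilbao",Valencia,2,1, a plain Split(',') yields seven pieces, while ParseLine returns six fields with the quotes removed from the third.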

Below is a beginner-friendly solution that you can experiment with by trial and error. Don't forget to add System.Core.dll to your references and to import the namespace in your .cs file: using System.Linq;

Adding an iterator might make for better code:

private static IEnumerable<string> GetDataPerLines()
{
    // The using block closes the file even if the caller stops iterating early.
    using (StreamReader sr = new StreamReader("sportsResults.csv"))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            yield return line;
        }
    }
}

static void Main(string[] args)
{
    // Assumes Sport exposes public sport and date members.
    var query = from data in GetDataPerLines()
                let splitChr = data.Split(',')
                select new Sport
                {
                    sport = splitChr[0],
                    date = splitChr[1]
                    // ... and so on
                };

    foreach (var item in query)
    {
        Console.WriteLine(" Sport = {0}, in date when {1}", item.sport, item.date);
    }
}

Something like this, perhaps. The sample above creates its own iterator using yield (see the MSDN documentation for details) and builds a collection from the lines of the file.

Let me know if the code has mistakes, since I didn't have Visual Studio at hand when I wrote this answer. For your information, a one-dimensional array such as Sport[] implements IEnumerable in the CLR.
