C#: smart way to delete multiple occurrences of a character in a string

My program reads a file that has thousands of lines like the one below:

"Timestamp","LiveStandby","Total1","Total2","Total3", etc.

Each line is different. What is the best way to split on the commas, delete the double quotes, and put the values in a list?

This is what I have:

while ((line = file.ReadLine()) != null)
{
    List<string> title_list = new List<string>(line.Split(','));
}

The step above is still missing the removal of the quotes. I could use a foreach, but that kind of defeats the purpose of building the List and doing the Split in a single line. What is the best and smartest way to do it?

The best way, in my opinion, is to use a library that parses CSV, such as FileHelpers.

Concretely, in your case, this would be the solution using the FileHelpers library:

Define a class that describes the structure of a record:

[DelimitedRecord(",")]
public class MyDataRecord
{
    [FieldQuoted('"')]
    public string TimeStamp;
    [FieldQuoted('"')]
    public string LiveStandby;
    [FieldQuoted('"')]
    public string Total1;
    [FieldQuoted('"')]
    public string Total2;
    [FieldQuoted('"')]
    public string Total3;
}

Use this code to parse the entire file:

var csvEngine = new FileHelperEngine<MyDataRecord>(Encoding.UTF8)
    { 
        Options = { IgnoreFirstLines = 1, IgnoreEmptyLines = true }
    };

var parsedItems = csvEngine.ReadFile(@"D:\myfile.csv");

Please note that this code is for illustration only; I have not compiled or run it. However, the library is straightforward to use, and there are good examples and documentation on its website.

I'm going to clarify this a bit. If you have a user-formatted file with a predictable format (i.e. the user generated the data from Excel or a similar program), then you are much better off using an existing, well-tested parser.

Scenarios like the following are just a few examples of input that manual parsing will have problems with:

"column 1", 2, 0104400, $1,300, "This is an interestion question, he said"

... and there are more issues with escaping, file formats, etc. that can be a headache if you roll your own.
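To make the quoted-comma problem concrete, here is a minimal sketch of a quote-aware splitter. The class and method names (`QuotedSplitSketch`, `SplitQuoted`) are hypothetical and not part of any library, and the sketch only covers commas inside double quotes and doubled quotes as escapes. It does nothing for an unquoted value such as $1,300, and it keeps any whitespace around fields, which is part of why a tested parser is the safer choice.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitSketch
{
    // Splits one line on commas, treating commas inside double quotes as data.
    // A doubled quote ("") inside a quoted field is read as one literal quote.
    // Illustrative only: no handling of multi-line fields or other CSV dialects.
    public static List<string> SplitQuoted(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;                 // skip the second quote of the pair
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote, not kept
                }
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field on the line
        return fields;
    }
}
```

For the input "a, b",c this sketch returns two fields, whereas a plain Split(',') returns three.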

If you do use a parser, make sure you pick one that can tolerate a different number of columns per row, as that can make a difference.

If, on the other hand, you know what's going into the data, which is common with system-generated files, then using a CSV parser can cause more problems than it solves. For example, I have dealt with scenarios where the first part of each row is fixed and can be strongly typed, but the parts that follow in the row are not. This can also happen if you're parsing fixed-width flat-file data from legacy databases. A CSV solution makes assumptions we don't want and is not the right solution in many of those cases.

If that is the case and you just want to strip out quotes after splitting on commas, then try a bit of LINQ. This can also be extended to replace other specific characters you are worried about.

line.Split(',').Select(i => i.Replace("\"", "")).ToArray()
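As a rough sketch of the "extended to other characters" idea, the helper below removes any of a caller-supplied set of characters from each field after splitting. The names `LinqStripSketch` and `SplitAndStrip` are made up for illustration, and the sketch assumes no field contains a comma as data.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LinqStripSketch
{
    // Splits on commas, then drops every character listed in 'unwanted'
    // from each field. Assumes commas never appear inside a field.
    public static List<string> SplitAndStrip(string line, params char[] unwanted)
    {
        return line.Split(',')
                   .Select(field => new string(
                       field.Where(ch => !unwanted.Contains(ch)).ToArray()))
                   .ToList();
    }
}
```

Calling `LinqStripSketch.SplitAndStrip(line, '"')` matches the one-liner above but returns a List<string>; passing more characters, for example `'"', '$'`, strips those as well.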

Hope that clears up all the conflicting advice.

Keeping it simple like this should work:

List<string> strings = new List<string>();
while ((line = file.ReadLine()) != null)
    // Remove the quotes first, then split and append the values to the list.
    strings.AddRange(line.Replace("\"", "").Split(','));

You can use the Array.ConvertAll() function.

string line = "\"Timestamp\",\"LiveStandby\",\"Total1\",\"Total2\",\"Total3\"";

var list = new List<String>(Array.ConvertAll(line.Split(','), x=> x.Replace("\"","")));

Perform the Replace first, then Split into your List. Here's your code with Replace:

while ((line = file.ReadLine()) != null)   
{      
  List<string> title_list = new List<string>(line.Replace("\"", "").Split(','));    
}

However, you're going to need a variable to hold all of the lists, so look at using AddRange().
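If you would rather keep each line's values separate instead of flattening everything with AddRange(), one option is a list of lists. This is only a sketch: `ReadRowsSketch` and `ReadRows` are hypothetical names, and it takes a TextReader so it works with a StreamReader over the file or a StringReader in a quick test.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ReadRowsSketch
{
    // Reads every line, strips double quotes, splits on commas,
    // and keeps one List<string> per input line.
    public static List<List<string>> ReadRows(TextReader reader)
    {
        var rows = new List<List<string>>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(new List<string>(line.Replace("\"", "").Split(',')));
        }
        return rows;
    }
}
```

With this shape, rows[0] holds the values from the first line, rows[1] those from the second, and so on.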
