
Delete rows in a csv file

I have two files, Example1.csv and Example2.csv. Note that they are not comma-separated, even though they are saved with the 'csv' extension.

Example1.csv has a single column that contains only email addresses. Example2.csv has many columns, one of which is the email column found in Example1.csv.

Example1.csv file

emails

abc@gmail.com

jhg@yahoo.com

...

...

Example2.csv file

Column1 column2 Column3 column4 emails

1 45 456 123 abc@gmail.com

2 89 898 254 jhg@yahoo.com

3 85 365 789 ...

Now I need to delete the rows in Example2.csv whose email matches an entry in Example1.csv. For example, rows 1 and 2 should be removed because both of their emails match.

 string[] lines = File.ReadAllLines(@"C:\example2.csv");

 var emails = File.ReadAllLines(@"C:\example1.csv");

 List<string> linesToWrite = new List<string>();


 foreach (string s in lines)
 {
     String[] split = s.Split(' ');
         if (s.Contains(emails))
             linesToWrite.Remove(s);

 }
 File.WriteAllLines("file3.csv", linesToWrite);

This should work:

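// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
// Skip(1) drops the "emails" header line of the first file.
var emails = new HashSet<string>(File.ReadAllLines(@"C:\example1.csv").Skip(1));

// The question says the files are space-separated, so split on ' ' and compare the 5th column.
File.WriteAllLines("file3.csv", File.ReadAllLines(@"C:\example2.csv").Where(line => !emails.Contains(line.Split(' ')[4])));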

It reads all of the first file and puts the emails into a HashSet, where lookup is cheap. It then goes through every line of the second file and writes to disk only those lines whose 5th column does not match any of the listed emails. You may want to expand on many parts; for example, there is little to no error handling. It also compares emails case-sensitively, although email addresses are normally treated as case-insensitive.
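The two caveats above (case sensitivity and missing error handling) could be addressed roughly as follows. This is only a sketch, written as C# 9 top-level statements; FilterRows and RemoveMatchingRows are hypothetical names, and it assumes space-separated rows with the email in the 5th column, as in the question's sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: yields only the rows whose email column is not in the removal set.
IEnumerable<string> FilterRows(IEnumerable<string> rows, ISet<string> emailsToRemove, int emailColumn = 4)
{
    foreach (var row in rows)
    {
        // Split on spaces and drop empty entries so repeated spaces do not shift columns.
        var fields = row.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Rows too short to have an email column are kept instead of throwing.
        if (fields.Length <= emailColumn || !emailsToRemove.Contains(fields[emailColumn]))
            yield return row;
    }
}

// Hypothetical wrapper that wires the helper to the three files.
void RemoveMatchingRows(string emailsPath, string dataPath, string outputPath)
{
    // OrdinalIgnoreCase makes "ABC@gmail.com" match "abc@gmail.com".
    // Skip(1) drops the "emails" header of the first file.
    var emails = new HashSet<string>(
        File.ReadLines(emailsPath).Skip(1).Select(e => e.Trim()).Where(e => e.Length > 0),
        StringComparer.OrdinalIgnoreCase);

    // ToList() finishes reading before anything is written.
    var kept = FilterRows(File.ReadLines(dataPath), emails).ToList();
    File.WriteAllLines(outputPath, kept);
}

// Possible call, using the paths from the answer above:
// RemoveMatchingRows(@"C:\example1.csv", @"C:\example2.csv", "file3.csv");
```

The header row of the second file survives because its 5th field, "emails", is not in the set once the first file's header has been skipped.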

The variable line (named emails in the question's code) is not a string but a string array, just like lines, because you read it the same way as lines.

Also, this line

if (s.Contains(line))

is not correct. You are trying to check whether a string contains an array. If you need to check whether a line contains an email from the list, this is better:

if (split.Intersect(line).Any())

So, here is the final code.

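// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
var lines = File.ReadAllLines(@"C:\example2.csv");
// Skip the "emails" header so the header row of example2.csv is not treated as a match.
var line = File.ReadAllLines(@"C:\example1.csv").Skip(1).ToArray();

// Start from all lines; the matching ones are removed below.
var linesToWrite = new List<string>(lines);

foreach (var s in lines)
{
    // The files are space-separated, so split on ' '.
    var split = s.Split(' ');
    if (split.Intersect(line).Any())
    {
        linesToWrite.Remove(s);
    }
}

File.WriteAllLines("file3.csv", linesToWrite);
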
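
To see what the Intersect check does, here is a minimal sketch on in-memory data, written as C# 9 top-level statements. The arrays stand in for the two files, HasListedEmail is a hypothetical name, and the third row's address is made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when any space-separated field of the row equals a listed email.
bool HasListedEmail(string row, string[] listed) =>
    row.Split(' ').Intersect(listed).Any();

// In-memory stand-ins for Example2.csv and Example1.csv (without its header).
var rows = new[]
{
    "Column1 column2 Column3 column4 emails",
    "1 45 456 123 abc@gmail.com",
    "3 85 365 789 someone@example.com", // made-up address for illustration
};
var emails = new[] { "abc@gmail.com", "jhg@yahoo.com" };

// Keep only the rows that share no field with the email list.
var kept = rows.Where(r => !HasListedEmail(r, emails)).ToList();
kept.ForEach(Console.WriteLine);
```

Note that Intersect compares every field of the row, not just the email column, so if the "emails" header of the first file were left in the list, the header row of the second file would be removed as well.
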

static void Main(string[] args)
{
    var Example1CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example1.csv";
    var Example2CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example2.csv";
    var Example3CsvPath = @"C:\Inetpub\Poligon\Poligon\Resources\Example3.csv";

    var EmailsToDelete = new List<string>();
    var Result = new List<string>();

    // Collect the emails from the first file; the '@' check also skips the header line.
    foreach (var Line in System.IO.File.ReadAllLines(Example1CsvPath))
    {
        if (!string.IsNullOrWhiteSpace(Line) && Line.IndexOf('@') > -1)
        {
            EmailsToDelete.Add(Line.Trim());
        }
    }

    // Keep every row of the second file whose 5th column is not a listed email.
    foreach (var Line in System.IO.File.ReadAllLines(Example2CsvPath))
    {
        if (!string.IsNullOrWhiteSpace(Line))
        {
            var Values = Line.Split(' ');

            // Rows with fewer than 5 columns are kept rather than causing an exception.
            if (Values.Length < 5 || !EmailsToDelete.Contains(Values[4]))
            {
                Result.Add(Line);
            }
        }
    }

    System.IO.File.WriteAllLines(Example3CsvPath, Result);
}

I know this is 4 years old... but I got some ideas from it and would like to share my solution.

The idea behind this code is a simple CSV with a maximum of about 20 lines (really, at the very most), so I decided to make something basic and not use a DB for this.

My solution is to rescan the CSV, saving every row (other than the one I want to delete) into a list. After scanning, it writes the list back to the CSV, which removes the row I passed in via {textBox1}.

    // Rows that survive the scan.
    List<string> keptLines = new();

    try {
        using (var reader = new StreamReader($"{Main.directory}\\bin\\ip.csv")) {
            while (!reader.EndOfStream) {
                var line = reader.ReadLine();
                var values = line.Split(',');

                // Skip the row whose first or second column matches the text boxes.
                if (values[0] == textBox1.Text || values[1] == textBox2.Text)
                    continue;

                keptLines.Add($"{values[0]},{values[1]},{values[2]},");
            }
        }

        // The reader is disposed at this point, so the same file can be overwritten.
        File.WriteAllLines($"{Main.directory}\\bin\\ip.csv", keptLines);
    } catch (Exception f) {
        MessageBox.Show(f.Message);
    }
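
The same rescan-and-rewrite idea can be sketched without the UI dependencies, written as C# 9 top-level statements. RemoveRows is a hypothetical helper name, and the sketch assumes a small comma-separated file that fits in memory, as described above.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: rewrites a small CSV file without the rows that match the predicate.
void RemoveRows(string path, Func<string[], bool> shouldRemove)
{
    // ReadAllLines loads and closes the file, so it is safe to overwrite it afterwards.
    var kept = File.ReadAllLines(path)
        .Where(line => !shouldRemove(line.Split(',')))
        .ToArray();

    // Surviving rows are written back unchanged.
    File.WriteAllLines(path, kept);
}

// Possible call from the form code above (csvPath is a placeholder for the file path):
// RemoveRows(csvPath, v => v[0] == textBox1.Text || (v.Length > 1 && v[1] == textBox2.Text));
```

Unlike the loop above, this writes each surviving line back exactly as it was read instead of rebuilding it from the first three columns.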
