Getting unique items from a list of strings

Question

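I have a very simple text-file parsing application that searches for email addresses and adds each one it finds to a list.

The list currently contains duplicate email addresses, and I'm looking for a quick way to reduce it to only the distinct values, without iterating over them one by one :)

Here is the code: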

var emailLines = new List<string>();
using (var stream = new StreamReader(@"C:\textFileName.txt"))
{
    while (!stream.EndOfStream)
    {
        var currentLine = stream.ReadLine();

        if (!string.IsNullOrEmpty(currentLine) && currentLine.StartsWith("Email: "))
        {
            emailLines.Add(currentLine);
        }
    }
}

Answer 1

If you only need unique items, you can add them to a HashSet instead of a List. Note that a HashSet has no implied order. If you need an ordered set, you can use a SortedSet instead.

var emailLines = new HashSet<string>();

This way there will be no duplicates.
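As a minimal sketch of the set-based approach: the sample lines below are made up and stand in for whatever the parser reads from the file. It shows that `HashSet<string>.Add` silently skips an item that is already present, and that `SortedSet<string>` removes duplicates as well while enumerating in sorted order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample lines standing in for the parsed file contents.
var lines = new[]
{
    "Email: bob@example.com",
    "Email: alice@example.com",
    "Email: bob@example.com",
};

var unique = new HashSet<string>();
foreach (var line in lines)
{
    // Add returns false (and leaves the set unchanged) if the item is already present.
    bool added = unique.Add(line);
    Console.WriteLine($"{line} -> added: {added}");
}

// SortedSet also drops duplicates, and enumerates in sorted (here: ordinal) order.
var sorted = new SortedSet<string>(lines, StringComparer.Ordinal);
Console.WriteLine(string.Join(", ", sorted));
```

Because `Add` reports whether the item was new, the same loop can also be used to count or log the duplicates it skips.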

To remove duplicates from an existing List, you can use the LINQ extension method Enumerable.Distinct():

IEnumerable<string> distinctEmails = emailLines.Distinct();
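One thing worth checking with `Distinct()` is how equality is decided. By default strings are compared exactly (case-sensitively), so two spellings of the same address both survive. The sketch below uses invented sample data and assumes, purely for illustration, that addresses differing only in letter case should count as duplicates; in that case the overload that takes an `IEqualityComparer<string>` can be used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; in the question this list is filled from the file.
var emailLines = new List<string>
{
    "Email: bob@example.com",
    "Email: Bob@Example.com",
    "Email: alice@example.com",
    "Email: bob@example.com",
};

// Default comparison is case-sensitive: both spellings of bob are kept.
List<string> exact = emailLines.Distinct().ToList();

// Assumption: lines that differ only in letter case are duplicates.
List<string> ignoreCase = emailLines
    .Distinct(StringComparer.OrdinalIgnoreCase)
    .ToList();

Console.WriteLine($"exact: {exact.Count}, ignoring case: {ignoreCase.Count}");
```

`Distinct()` is evaluated lazily, so nothing is filtered until the result is enumerated; `ToList()` forces that here and gives you a `List<string>` again.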

Answer 2

Try the following:

var emailLines = File.ReadAllLines(@"c:\textFileName.txt")
  .Where(x => !String.IsNullOrEmpty(x) && x.StartsWith("Email: "))
  .Distinct()
  .ToList();

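The downside of this approach is that it reads every line of the file into a string[]. This happens immediately, and for a large file it creates a correspondingly large array. You can get lazy line reading back by using a simple iterator: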

public static IEnumerable<string> ReadAllLinesLazy(string path) { 
  using ( var stream = new StreamReader(path) ) {
    while (!stream.EndOfStream) {
      yield return stream.ReadLine();
    }
  }
}

You can then replace the File.ReadAllLines call above with a call to this function.
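On .NET Framework 4.0 and later, the built-in `File.ReadLines` does much the same job as a hand-written lazy iterator: it returns an `IEnumerable<string>` that reads lines as they are enumerated. The sketch below is one possible way to combine it with the query above; the temporary file and its contents are invented so that the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Write a small hypothetical sample file so the sketch is runnable.
string path = Path.GetTempFileName();
File.WriteAllLines(path, new[]
{
    "Name: Bob",
    "Email: bob@example.com",
    "",
    "Email: alice@example.com",
    "Email: bob@example.com",
});

// File.ReadLines streams the file line by line instead of building a string[].
List<string> emailLines = File.ReadLines(path)
    .Where(x => !string.IsNullOrEmpty(x)
             && x.StartsWith("Email: ", StringComparison.Ordinal))
    .Distinct()
    .ToList(); // enumerates the whole file and closes the reader

File.Delete(path);
Console.WriteLine(string.Join(Environment.NewLine, emailLines));
```

Only the distinct matching lines end up in memory; the rest of the file is read and discarded one line at a time.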

Answer 3

The advantage of IEnumerable / LINQ (well suited to large files, since only the matching lines are kept in memory):

// using System.Linq;

var emailLines = ReadFileLines(@"C:\textFileName.txt")
    .Where(line => line.StartsWith("Email: "))
    .Distinct()
    .ToList();

public IEnumerable<string> ReadFileLines(string fileName)
{
    using (var stream = new StreamReader(fileName))
    {
        while (!stream.EndOfStream)
        {
            yield return stream.ReadLine();
        }
    }
}

3 answers

Answer 1: score 7, 2010-09-24 03:58:28
Answer 2: score 3, accepted, 2010-09-24 04:02:18
Answer 3: score 1, 2010-09-24 04:05:16
