C# Distinct in LINQ query

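After changing some code, I ran into a problem. The idea is this: I am counting words across documents, but each document should contribute only one copy of any given word, for example: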

Document 1 = Smith Smith Smith Smith Smith => Smith x1

Document 2 = Smith Alan Alan => Smith x1, Alan x1

Document 3 = John John => John x1

So the totals should be:

Smith x2 (in 2 of 3 documents), Alan x1 (in 1 of 3 documents), John x1 (in 1 of 3 documents)

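I think it worked before I made distinct a separate option (it counts every occurrence of every word if distinct = false); now every count comes out as 1.

The code before: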

    private Dictionary<string, int> tempDict = new Dictionary<string, int>();
    private void Splitter(string[] file)
    {              
            tempDict = file
                .SelectMany(i => File.ReadAllLines(i)
                .SelectMany(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))                    
                .AsParallel()
                .Select(word => word.ToLower()) 
                .Distinct())
                .GroupBy(word => word)                    
                .ToDictionary(g => g.Key, g => g.Count());
    }

It was supposed to be changed so that it returns the dictionary, but while building the application it ended up as the following code:

private Dictionary<string, int> Splitter(string[] file, bool distinct, bool pairs)
{
    var query = file
        .SelectMany(i => File.ReadLines(i)
        .SelectMany(line => line.Split(new[] { ' '}, StringSplitOptions.RemoveEmptyEntries))
        .AsParallel()
        .Select(word => word.ToLower())
        .Where(word => !word.All(char.IsDigit)));
    if (distinct)
    {
        query = query.Distinct();
    }
    if (pairs)
    {
        var pairWise = query.Pairwise((first, second) => string.Format("{0} {1}", first, second));

        return query
                .Concat(pairWise)
                .GroupBy(word => word)
                .ToDictionary(g => g.Key, g => g.Count());
    }
    return query
        .GroupBy(word => word)
        .ToDictionary(g => g.Key, g => g.Count());           
}

Also note that query = file.Distinct(); only returns the document names, so it has to be something different.

Edit: this is how I call the method:

  private void EnterDocument(object sender, RoutedEventArgs e)
    {
        List<string> myFile= new List<string>();
        OpenFileDialog openFileDialog = new OpenFileDialog();
        openFileDialog.Multiselect = true;
        openFileDialog.Filter = "All files (*.*)|*.*|Text files (*.txt)|*.txt";
        if (openFileDialog.ShowDialog() == true)
        {
            foreach (string filename in openFileDialog.FileNames)
            {
                myFile.Add(filename);

            }
        }
        string[] myFiles= myFile.ToArray();
        myDatabase = Splitter(myFiles, true, false);
    }

Distinct() removes duplicates from your IEnumerable, so calling it on the combined words of all files before the following...

return query
    .GroupBy(word => word)
    .ToDictionary(g => g.Key, g => g.Count());  

...yields a list of all unique words, but each with a count of 1.
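As a small illustration of that effect, here is a sketch that uses in-memory strings in place of the files. The sample words come from the question; the class name `GlobalDistinctDemo` and its `Count` method are made up for this example and are not part of the asker's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GlobalDistinctDemo
{
    // Flattens all documents into one word sequence and applies Distinct()
    // to the whole sequence, which mirrors the query in the question.
    public static Dictionary<string, int> Count(IEnumerable<string> documents)
    {
        return documents
            .SelectMany(doc => doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(word => word.ToLower())
            .Distinct()               // each word now appears exactly once overall
            .GroupBy(word => word)    // so each group holds exactly one element
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Calling `GlobalDistinctDemo.Count` with the three sample documents ("Smith Smith Smith Smith Smith", "Smith Alan Alan", "John John") gives a count of 1 for smith, alan and john, because the information about which document a word came from is lost before Distinct() runs.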

Edit:

To fix the problem caused by merging the lines of all files into one sequence, you could do the following:

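// Holds the distinct words of every file, one entry per (file, word) pair.
List<string> allFilesWords = new List<string>();
foreach (var filename in file)
{
    var fileQuery = File.ReadLines(filename)
        .SelectMany(line => line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        .AsParallel()
        .Select(word => word.ToLower())
        .Where(word => !word.All(char.IsDigit));
    // Distinct() is applied per file, so a word is added at most once per file.
    allFilesWords.AddRange(fileQuery.Distinct());
}
// The count of each word is now the number of files that contain it.
return allFilesWords
        .GroupBy(word => word)
        .ToDictionary(g => g.Key, g => g.Count());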
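The same idea can also be written as a single query by keeping Distinct() inside the SelectMany lambda, as the "before" code in the question did, so that duplicates are removed per document rather than across all documents. Below is a minimal sketch, again with in-memory strings standing in for the files; the class name `PerDocumentDistinctDemo` and its `Count` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PerDocumentDistinctDemo
{
    // Counts, for each word, the number of documents that contain it.
    public static Dictionary<string, int> Count(IEnumerable<string> documents)
    {
        return documents
            .SelectMany(doc => doc
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => word.ToLower())
                .Distinct())          // de-duplicate within this document only
            .GroupBy(word => word)    // group the per-document words across documents
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

With the three sample documents from the question, this gives smith = 2, alan = 1 and john = 1, which matches the expected totals.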

