简体   繁体   中英

C# / LINQ fastest way of comparing two lists and assigning value

I have made a code which basically compares two lists in C#. First list contains properties like this:

  • ItemID
  • TotalViews

First list lacks values for TotalViews so I'm assigning them from 2nd list which has these props:

  • ItemID
  • HitCount // this is property for TotalViews that needs to be assigned

The code is as following:

foreach (var item in parsedMerchantData)
{
    var itemInB = HitCountItemIDS.FirstOrDefault(x => x.ItemID == item.ItemID);
    if (itemInB != null)
    {
        if (itemInB.HitCount != -1)
        {
            item.TotalViews = itemInB.HitCount;
        }
        else
        {
            item.TotalViews = 0;
        }
    }
}

Is there any more efficient way to write this using LINQ or implementing a custom comparer which would work faster on larger lists that contains sometimes 100000 items in themselves?

This is like jdweng's answer, but slightly simpler and it won't throw an exception for missing item IDs:

var hitCountsById = HitCountItemIDS.ToDictionary(x => x.ItemID, x => x.HitCount);
foreach (var item in parsedMerchantData)
{
    int hitCount;
    // We don't care about the return value of TryGetValue here...
    hitCountsById.TryGetValue(item.ItemID, out hitCount);
    item.HitCount = hitCount == -1 ? 0 : hitCount;
}

This should be O(N+M) where N is the size of HitCountItemIDs and M is the size of parsedMerchantData ... so should as the data gets bigger, it should grow more slowly than the merge-sort approach, and is definitely simpler code. (It doesn't require comparing item ID for ordering, either - just equality.)

Code would look like below. Not sure what the type of HitCountItemID is. If it is anonymous then just make 'var dict' :

Dictionary<string, ABC_TYPE> dict = HitCountItemID.GropupBy(x => x.ItemID, y => y).ToDictionary(x => x.Key, y => y.FirstOrDefault())
foreach (var item in parsedMerchantData)
{
    var itemInB = dict[item.ItemID];
    if (itemInB != null)
    {
        if (itemInB.HitCount != -1)
        {
            item.TotalViews = itemInB.HitCount;
        }
        else
        {
            item.TotalViews = 0;
        }
    }
}

I assume You are holding 2 lists during the program run/collecting data, so You can sort them during insertion. Or if they are in DB and there is an Index on the ID it migh also work.

If so, You should be able to do just one run through each array, which would optimize the program really high (now you got about n^2 complexity depending on values), after you change You would have n.

int i = 0, j = 0;

while( i < parsedMerchantData.Count && j < HitCountItemIDS.Count)
{
    var item = parsedMerchantData[i];
    var itemInB = HitCountItemIDS[j];

    if (itemInB.ItemID == item.ItemID)
    {
        item.TotalViews = (itemInB.HitCount > 0) ? itemInB.HitCount : 0;
        i++;
        j++;
    }
    else if(itemInB.ItemID < item.ItemID)
        i++;
    else  //itemInB.ItemID > item.ItemID
        j++;
}

The code should look similar to the one above, You should add some more control about when it ends & what should happend with the rest values (this will stop once either i or j hit the end).

Here is the pseudo-code:

var arr1 = parsedMerchantData.OrderBy(x => x.ItemID).ToArray();
var arr2 = HitCountItemID.OrderBy(x => x.ItemID).ToArray();

var i, j = 0;
while(i + j < arr1.Length() + arr2.Length()) // or similar condition
{
    if (arr1[i].ItemID < arr2[j].ItemID) {
        if (i < arr1.Length() - 1) {
            i++;
        }
        continue;
    }

    if (arr1[i].ItemID > arr2[j].ItemID) {
        if (j < arr2.Length() - 1) {
            j++;
        }
        continue;
    }

    if (arr1[i].ItemID == arr2[j].ItemID) {
        arr1[i].TotalViews = arr2[j].HitCount != -1 ? arr2[j].HitCount : 0;
    }

    // Make sure you do not let i and j grow higher then lengths of arrays
}

The idea is to apply MergeSort algorithms. As for complexity, you spend O(n * log(n)) sorting each list then O(n) going trough them. The total is O(n * log(n)) and it is the fastest way I see.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM