简体   繁体   中英

Finding an integer sum in an array of 1 000 000

Given a large list of integers (more than 1 000 000 values) find how many ways there are of selecting two of them that add up to 0.... Is the question

What I have done is create a positive random integer list:

Random pos = new Random();
int POSNO = pos.Next(1, 1000000);
lstPOS.Items.Add(POSNO);
lblPLus.Text = lstPOS.Items.Count.ToString();
POSCount++;

And created a negative list:

Random neg = new Random();
int NEGNO = neg.Next(100000, 1000000);
lstNEG.Items.Add("-" + NEGNO);
lblNegative.Text = lstNEG.Items.Count.ToString();
NegCount++;

To do the sum checking I am using:

foreach (var item in lstPOS.Items)
{
    int POSItem = Convert.ToInt32(item.ToString());
    foreach (var negItem in lstNEG.Items)
    {
        int NEGItem = Convert.ToInt32(negItem.ToString());
        int Total = POSItem - NEGItem;
        if (Total == 0)
        {
            lstADD.Items.Add(POSItem + "-" + NEGItem + "=" + Total);
            lblAddition.Text = lstADD.Items.Count.ToString();
        }
    }
}

I know this is not the fastest route. I have considered using an array. Do you have any suggestions?

Let's see; your array is something like this:

  int[] data = new int[] {
    6, -2, 3, 2, 0, 0, 5, 7, 0, -2
  };

you can add up to zero in two different ways:

  1. a + (-a) // positive + negative
  2. 0 + 0 // any two zeros

in the sample above there're five pairs:

  -2 + 2 (two pairs): [1] + [3] and [3] + [9]
   0 + 0 (three pairs): [4] + [5], [4] + [8] and [5] + [8]

So you have to track positive/negative pairs and zeros. The implementation

 Dictionary<int, int> positives = new Dictionary<int, int>();
 Dictionary<int, int> negatives = new Dictionary<int, int>(); 
 int zeros = 0;

 foreach(var item in data) {
   int v;

   if (item < 0) 
     if (negatives.TryGetValue(item, out v))     
       negatives[item] = negatives[item] + 1;
     else
       negatives[item] = 1;  
   else if (item > 0) 
     if (positives.TryGetValue(item, out v))     
       positives[item] = positives[item] + 1;
     else
       positives[item] = 1;  
   else
     zeros += 1;
 } 

 // zeros: binomal coefficent: (2, zeros)
 int result = zeros * (zeros - 1) / 2;

 // positive/negative pairs
 foreach (var p in positives) {
   int n;

   if (negatives.TryGetValue(-p.Key, out n)) 
     result += n * p.Value; 
 } 

 // Test (5)
 Console.Write(result); 

Note, that there's no sorting, and dictionaries (ie hash tables ) are used for positives and negatives so the execution time will be linear , O(n) ; the dark side of the implementation is that two additional structures (ie additional memory) required. In your case (millions integers only - Megabytes) you have that memory.

Edit: terser, but less readable Linq solution:

  var dict = data
    .GroupBy(item => item)
    .ToDictionary(chunk => chunk.Key, chunk => chunk.Count());

  int result = dict.ContainsKey(0) ? dict[0] * (dict[0] - 1) / 2 : 0;

  result += dict
    .Sum(pair => pair.Key > 0 && dict.ContainsKey(-pair.Key) ? pair.Value * dict[-pair.Key] : 0);

Fastest way without sorting!.

First of all you know that the sum of two integers are only 0 when they have equal absolute value but one is negative and the other is positive. So you dont need to sort. what you need is to Intersect positive list with negative list (by comparing absolute value). the result is numbers that ended up 0 sum.

Intersect has time complexity of O(n+m) where n is size of first list and m is size of second one.

private static void Main(string[] args)
{
    Random random = new Random();

    int[] positive = Enumerable.Range(0, 1000000).Select(n => random.Next(1, 1000000)).ToArray();
    int[] negative = Enumerable.Range(0, 1000000).Select(n => random.Next(-1000000, -1)).ToArray();

    var zeroSum = positive.Intersect(negative, new AbsoluteEqual());

    foreach (var i in zeroSum)
    {
        Console.WriteLine("{0} - {1} = 0", i, i);
    }
}

You also need to use this IEqualityComparer.

public class AbsoluteEqual : IEqualityComparer<int>
{
    public bool Equals(int x, int y)
    {
        return (x < 0 ? -x : x) == (y < 0 ? -y : y);
    }

    public int GetHashCode(int obj)
    {
        return obj < 0 ? (-obj).GetHashCode() : obj.GetHashCode();
    }
}

You tried to avoid check two numbers that are close (1, 2 are close, 3, 4 are close), but you didn't avoid check like (-100000, 1), (-1, 100000). Time complexity is O(n^2). To avoid that you need to sort them first, then search from two direction.

var random = new Random();
var input = Enumerable.Range(1, 100).Select(_ => random.Next(200) - 100).ToArray();

Array.Sort(input); // This causes most computation. Time Complexity is O(n*log(n));
var expectedSum = 0;
var i = 0;
var j = input.Length - 1;
while (i < j) // This has liner time complexity O(n);
{
    var result = input[i] + input[j];
    if(expectedSum == result)
    {
        var anchori = i;
        while (i < input.Length && input[i] == input[anchori] )
        {
            i++;
        }
        var anchorj = j;
        while (j >= 0 && input[j] == input[anchorj])
        {
            j--;
        }
        // Exclude (self, self) combination
        Func<int, int, int> combination = (n, k) =>
        {
            var mink = k * 2 < n ? k : n - k;
            return mink == 0 ? 1 
                : Enumerable.Range(0, mink).Aggregate(1, (x, y) => x * (n - y)) 
                 / Enumerable.Range(1, mink).Aggregate((x, y) => x * y);
        };
        var c = i < j ? (i - anchori) * (anchorj - j) : combination(i - anchori, 2);
        for (int _ = 0; _ < c; _++)
        {
            // C# 6.0 String.Format
            Console.WriteLine($"{input[anchori]}, {input[anchorj]}");
        }
    }
    else if(result < expectedSum) {
        i++;
    }
    else if(result > expectedSum) {
        j--;
    }
}

Here is another solution using (huh) LINQ. Hope the code is self explanatory

First some data

var random = new Random();
var data = new int[1000000];
for (int i = 0; i < data.Length; i++) data[i] = random.Next(-100000, 100000);

And now the solution

var result = data
    .Where(value => value != int.MinValue)
    .GroupBy(value => Math.Abs(value), (key, values) =>
    {
        if (key == 0)
        {
            var zeroCount = values.Count();
            return zeroCount * (zeroCount - 1) / 2;
        }
        else
        {
            int positiveCount = 0, negativeCount = 0;
            foreach (var value in values)
                if (value > 0) positiveCount++; else negativeCount++;
            return positiveCount * negativeCount;
        }
    })
    .Sum();

Theoretically the above should have O(N) time and O(M) space complexity, where M is the count of the unique absolute values in the list.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM