
Linear Search Algorithm Optimization

I just finished a homework problem for Computer Science 1 (yes, it's homework, but hear me out!). Now, the assignment is 100% complete and working, so I don't need help on it. My question involves the efficiency of an algorithm I'm using (we aren't graded on algorithmic efficiency yet, I'm just really curious).

The function I'm about to present currently uses a modified version of the linear search algorithm (that I came up with, all by myself!) in order to check how many numbers on a given lottery ticket match the winning numbers, assuming that both the numbers on the ticket and the numbers drawn are in ascending order. I was wondering, is there any way to make this algorithm more efficient?

/*
 * Function: ticketCheck
 *
 * @param struct ticket
 * @param array winningNums[6]
 *
 * Takes in a ticket, counts how many numbers
 * in the ticket match, and returns the number
 * of matches.
 *
 * Uses a modified linear search algorithm,
 * in which the index of the successor to the
 * last matched number is used as the index of
 * the first number tested for the next ticket value.
 *
 * @return int numMatches
 */
int ticketCheck( struct ticket ticket, int winningNums[6] )
{
    int numMatches = 0;
    int offset = 0;
    int i;
    int j;

    for( i = 0; i < 6; i++ )
    {
        for( j = 0 + offset; j < 6; j++ )
        {
            if( ticket.ticketNum[i] == winningNums[j] )
            {
                numMatches++;
                offset = j + 1;
                break;
            }
            if( ticket.ticketNum[i] < winningNums[j] )
            {
                i++;
                j--;
                continue;
            }
        }
    }

    return numMatches;
}

It's more or less there, but not quite. In most situations, it's O(n), but it's O(n^2) if every ticketNum is greater than every winningNum. (This is because the inner j loop doesn't break when j==6 like it should, but runs the next i iteration instead.)

You want your algorithm to increment either i or j at each step, and to terminate when i==6 or j==6 . [Your algorithm almost satisfies this, as stated above.] As a result, you only need one loop:

for (i=0,j=0; i<6 && j<6; /* no increment step here */) {
    if (ticketNum[i] == winningNum[j]) {
        numMatches++;
        i++;
        j++;
    }
    else if (ticketNum[i] < winningNum[j]) {
        /* ticketNum[i] won't match any winningNum, discard it */
        i++;
    }
    else { /* ticketNum[i] > winningNum[j] */
        /* discard winningNum[j] similarly */
        j++;
    }
}

Clearly this is O(n): at each step it increments i, j, or both, and it stops as soon as either reaches n, so it takes at most 2*n-1 steps. This has almost the same behaviour as your algorithm, but it is easier to follow and easier to verify as correct.
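To make the 2*n-1 bound concrete, here is a minimal sketch of the same single loop wrapped in a function that also reports how many iterations it ran. The name countMatchesMerge and the steps out-parameter are made up for illustration; both arrays are assumed to be in ascending order with n elements each.

```c
#include <stddef.h>

/* Counts values common to two ascending arrays a[0..n-1] and b[0..n-1].
 * If steps is non-NULL, the number of loop iterations is stored there.
 * (Hypothetical helper, not part of the original assignment.) */
int countMatchesMerge(const int *a, const int *b, size_t n, size_t *steps)
{
    size_t i = 0, j = 0, iterations = 0;
    int numMatches = 0;

    while (i < n && j < n) {
        iterations++;
        if (a[i] == b[j]) {
            numMatches++;
            i++;
            j++;
        } else if (a[i] < b[j]) {
            i++;            /* a[i] cannot match anything left in b */
        } else {
            j++;            /* b[j] cannot match anything left in a */
        }
    }

    if (steps != NULL)
        *steps = iterations;
    return numMatches;
}
```

With perfectly interleaved inputs such as {1,3,5,7,9,11} and {2,4,6,8,10,12}, every iteration advances only one index, which is the case that reaches the bound.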

You're basically looking for the size of the intersection of two sets. Given that most lottos use around 50 balls, you could store each set of numbers as bits that are set in an unsigned long long. Finding the common numbers is then a simple matter of ANDing the two together: commonNums = TicketNums & winningNums; .

Finding the size of the intersection is a matter of counting the one bits in the resulting number, a subject that's been covered previously (though in this case, you'd use 64-bit numbers, or a pair of 32-bit numbers, instead of a single 32-bit number).
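A rough sketch of that idea, assuming every ball number fits in the range 0..63 and each set holds 6 numbers. The helper names toMask, countBits and ticketCheckMask are invented here, and the bit count uses the portable "clear the lowest set bit" loop rather than any compiler-specific instruction.

```c
#include <stdint.h>

/* Builds a 64-bit mask with bit k set for every number k in nums.
 * Assumes every number is in the range 0..63. */
uint64_t toMask(const int *nums, int count)
{
    uint64_t mask = 0;
    for (int i = 0; i < count; i++)
        mask |= (uint64_t)1 << nums[i];
    return mask;
}

/* Counts the one bits in x by repeatedly clearing the lowest set bit. */
int countBits(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Size of the intersection of the two 6-number sets. */
int ticketCheckMask(const int *ticketNums, const int *winningNums)
{
    uint64_t commonNums = toMask(ticketNums, 6) & toMask(winningNums, 6);
    return countBits(commonNums);
}
```

Because the masks ignore order, this version does not rely on either array being sorted.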

Yes, there is something faster, though it probably uses more memory. Make an array of zeros with one slot for every possible number, and put a 1 at the index of every drawn number. Then, for every ticket number, add the value stored at that number's index.

 int NumsArray[MAX_NUMBER+1];
 memset(NumsArray, 0, sizeof NumsArray);

 for( i = 0; i < 6; i++ )
   NumsArray[winningNums[i]] = 1;

 for( i = 0; i < 6; i++ )
   numMatches += NumsArray[ticket.ticketNum[i]];

That's 12 loop rounds instead of up to 36. The surrounding code is left as an exercise.

EDIT: It also has the advantage of not needing either set of values to be sorted.
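For reference, one way the surrounding code might look. This is only a sketch: MAX_NUMBER is never given a value in the thread, so 49 is assumed here, and the function name ticketCheckTable is made up. Every number must lie in 0..MAX_NUMBER, otherwise the array accesses go out of bounds.

```c
#include <string.h>

#define MAX_NUMBER 49   /* assumed largest ball number; not stated in the thread */

/* Counts matches using a lookup table indexed by ball number.
 * Neither array needs to be sorted. */
int ticketCheckTable(const int ticketNums[6], const int winningNums[6])
{
    int NumsArray[MAX_NUMBER + 1];
    int numMatches = 0;
    int i;

    memset(NumsArray, 0, sizeof NumsArray);

    for (i = 0; i < 6; i++)
        NumsArray[winningNums[i]] = 1;          /* mark drawn numbers */

    for (i = 0; i < 6; i++)
        numMatches += NumsArray[ticketNums[i]]; /* 1 if drawn, else 0 */

    return numMatches;
}
```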

This is really only a minor change on a scale like this, but if the inner loop reaches a winning number bigger than the current ticket number, it can break right away. Furthermore, if the inner loop passes over winning numbers lower than the current ticket number, it can update the offset even if no match is found in that iteration.
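Applied to the two nested loops from the question, those two tweaks might look roughly like this. The struct layout is an assumption (the question only shows the ticketNum member), the name ticketCheckEarlyExit is invented, and both arrays are assumed to be in strictly ascending order.

```c
/* Assumed layout: the question only shows the ticketNum member. */
struct ticket {
    int ticketNum[6];
};

int ticketCheckEarlyExit(struct ticket ticket, const int winningNums[6])
{
    int numMatches = 0;
    int offset = 0;
    int i;
    int j;

    /* Stop early once every winning number has been passed. */
    for (i = 0; i < 6 && offset < 6; i++) {
        for (j = offset; j < 6; j++) {
            if (winningNums[j] > ticket.ticketNum[i])
                break;          /* no later winning number can match */

            /* winningNums[j] <= ticketNum[i], so it never needs
             * to be looked at again, match or not. */
            offset = j + 1;

            if (winningNums[j] == ticket.ticketNum[i]) {
                numMatches++;
                break;
            }
        }
    }

    return numMatches;
}
```

Since offset only ever moves forward and the inner loop stops at the first larger winning number, the total work stays linear in the number of values.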

PS: Keep in mind that general results on efficiency make more sense if we treat the number of balls or the size of the ticket as a variable. Otherwise the result depends too much on the machine.

If, instead of comparing the arrays of lottery numbers, you were to create two bit arrays of flags (each flag being set if its index is in that array), then you could perform a bitwise AND on the two bit arrays (the lottery ticket and the winning number sets) and produce another bit array whose bits are flags for matching numbers only. Then count the bits set.

For many lotteries 64 bits would be enough, so a uint64_t should be big enough to cover this. Also, some architectures have instructions to count the bits set in a register, which some compilers might be able to recognize and optimize for.

The efficiency of this algorithm depends on both the range of lottery numbers (M) and the number of lottery numbers per ticket (N). Setting the flags is O(N), while and-ing the two bit arrays and counting the bits could be O(M), depending on whether M (the lotto number range) is larger than the size the target CPU can perform these operations on directly. Most likely, though, M will be small and its impact on performance will be less than that of N.
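To illustrate where the O(M) term comes from, here is a sketch for a range that does not fit in one 64-bit word. LOTTO_RANGE, NUM_WORDS, setFlags and countCommon are all hypothetical names, and the range of 200 is picked only to force several words.

```c
#include <stddef.h>
#include <stdint.h>

#define LOTTO_RANGE 200                      /* hypothetical largest number */
#define NUM_WORDS   ((LOTTO_RANGE / 64) + 1) /* 64-bit words needed for 0..LOTTO_RANGE */

/* Clears the bit array, then sets one flag per number: O(M/64 + N). */
void setFlags(uint64_t flags[NUM_WORDS], const int *nums, size_t count)
{
    for (size_t w = 0; w < NUM_WORDS; w++)
        flags[w] = 0;
    for (size_t i = 0; i < count; i++)
        flags[nums[i] / 64] |= (uint64_t)1 << (nums[i] % 64);
}

/* ANDs the arrays word by word and counts the set bits. */
int countCommon(const uint64_t a[NUM_WORDS], const uint64_t b[NUM_WORDS])
{
    int n = 0;
    for (size_t w = 0; w < NUM_WORDS; w++) {
        uint64_t common = a[w] & b[w];
        while (common) {
            common &= common - 1;   /* clear the lowest set bit */
            n++;
        }
    }
    return n;
}
```

The loop over NUM_WORDS is the part that grows with M; with M at most 63 it collapses to a single AND and a single bit count.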


 