Phone number lookups using incoming numbers of varying lengths, with or without prefixes

Question

Systems:
-Call center phone system: incoming calls are accompanied by the source number, aka Caller ID, aka "ANI".
-SQL Server 2005: the "data warehouse", which stores customer phone numbers ([cANI]), customer names, locations, etc. in a table [CustDataByANI].
-Stored procedure: the call center software passes the Caller ID (aka ANI) as a parameter to a stored procedure, which uses it in a currently simple SELECT statement on the CustDataByANI table: ... WHERE [cANI] = @ANI.
This works wonderfully, provided the query finds an exact match between the @ANI and one of the 686K cANI values contained in the 'lookup table'. This happens only about 12% of the time.

Goal: to increase the number of successful "likely/probable" matches

Important note: we are using a global data set and cannot enforce rules regarding the length of either value (the parameter @ANI or the values within [cANI]).
Case 1:
The phone system transmits the source number '9876543210' which is used as the parameter @ANI
That exact number exists within the [cANI] column of the CustDataByANI table (record# 55555)
The SELECT statement returns the values from a number of other columns associated with record 55555. Super easy: WHERE [cANI] = @ANI is successful.

Case 2:
@ANI = '19876543210' (same as above but with a leading '1')
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '9876543210' (still record 55555)
Even a child would recognize that the only difference from Case 1 is the presence of a 1-digit 'prefix' in the parameter @ANI - perhaps it is a long distance 'tag' or a country code.
Such prefixes could be 1 or 2 or even 3 digits in length....we can't predict. We do not want to consider prefixes longer than 3, but in this case do want to return the values from record 55555 as in Case 1.

Case 3: the 'reverse' of Case 2
@ANI = '9876543210'
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '19876543210' (record # 55555 now has a '1' prefix)
Again, we would assume that the two are of substantial equivalence. In this case, the [cANI] value contains the longer sequence due to the prefix, which could be 1 or 2 or even 3 digits in length...we cannot predict. We do not want to consider prefixes longer than 3, but in this case do want to return the values from record 55555 as in Case 1.

Again, due to the possible variability in the length of each value (@ANI and [cANI]) and my near-complete lack of SQL programming experience, I cannot write a SELECT statement for the stored procedure that takes all 3 cases into account. Simple LIKE statements with wildcards seem to fail, and my head is spinning on CASE criteria, CONTAINS, and even REVERSE strategies for 'reading' both the @ANI and the cANI values in a right-to-left fashion.

My dream is to return the best probable match between the two.
My Stupid Procedure is below; any and all assistance is greatly appreciated!

BTW, my source table CustDataByANI does include a RevANI column, which is simply the cANI values in reverse. Initially I had thought that the solution might lie in reversing the @ANI parameter value and finding the longest match within the [RevANI] column, thereby leaving any wildcards on the right of each. But I'm still stuck and am not sure if that is the best strategy.

USE [GCC]
GO
/****** Object:  StoredProcedure [dbo].[SP_GetCustDataByANI]    Script Date: 10/07/2014 07:47:34 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[SP_GetCustDataByANI] 
    @ANI varchar(80)
AS
BEGIN

    SET NOCOUNT ON;

--Remove leading zeros from the varchar @ANI. I chose this method rather than risking
--the undesirable introduction of exponential notation when long digit strings are converted to
--numeric types and back. Note: this trick assumes @ANI contains no spaces.

    IF ((LEN(@ANI) > 1) AND (LEFT(@ANI,1)= '0'))
    BEGIN
        SET @ANI = REPLACE(LTRIM(REPLACE(@ANI,'0',' ')),' ','0')
    END

    SELECT Id
        ,cANI
        ,cServiceClass
        ,cCompanyClass
        ,cContactName
        ,cContactDivision
        ,cContactDepartment
        ,cCompanyName
        ,cOrganizationName
        ,cContactCity
        ,cContactStateTerr
        ,cContactCountry
        ,cCompanyIsDistributor
        ,PrefAgentID
        ,PrefAgentID_SQUAL
        ,PrefRegionID_SQUAL
        ,VIP_CC
        ,VIP_TS
        ,TS_ACAT
    FROM [dbo].[CustDataByANI]
    WHERE ([cANI] = @ANI)

END
GO

Answer 1

If you want to make the queries faster, you can create an additional column holding the phone number in reverse order, create an index on this column, and then query for the number using a LIKE predicate, passing the searched-for phone number in reverse order as well. This will make the queries as fast as possible. For example, using these sample numbers, you'd store the following in the new ReversedPhoneNumber column:

6543211234     store as: 4321123456
16543211234    store as: 43211234561 
0016543211234  store as: 4321123456100

When you need to query by 6543211234, simply reverse it (4321123456) and look in the reversed column like this:

WHERE ReversedPhoneNumber LIKE '4321123456%'

This would match the number whether it is stored with or without a prefix, and very quickly, because it's a simple and fast index seek. (A LIKE that amounts to a "starts with" operation can seek the index to find matches.)
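As a minimal sketch of that lookup against the table from the question: this assumes the existing RevANI column already holds REVERSE(cANI), that @ANI contains digits only (no LIKE wildcard characters), and that you are allowed to add an index. The index name is hypothetical.

```sql
-- Hypothetical index name; assumes RevANI already contains REVERSE(cANI).
CREATE NONCLUSTERED INDEX IX_CustDataByANI_RevANI
    ON dbo.CustDataByANI (RevANI);
GO

DECLARE @ANI varchar(80);
SET @ANI = '6543211234';

-- REVERSE(@ANI) + '%' is a "starts with" pattern on the reversed column,
-- so it finds every stored number that ENDS with the digits in @ANI,
-- including an exact match.
SELECT Id, cANI
FROM dbo.CustDataByANI
WHERE RevANI LIKE REVERSE(@ANI) + '%';
```

Note that this pattern alone accepts a stored prefix of any length, and it does not cover the case where the incoming number is the longer one.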

As for the additional rules that you need to apply, you know the data better than we do. Think through all the possible cases and run a few tests, and you'll arrive at the rules you need (not to speed things up, but to ensure a correct match).
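One possible set of rules, based only on the three cases described in the question (exact match, a prefix of 1 to 3 digits on @ANI, or a prefix of 1 to 3 digits on cANI), could be sketched as follows. This is an illustration rather than a tested solution: it assumes RevANI holds REVERSE(cANI), that both values contain digits only, and that "best match" means the smallest difference in length.

```sql
DECLARE @ANI varchar(80);
SET @ANI = '19876543210';

SELECT TOP 1 Id, cANI
FROM dbo.CustDataByANI
WHERE
    -- Case 1: exact match
    cANI = @ANI
    -- Case 2: @ANI carries a prefix of 1 to 3 digits that cANI lacks
    OR (LEN(@ANI) > 1 AND cANI = SUBSTRING(@ANI, 2, 80))
    OR (LEN(@ANI) > 2 AND cANI = SUBSTRING(@ANI, 3, 80))
    OR (LEN(@ANI) > 3 AND cANI = SUBSTRING(@ANI, 4, 80))
    -- Case 3: cANI carries a prefix of 1 to 3 digits that @ANI lacks
    OR (LEN(@ANI) > 0
        AND RevANI LIKE REVERSE(@ANI) + '%'
        AND LEN(cANI) - LEN(@ANI) BETWEEN 1 AND 3)
-- Exact matches sort first (difference 0), then the shortest prefix;
-- Id is only a tie-breaker to make the result deterministic.
ORDER BY ABS(LEN(cANI) - LEN(@ANI)), Id;
```

Very short inputs can still produce unlikely matches, so a minimum length for @ANI is probably worth adding once you have tested against real data. If the combined OR conditions end up scanning the table, splitting the cases into separate queries joined with UNION ALL may let each one use its own index.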

You can reverse the phone number in the ETL process.

Some details are missing that would be needed to give better advice.

NOTE: if you cannot add columns and indexes to the existing tables, simply create a different table to hold the reversed number, related to the existing table. You can also add triggers to keep this table up to date.
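A rough sketch of that side-table approach is shown here. The table, index, and trigger names are hypothetical, and it assumes CustDataByANI.Id is an int key and that cANI fits in varchar(80); adjust both to the real schema.

```sql
-- Hypothetical side table keyed by the Id of the source row.
CREATE TABLE dbo.CustANIReversed (
    Id int NOT NULL PRIMARY KEY,
    ReversedPhoneNumber varchar(80) NULL
);
GO
CREATE NONCLUSTERED INDEX IX_CustANIReversed_Reversed
    ON dbo.CustANIReversed (ReversedPhoneNumber);
GO

-- Initial load from the existing data.
INSERT INTO dbo.CustANIReversed (Id, ReversedPhoneNumber)
SELECT Id, REVERSE(cANI) FROM dbo.CustDataByANI;
GO

CREATE TRIGGER dbo.TR_CustDataByANI_SyncReversed
ON dbo.CustDataByANI
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Remove the side rows for every affected source row
    -- (deleted covers UPDATE and DELETE; inserted is included defensively).
    DELETE r
    FROM dbo.CustANIReversed AS r
    WHERE r.Id IN (SELECT Id FROM deleted UNION SELECT Id FROM inserted);

    -- Re-add the current values for inserted or updated rows.
    INSERT INTO dbo.CustANIReversed (Id, ReversedPhoneNumber)
    SELECT Id, REVERSE(cANI) FROM inserted;
END
GO
```

Lookups would then join dbo.CustANIReversed back to CustDataByANI on Id after filtering on ReversedPhoneNumber.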

solution1
0 2014-09-26 17:45:52
