Phone number lookups using incoming numbers of varying lengths, with or without prefixes

Systems:
- Call center phone system: incoming calls are accompanied by the source number, aka Caller ID, aka "ANI".
- SQL Server 2005: the "data warehouse", which stores customer phone numbers ([cANI]), customer names, locations, etc. in a table, [CustDataByANI].
- Stored procedure: the call center software passes the Caller ID (aka ANI) as a parameter to an SP, which uses it in a presently simple SELECT statement on the CustDataByANI table: ... WHERE [cANI] = @ANI.
This works wonderfully, provided the query finds an exact match between @ANI and one of the 686K cANI values in the 'lookup table'. This happens only about 12% of the time.

Goal: to increase the number of successful "likely/probable" matches.

Important note: we are using a global data set and cannot enforce rules regarding the length of either value (the parameter @ANI or the values within [cANI]).

Case 1:
The phone system transmits the source number '9876543210', which is used as the parameter @ANI.
That exact number exists in the [cANI] column of the CustDataByANI table (record # 55555).
The SELECT statement returns the values from a number of other columns associated with record 55555. Super-easy: WHERE [cANI] = @ANI is successful.

Case 2:
@ANI = '19876543210' (same as above but with a leading '1')
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '9876543210' (still record 55555)
Even a child would recognize that the only difference from Case 1 is the presence of a 1-digit 'prefix' in the parameter @ANI; perhaps it is a long-distance 'tag' or a country code.
Such prefixes could be 1, 2, or even 3 digits in length; we can't predict. We do not want to consider prefixes longer than 3 digits, but in this case we do want to return the values from record 55555, as in Case 1.

Case 3: the 'reverse' of Case 2
@ANI = '9876543210'
No exact match found within CustDataByANI.cANI
Closest match in [cANI] is '19876543210' (record # 55555 now has a '1' prefix)
Again, we would assume that the two are substantially equivalent. In this case, the [cANI] value contains the longer sequence because of the prefix, which could be 1, 2, or even 3 digits in length; we cannot predict. We do not want to consider prefixes longer than 3 digits, but in this case we do want to return the values from record 55555, as in Case 1.

Again, due to the possible variability in the length of each value (@ANI and [cANI]) and my near-complete lack of SQL programming experience, I cannot write a SELECT statement for the stored procedure that takes all 3 cases into account. Simple "LIKE" statements with wildcards seem to fail, and my head is spinning on CASE criteria, CONTAINS, and even REVERSE strategies for 'reading' both the @ANI and the cANI values in a right-to-left fashion.

My dream is to return the best probable match between the two.
My Stupid Procedure is below; any and all assistance is greatly appreciated!

BTW, my source table CustDataByANI does include a RevANI column, which is simply the cANI values in reverse. Initially I had thought that the solution might lie in reversing the @ANI parameter value as well and finding the greatest match within the [RevANI] column, thereby leaving any wildcards on the right of each. But I'm still stuck and am not sure if that is the best strategy....

USE [GCC]
GO
/****** Object:  StoredProcedure [dbo].[SP_GetCustDataByANI]    Script Date: 10/07/2014 07:47:34 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[SP_GetCustDataByANI] 
    @ANI varchar(80)
AS
BEGIN

    SET NOCOUNT ON;

--Remove leading zeros from the varchar @ANI. I chose this method rather than risking
--the undesirable introduction of exponential notation when long characters are converted to
--integers and back...

    IF ((LEN(@ANI) > 1) AND (LEFT(@ANI,1)= '0'))
    BEGIN
        SET @ANI = REPLACE(LTRIM(REPLACE(@ANI,'0',' ')),' ','0')
    END

    SELECT Id
        ,cANI
        ,cServiceClass
        ,cCompanyClass
        ,cContactName
        ,cContactDivision
        ,cContactDepartment
        ,cCompanyName
        ,cOrganizationName
        ,cContactCity
        ,cContactStateTerr
        ,cContactCountry
        ,cCompanyIsDistributor
        ,PrefAgentID
        ,PrefAgentID_SQUAL
        ,PrefRegionID_SQUAL
        ,VIP_CC
        ,VIP_TS
        ,TS_ACAT
    FROM [dbo].[CustDataByANI]
    WHERE ([cANI] = @ANI);

END
GO

If you want to make the queries faster, you can create an additional column holding the phone number in reverse order, create an index on this column, and then query for the number using a LIKE predicate, passing the searched-for phone number in reverse order as well. This will make the queries as fast as possible. For example, for your sample data, you'd store this in the new ReversedPhoneNumber column:

6543211234     store as: 4321123456
16543211234    store as: 43211234561 
0016543211234  store as: 4321123456100

When you need to query by 6543211234, simply reverse it and look in the reversed column like this:

WHERE ReversedPhoneNumber LIKE '4321123456%'

This would match the number in whichever of those formats it's stored, and very quickly, because it's a simple and fast index seek. (A LIKE that amounts to a "starts with" test can seek the index to find matches.)
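As a rough sketch of how that lookup might look against the question's table: the snippet below assumes the existing RevANI column really holds REVERSE(cANI) and is indexed. The 3-digit cap comes from the question; the TOP 1 / ORDER BY tie-break is my assumption about what "best match" means, and the column list is shortened for brevity.

```sql
-- Sketch only: assumes dbo.CustDataByANI.RevANI = REVERSE(cANI) and is indexed.
DECLARE @ANI varchar(80);
SET @ANI = '9876543210';

DECLARE @RevANI varchar(80);
SET @RevANI = REVERSE(@ANI);

SELECT TOP 1 Id, cANI
FROM dbo.CustDataByANI
WHERE RevANI LIKE @RevANI + '%'            -- stored number ends with @ANI (Case 1 and Case 3)
  AND LEN(RevANI) <= LEN(@RevANI) + 3      -- allow a stored prefix of at most 3 digits
ORDER BY LEN(RevANI);                      -- exact match first, then the shortest prefix
```

Note that this only covers the case where the stored number is the longer one; an incoming number with an extra prefix (Case 2) needs a separate rule.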

As to the additional rules that you need to apply, you know the data better than we do. Simply think of all the possible cases and run a few tests, and you'll arrive at the rules you need to apply (not to speed things up, but to ensure a correct match).
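For instance, one rule taken straight from the question (the incoming number may carry a prefix of up to 3 digits that the stored number lacks) could be tried as a handful of exact matches on the trimmed number. This is only a sketch: @MinLen is a hypothetical threshold I added so that very short leftovers do not match, and its value would need tuning against the real data.

```sql
-- Sketch only: handles an incoming @ANI that is longer than the stored cANI (Case 2).
DECLARE @ANI varchar(80);
SET @ANI = '19876543210';

DECLARE @MinLen int;
SET @MinLen = 7;                           -- hypothetical threshold, not from the original post

SELECT TOP 1 Id, cANI
FROM dbo.CustDataByANI
WHERE cANI IN (@ANI,                       -- exact match (Case 1)
               SUBSTRING(@ANI, 2, 80),     -- 1-digit prefix removed
               SUBSTRING(@ANI, 3, 80),     -- 2-digit prefix removed
               SUBSTRING(@ANI, 4, 80))     -- 3-digit prefix removed
  AND LEN(cANI) >= @MinLen                 -- ignore matches on tiny fragments
ORDER BY LEN(cANI) DESC;                   -- longest match first, so exact beats trimmed
```

Because every candidate is compared with plain equality, an ordinary index on cANI should be usable here; no reversed column is needed for this direction.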

You can reverse the phone number in the ETL process.
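If the ETL tool cannot do the reversal itself, a post-load step in T-SQL might look like the following. It assumes the RevANI column from the question already exists on the table; the index name is made up for illustration.

```sql
-- Sketch only: refresh the reversed column after each load.
UPDATE dbo.CustDataByANI
SET RevANI = REVERSE(cANI)
WHERE RevANI IS NULL
   OR RevANI <> REVERSE(cANI);             -- touch only rows that are missing or stale

-- One-time setup; hypothetical index name. Supports the "starts with" LIKE seek.
CREATE INDEX IX_CustDataByANI_RevANI
    ON dbo.CustDataByANI (RevANI);
```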

Some details are missing, so I can't give better advice than this.

NOTE: if you cannot add columns and indexes to the existing table, simply create a different table to hold the reversed number, related to the existing table. You can also add triggers to maintain this table.
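A minimal sketch of that side-table approach follows. The table, index, and trigger names are hypothetical, and it assumes that Id is a unique int key on CustDataByANI, which the question does not confirm.

```sql
-- Hypothetical side table keyed by the main table's Id (assumed unique int).
CREATE TABLE dbo.CustANIReversed (
    Id     int         NOT NULL PRIMARY KEY,
    RevANI varchar(80) NOT NULL
);
GO
CREATE INDEX IX_CustANIReversed_RevANI
    ON dbo.CustANIReversed (RevANI);
GO

-- Keeps the side table in step with inserts, updates, and deletes on the main table.
CREATE TRIGGER dbo.TR_CustDataByANI_SyncReversed
ON dbo.CustDataByANI
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Remove the side-table rows for anything that was updated or deleted...
    DELETE r
    FROM dbo.CustANIReversed AS r
    JOIN deleted AS d ON d.Id = r.Id;

    -- ...then add the current values for inserted or updated rows.
    INSERT INTO dbo.CustANIReversed (Id, RevANI)
    SELECT i.Id, REVERSE(i.cANI)
    FROM inserted AS i
    WHERE i.cANI IS NOT NULL;
END
GO
```

A lookup would then seek dbo.CustANIReversed by RevANI and join back to CustDataByANI on Id to fetch the customer columns.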
