Fuzzy match with Regular Expression
This is my code so far:
for element in address1:
    z = re.match("^\d+$", element)
    if z:
        get_best_fuzzy("1 DEEPALI", address1)
In the code above, I am trying to find the matching addresses in a text file. I would like an exact match on the house number and an approximate match (say 80%) on the rest of the address. However, the code produces neither output nor an error.
Below is a sample of my addresses:
002 TOWER NO. 7 UNIWORLD GARDEN SEC. 47 SOWA ROAD GURGAON Haryana 122001 India
002 TOWER NO. 7 UNIWORLD GARDEN SECTOR-47 SONA ROAD GURGAON Haryana 122001 India
09;SHIVALIK BUNGLAOW; ANANDNAGAR CROSS ROAD; NEAR MADHUR HALL;SATELLITE;
AHMEDABAD Gujarat 380015 India
1 DEEPALI; PITAMPURA DELHI Delhi 110034 India
10; BRIGHTON TOWERS; CROSS ROAD NO.2; LOKHANDWALA COMPLEX; ANDHERI WEST MUMBAI Maharashtra 400053 India
100 Vaishali; Pitampura Delhi Delhi 110034 India
100 Vaishali; Pitampura; DELHI Delhi 110034 India
Please explain in detail, as I am new to this.
^ : asserts position at the start of a line
\d : matches a digit
+ : matches the preceding token one or more times
$ : asserts position at the end of a line
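So your regex ^\d+$ matches only strings such as 1 or 100 exactly, with no additional characters after the digits. If the elements of address1 are full address lines like the ones in your sample, none of them match, the if branch is never entered, and you see neither output nor an error.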
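To see this concretely, here is a small check. It assumes the elements of address1 are full address lines like those in the question; the variable names are made up for illustration.

```python
import re

# Two sample lines from the question; assumed to be what address1 holds.
samples = [
    "1 DEEPALI; PITAMPURA DELHI Delhi 110034 India",
    "100 Vaishali; Pitampura Delhi Delhi 110034 India",
]

# ^\d+$ requires the whole string to be digits, so full address lines fail.
anchored = [bool(re.match(r"^\d+$", s)) for s in samples]

# A string consisting only of digits does match.
digits_only = bool(re.match(r"^\d+$", "100"))

print(anchored)     # [False, False]
print(digits_only)  # True
```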
To match the house number at the start of the string, try ^\d+ instead:
>>> import re
>>> element = "1 DEEPALI"
>>> z = re.match(r'^\d+', element)
>>> z
<_sre.SRE_Match object; span=(0, 1), match='1'>
>>> z.group(0)
'1'
>>> if z:
...     print('A match is found!')
...
A match is found!
You can test your regex with an online regex tester such as https://regex101.com/
I'm not sure what your function get_best_fuzzy does. The problem could also lie there.
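Since get_best_fuzzy is not shown, here is a rough sketch of one way to combine an exact house-number match with an approximate match on the rest of the address. It is not your function: split_address and find_matches are hypothetical names, and the similarity score comes from the standard library's difflib.SequenceMatcher, which may differ from whatever fuzzy library you are using.

```python
import re
from difflib import SequenceMatcher

def split_address(address):
    """Return (house_number, rest), or (None, address) if there are no leading digits."""
    m = re.match(r"\s*(\d+)\W*(.*)", address)
    if m is None:
        return None, address.strip()
    return m.group(1), m.group(2).strip()

def find_matches(query, addresses, threshold=0.8):
    """Keep addresses whose house number equals the query's and whose rest is similar."""
    q_num, q_rest = split_address(query)
    if q_num is None:
        return []
    matches = []
    for address in addresses:
        num, rest = split_address(address)
        if num != q_num:  # exact (string) match on the house number
            continue
        # Case-insensitive similarity of the remaining text, between 0 and 1.
        score = SequenceMatcher(None, q_rest.lower(), rest.lower()).ratio()
        if score >= threshold:
            matches.append((address, round(score, 2)))
    return matches

addresses = [
    "002 TOWER NO. 7 UNIWORLD GARDEN SEC. 47 SOWA ROAD GURGAON Haryana 122001 India",
    "002 TOWER NO. 7 UNIWORLD GARDEN SECTOR-47 SONA ROAD GURGAON Haryana 122001 India",
    "1 DEEPALI; PITAMPURA DELHI Delhi 110034 India",
    "100 Vaishali; Pitampura Delhi Delhi 110034 India",
    "100 Vaishali; Pitampura; DELHI Delhi 110034 India",
]

results = find_matches(addresses[1], addresses)
for address, score in results:
    print(score, address)
```

Note that this compares whole remainders. A short query such as "1 DEEPALI" would score well below 0.8 against a full address line, so for that use case you would need a partial-match score or a full address as the query. House numbers are also compared as strings, so "002" and "2" are treated as different.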