This is my code so far:
for element in address1:
    z = re.match("^\d+$", element)
    if z:
        get_best_fuzzy("1 DEEPALI", address1)
With the above code, I am trying to find matching addresses in a text file. I would like an exact match on the house number and an approximate match on the rest of the address, say 80% similarity. However, the code produces neither output nor an error.
Below is the sample for my addresses:
002 TOWER NO. 7 UNIWORLD GARDEN SEC. 47 SOWA ROAD GURGAON Haryana 122001 India
002 TOWER NO. 7 UNIWORLD GARDEN SECTOR-47 SONA ROAD GURGAON Haryana 122001 India
09;SHIVALIK BUNGLAOW; ANANDNAGAR CROSS ROAD; NEAR MADHUR HALL;SATELLITE; AHMEDABAD Gujarat 380015 India
1 DEEPALI; PITAMPURA DELHI Delhi 110034 India
10; BRIGHTON TOWERS; CROSS ROAD NO.2; LOKHANDWALA COMPLEX; ANDHERI WEST MUMBAI Maharashtra 400053 India
100 Vaishali; Pitampura Delhi Delhi 110034 India
100 Vaishali; Pitampura; DELHI Delhi 110034 India
Please be explanatory as I am new to this.
^ : asserts position at the start of a line
\d : matches a digit
+ : matches between one and unlimited times
$ : asserts position at the end of a line
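So your regex string ^\d+$ would only match strings such as 1 or 100 that consist entirely of digits, with no additional characters after them. If each element of address1 is a full address line, the pattern never matches, z is always None, and get_best_fuzzy is never called, which would explain why you see neither output nor an error.
To match just the house number at the start of the address, try ^\d+ instead: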
>>> import re
>>> element = "1 DEEPALI"
>>> z = re.match(r'^\d+', element)
>>> z
<_sre.SRE_Match object; span=(0, 1), match='1'>
>>> z.group(0)
'1'
>>> if z:
... print('A match is found!')
...
A match is found!
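To see the difference on a few of the sample addresses from the question, the two patterns can be compared side by side. This is only a small sketch; the list name `addresses` and the other variable names are illustrative and not from your code.

```python
import re

# Sample lines copied from the question.
addresses = [
    "1 DEEPALI; PITAMPURA DELHI Delhi 110034 India",
    "100 Vaishali; Pitampura Delhi Delhi 110034 India",
    "10; BRIGHTON TOWERS; CROSS ROAD NO.2; LOKHANDWALA COMPLEX; ANDHERI WEST MUMBAI Maharashtra 400053 India",
]

# Anchored at both ends: matches only if the whole string is digits.
whole_line = [bool(re.match(r"^\d+$", a)) for a in addresses]

# Anchored at the start only: captures the leading house number.
house_numbers = [re.match(r"^\d+", a).group(0) for a in addresses]

print(whole_line)     # [False, False, False]
print(house_numbers)  # ['1', '100', '10']
```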
You can test your regex using an online regex tester such as https://regex101.com/
I'm not sure what your function get_best_fuzzy does. The problem could be arising from there.
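Since I can't see your get_best_fuzzy, here is one possible way such a function could be written. It is a hypothetical stand-in, not your implementation. It assumes the house number is the leading run of digits, compares house numbers as exact strings (so "002" and "2" would not match), and scores the rest of the address with `difflib.SequenceMatcher` from the standard library. The helper `split_address` and the `threshold` parameter are names I made up for illustration.

```python
import re
from difflib import SequenceMatcher


def split_address(text):
    """Split an address into (house_number, rest).

    house_number is None when the text does not start with digits.
    """
    m = re.match(r"\s*(\d+)\W*(.*)", text)
    if not m:
        return None, text.strip()
    return m.group(1), m.group(2).strip()


def get_best_fuzzy(query, candidates, threshold=0.8):
    """Hypothetical sketch: exact house number, fuzzy match on the rest.

    Returns (best_candidate, score), or (None, 0.0) if nothing qualifies.
    """
    q_num, q_rest = split_address(query)
    if q_num is None:
        return None, 0.0
    best, best_score = None, 0.0
    for cand in candidates:
        c_num, c_rest = split_address(cand)
        if c_num != q_num:
            continue  # the house number must match exactly
        # Case-insensitive similarity in [0, 1] for the remaining text.
        score = SequenceMatcher(None, q_rest.lower(), c_rest.lower()).ratio()
        if score >= threshold and score > best_score:
            best, best_score = cand, score
    return best, best_score


address1 = [
    "002 TOWER NO. 7 UNIWORLD GARDEN SECTOR-47 SONA ROAD GURGAON Haryana 122001 India",
    "1 DEEPALI; PITAMPURA DELHI Delhi 110034 India",
    "100 Vaishali; Pitampura Delhi Delhi 110034 India",
]
query = "002 TOWER NO. 7 UNIWORLD GARDEN SEC. 47 SOWA ROAD GURGAON Haryana 122001 India"
match, score = get_best_fuzzy(query, address1)
print(match)
```

With these inputs the "SECTOR-47 SONA ROAD" line should be returned, because it shares the house number 002 and the remaining text is nearly identical. Note that a short query such as "1 DEEPALI" compared against a full address will score well below 80% with this measure, so the query and the candidates should be of comparable length.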