How to use re.sub on European street addresses to remove trailing letters/digits

Question

I'm working with a DataFrame (df) in which street addresses need their trailing house numbers removed.

Streets:

A. Jakšto g. 2
Stumbrų g. 26A
M. K. Paco g. 19
Birželio 23-iosios g. 15
Grigiškių m. Kovo 11-osios g. 43
Laisvės pr. 87

Would need to be transformed to:

A. Jakšto g.
Stumbrų g.
M. K. Paco g.
Birželio 23-iosios g.
Grigiškių m. Kovo 11-osios g.
Laisvės pr.

Yes, I know this isn't a place where someone will do the work for me, and everything is Google-able, but I'm feeling really stuck here, even after reading the documentation ^^

Answer 1

Looks like you need to remove the last whitespace-separated token from each line.

Demo:

s = """A. Jakšto g. 2
Stumbrų g. 26A
M. K. Paco g. 19
Birželio 23-iosios g. 15
Grigiškių m. Kovo 11-osios g. 43
Laisvės pr. 87"""

print("\n".join(" ".join(i.split()[:-1]) for i in s.splitlines()))

Output:

A. Jakšto g.
Stumbrų g.
M. K. Paco g.
Birželio 23-iosios g.
Grigiškių m. Kovo 11-osios g.
Laisvės pr.
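
Note that the split approach drops the last token unconditionally, so an address with no house number would lose part of its street name. A minimal sketch of a guarded variant, assuming house numbers are digits with at most one trailing letter (drop_house_number is a hypothetical helper name, not part of the original answer):

```python
def drop_house_number(street: str) -> str:
    """Remove the last token only if it looks like a house number (e.g. '2', '26A')."""
    head, sep, tail = street.strip().rpartition(" ")
    # Ignore one trailing letter so that '26A' is checked as '26'.
    core = tail[:-1] if tail[-1:].isalpha() else tail
    if sep and core.isdigit():
        return head.rstrip()
    # No space, or the last token is not a number: keep the address as it is.
    return street.strip()


streets = ["Stumbrų g. 26A", "Birželio 23-iosios g. 15", "Laisvės pr."]
cleaned = [drop_house_number(s) for s in streets]
```

An address such as "Laisvės pr." with no number is returned unchanged, and "23-iosios" in the middle of a name is never touched because only the final token is inspected.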

Or use a regex that matches a trailing number with an optional letter suffix (such as 2A):

import re  

s = """A. Jakšto g. 2A
Stumbrų g. 26A
M. K. Paco g. 19
Birželio 23-iosios g. 15
Grigiškių m. Kovo 11-osios g. 43
Laisvės pr. 87"""

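# Remove the trailing house number (digits plus an optional letter) together
# with the spaces in front of it, so no trailing whitespace is left behind.
print(re.sub(r"[ \t]*\d+[A-Za-z]?$", "", s, flags=re.M))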
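
Since the question is about values stored one per row rather than one multi-line string, the same idea can be applied per value. This is a sketch under assumptions: strip_house_number and HOUSE_NUMBER are hypothetical names, and the pattern assumes a house number is digits plus at most one ASCII letter at the very end of the string.

```python
import re

# Optional whitespace, digits, an optional single ASCII letter, then end of string.
HOUSE_NUMBER = re.compile(r"\s*\d+[A-Za-z]?\s*$")


def strip_house_number(street: str) -> str:
    """Return the address without its trailing house number, if there is one."""
    return HOUSE_NUMBER.sub("", street)


streets = [
    "A. Jakšto g. 2A",
    "Birželio 23-iosios g. 15",
    "Grigiškių m. Kovo 11-osios g. 43",
    "Laisvės pr. 87",
]
cleaned = [strip_house_number(s) for s in streets]
print("\n".join(cleaned))
```

On a pandas column, the same pattern string could be passed to Series.str.replace with regex=True. The column name depends on your DataFrame, so it is left out here.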

solution1
1 ACCEPTED 2020-03-13 11:26:27