Extract words from string before new line

Question

I recently asked how to extract the words in a string that come before a number, to help me sort some data. The solution works perfectly until a string has no number and the words are followed only by a new line.

The solution was written by codenewbie:

import re

strings = '''
Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test
'''

for s in strings.split('\n'):
    if s != '':
        print(re.findall(r'(.+?)\d', s)[0])

This gives

Hi my name is hazza 
Hi hazza 
hazza

This is perfect, but it fails with an IndexError if a string has no number after the words, only a new line:

import re

strings = '''
Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza test test test
'''

for s in strings.split('\n'):
    if s != '':
        print(re.findall(r'(.+?)\d', s)[0])

I need it to give me

Hi my name is hazza 
Hi hazza 
hazza 
hazza hazza

I have tried

import re

strings = '''
Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza
test test test
'''

while True:
    try:
        for s in strings.split('\n'):
            if s != '':
                print(re.findall('(.+?)\d',s)[0])
    except IndexError:
        print(s.split('/n'))

But I am not sure where to put the break, or whether there is a better way.

Any help would be greatly appreciated

Edit:

I have these strings, for example:

Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza
test test test

The code written by codenewbie works fine for the first three strings but not the last.

I need the output to look like

Hi my name is hazza 
Hi hazza 
hazza 
hazza hazza

Answer 1

You can use re.match() with the pattern [^\d]* to match the leading run of non-digit characters:

import re

strings = '''
Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza test test test
'''

for s in strings.splitlines():
    if s != '':
        print(re.match(r'[^\d]*',s)[0])

Prints:

Hi my name is hazza 
Hi hazza 
hazza 
hazza hazza test test test
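
To see why this avoids the IndexError from the question, here is a minimal sketch comparing the two patterns on a line with no digit. The helper name `prefix_before_digit` is hypothetical and only introduced for illustration:

```python
import re


def prefix_before_digit(line):
    """Return the leading run of non-digit characters in `line`.

    Hypothetical helper, not part of the original answer.
    """
    # [^\d]* can match the empty string, so re.match never returns None here.
    return re.match(r'[^\d]*', line)[0]


line = 'hazza hazza test test test'

# The original pattern requires a digit, so findall returns an empty list
# and indexing that list with [0] raises IndexError.
print(re.findall(r'(.+?)\d', line))                             # []

# The non-digit pattern returns the whole line when there is no digit.
print(repr(prefix_before_digit(line)))                          # 'hazza hazza test test test'
print(repr(prefix_before_digit('Hi hazza 60 test test test')))  # 'Hi hazza '
```

Because the pattern accepts zero characters, a line that starts with a digit yields an empty string rather than an error.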

EDIT: Based on the comments, the new version:

import re

strings = '''Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza
test test test
'''

for s in re.findall(r'(.*?)(?:\n\n|\n$)', strings, flags=re.S):
    print(re.match(r'(.*?)(?=\d|\n)', s)[0])

Prints:

Hi my name is hazza 
Hi hazza 
hazza 
hazza hazza
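
A possible variation, sketched under the assumption that records are separated by blank lines and that only the text before the first digit or line break is wanted. It splits on blank lines and uses one character class instead of a lookahead, so a one-line record with no digit also works (the lookahead version would return None from re.match for such a record). The function name `leading_words` is hypothetical:

```python
import re


def leading_words(text):
    """Yield, for each blank-line-separated record, the text before the
    first digit or newline. Hypothetical helper for illustration only."""
    # Assumption: records are separated by one or more blank lines.
    for record in re.split(r'\n\s*\n', text.strip()):
        # [^\d\n]* stops at the first digit or newline and can match the
        # empty string, so re.match never returns None.
        yield re.match(r'[^\d\n]*', record)[0]


strings = '''Hi my name is hazza 50 test test test

Hi hazza 60 test test test

hazza 50 test test test

hazza hazza
test test test
'''

for words in leading_words(strings):
    print(words)
```

With the sample data this should print the same four lines as above. Calling .rstrip() on each result would drop the trailing space left before the number, if that matters.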


solution1
0 ACCEPTED 2020-06-20 12:10:13
