
How to add a line break after a number?

I have limited Python knowledge, so I'm having a lot of trouble fixing this.

After extracting text from a PDF file and doing a small cleanup, I got the following result:

"BARRINE QLD 4872ARCHDALE VIC 3475ARCHDALE JUNCTION VIC 3475ARCHER NT 0830ARCHER RIVER QLD 4892" (This is a small sample from a much larger result.)

Is there a way to add a line break after each number? So, instead of the string above, I'd have something similar to this:

'BARRINE  QLD 4872',  
'ARCHDALE  VIC 3475'

I tried reading different articles about this, but perhaps due to my lack of knowledge I simply can't figure it out!

This is not the most elegant solution, but something like this might work:

text = "BARRINE  QLD 4872ARCHDALE  VIC 3475ARCHDALE JUNCTION  VIC 3475ARCHER  NT 0830ARCHER RIVER  QLD 4892"

def split_at_numbers(text):
    """Split text into sections, ending each section after a run of digits."""
    out = []
    temp_str = ""
    for i, char in enumerate(text):
        temp_str += char
        # A section ends at a digit that is either the last character
        # or is followed by a non-digit.
        at_end = i == len(text) - 1
        if char.isdigit() and (at_end or not text[i + 1].isdigit()):
            out.append(temp_str)
            temp_str = ""
    # Keep any trailing text that does not end in a digit.
    if temp_str:
        out.append(temp_str)
    return out

print(split_at_numbers(text))

# output: ['BARRINE  QLD 4872', 'ARCHDALE  VIC 3475', 'ARCHDALE JUNCTION  VIC 3475', 'ARCHER  NT 0830', 'ARCHER RIVER  QLD 4892']

The loop above iterates over each character and checks whether it is (1) a digit and (2) not followed by another digit. When both conditions hold, the section collected so far is appended to a list and a new section starts. The list of sections is returned at the end.
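If regular expressions are an option, the same idea can probably be written more compactly with the standard `re` module. This is only a sketch, not a replacement for the loop: the helper names `split_with_regex` and `add_line_breaks` are made up for illustration, and it assumes every record is some non-digit text followed by a run of digits, as in the sample.

```python
import re

text = ("BARRINE  QLD 4872ARCHDALE  VIC 3475ARCHDALE JUNCTION  VIC 3475"
        "ARCHER  NT 0830ARCHER RIVER  QLD 4892")

def split_with_regex(text):
    # \D+ matches a run of non-digits, \d+ the run of digits right after it.
    # Trailing text with no digits after it would not be returned.
    return re.findall(r"\D+\d+", text)

def add_line_breaks(text):
    # Put a newline after each run of digits that is followed by a non-digit,
    # so the final record does not get a trailing newline.
    return re.sub(r"(\d+)(?=\D)", r"\1\n", text)

records = split_with_regex(text)
print(records)
# ['BARRINE  QLD 4872', 'ARCHDALE  VIC 3475', 'ARCHDALE JUNCTION  VIC 3475', 'ARCHER  NT 0830', 'ARCHER RIVER  QLD 4892']

print(add_line_breaks(text))
```

`split_with_regex` gives a list like the loop does, while `add_line_breaks` returns a single string with one record per line, which is closer to what the question literally asks for.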

From there, the data should be easy to work with.
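For example, each entry could then be broken into its parts. The sketch below assumes every entry ends with a state abbreviation and a postcode separated by whitespace, which holds for the sample but may not hold for the full data. `parse_record` is a hypothetical helper name.

```python
records = ['BARRINE  QLD 4872', 'ARCHDALE JUNCTION  VIC 3475', 'ARCHER  NT 0830']

def parse_record(record):
    # Split from the right so multi-word place names stay in one piece.
    name, state, postcode = record.rsplit(None, 2)
    # The postcode stays a string so a leading zero (e.g. "0830") is not lost.
    return name, state, postcode

parsed = [parse_record(r) for r in records]
for name, state, postcode in parsed:
    print(name, state, postcode, sep=" | ")
```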


 