Trying the limits of split command in python - list index out of range

I am new to programming, and I am trying to split an input string as follows:

string = ['1981+198-19871*1981/555'] --> ['1981','198','19871','1981','555'] 

I am using two nested for loops, and I cannot understand why I get the error 'list index out of range':

operatori = ["+","-","*","/"]
string = ['1981+198-19871*1981/555']

for operatore in operatori:
   for i in range(len(string)):
        string = string[i].split(operatore)
        print(operatore)

Don't reinvent the wheel. Let the standard library work for you:

Python 3.7.5 (default, Dec 15 2019, 17:54:26) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.split(r'\W+', '1981+198-19871*1981/555')
['1981', '198', '19871', '1981', '555']
>>>

You can even treat anything that is not a digit as a separator:

>>> re.split(r'\D+', '1981+198-19871*1981/555abc12')
['1981', '198', '19871', '1981', '555', '12']
>>> 

And, finally, if you only want to split on the operators +, *, / and -, use a character class (the - goes last so that it is taken literally rather than as a range):

>>> re.split('[+*/-]', '1981+198-19871*1981/555abc12')
['1981', '198', '19871', '1981', '555abc12']
>>>
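
One caveat worth keeping in mind with the character-class split: a leading operator or two operators in a row produce empty strings in the result, and any sign information is lost. The sketch below uses a made-up input (`expr` is not from the question) to show this, and filters the empty pieces out afterwards:

```python
import re

expr = '-5+3--2'  # hypothetical input with a leading and a doubled operator

parts = re.split(r'[+*/-]', expr)
print(parts)    # ['', '5', '3', '', '2']

# Drop the empty strings left by adjacent or leading operators.
numbers = [p for p in parts if p]
print(numbers)  # ['5', '3', '2']
```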

Here are two ways to solve your task.

The first uses no imports; the second uses the re module with a character class built from the escaped operators:

import re

operators = ['+', '-', '*', '/']
strings = ['1981+198-19871*1981/555']


def split_string(data: list, operators: list):
    for elm in data:
        out = ''
        for k in elm:
            if k not in operators:
                out += k
            else:
                yield out
                out = ''
        if out:
            yield out  # yield the last element


def re_split_string(data: list, operators: list):
    for elm in data:
        escaped = ''.join([re.escape(operator) for operator in operators])
        if escaped:
            pattern = r'[{operators}]'.format(operators=escaped)
            yield from re.split(pattern, elm)
        else:
            yield elm


first = list(split_string(strings, operators))
print(first)
second = list(re_split_string(strings, operators))
print(second)

Output:

['1981', '198', '19871', '1981', '555']
['1981', '198', '19871', '1981', '555']

PS: To compare the performance of the two methods, you can use a much longer string, for example strings = ['1981+198-19871*1981/555' * 1000]

Results on my machine:

In [1]: %timeit split_string(strings, operators)                                                                    
211 ns ± 0.509 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [2]: %timeit re_split_string(strings, operators)                                                                 
211 ns ± 0.49 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

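Note that both functions are generators, so these calls only create a generator object and never run the loop body. The near-identical timings therefore reflect that fixed overhead rather than the cost of splitting. For a meaningful comparison, the generators have to be consumed, for example with list(split_string(strings, operators)).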
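
A sketch of how the two generators could be timed so that the splitting work actually runs, using the standard `timeit` module and wrapping each call in `list(...)`. The functions are condensed copies of the ones above, `REPEATS` is an arbitrary choice, and no timings are shown because they depend on the machine:

```python
import re
import timeit

operators = ['+', '-', '*', '/']
strings = ['1981+198-19871*1981/555' * 1000]


def split_string(data, operators):
    # Character-by-character split, as in the first method above.
    for elm in data:
        out = ''
        for k in elm:
            if k not in operators:
                out += k
            else:
                yield out
                out = ''
        if out:
            yield out


def re_split_string(data, operators):
    # Regex split on a character class of escaped operators.
    escaped = ''.join(re.escape(op) for op in operators)
    for elm in data:
        if escaped:
            yield from re.split('[{}]'.format(escaped), elm)
        else:
            yield elm


REPEATS = 20  # arbitrary; raise it for more stable numbers

for fn in (split_string, re_split_string):
    # list(...) forces the generator to run to completion on every call.
    seconds = timeit.timeit(lambda: list(fn(strings, operators)), number=REPEATS)
    print('{}: {:.1f} us per call'.format(fn.__name__, seconds / REPEATS * 1e6))
```

Both functions return the same pieces for this input, so the comparison is like for like.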
