Python string split on multiple characters

Question

df = pd.DataFrame({'columnA': ['apple:50-100(+)', 'peach:75-125(-)', 'banana:100-150(+)']})

I'm new to regular expressions. If I want to split 'apple:50-100(+)' (and the other example strings above) into separate DataFrame columns, as below, what's the best way to do that?

Desired output:

Answer 1

I can update the regex if you provide more details on the format.

import pandas as pd

df = pd.DataFrame({'columnA': ['apple:50-100(+)', 'peach:75-125(-)', 'banana:100-150(+)']})

pattern = r"(.*):(\d+)-(\d+)\(([+-])\)"

new_df = df['columnA'].str.extract(pattern)

df :

             columnA
0    apple:50-100(+)
1    peach:75-125(-)
2  banana:100-150(+)

new_df :

        0    1    2  3
0   apple   50  100  +
1   peach   75  125  -
2  banana  100  150  +
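As a possible refinement, named groups in the pattern give the extracted columns readable labels instead of 0 to 3. This is a minimal sketch; the group names (name, start, end, strand) are assumptions chosen for illustration, not something the question specifies.

```python
import pandas as pd

df = pd.DataFrame({'columnA': ['apple:50-100(+)', 'peach:75-125(-)', 'banana:100-150(+)']})

# Same pattern as above, but each group is named; str.extract uses
# the group names as column labels. The names are illustrative only.
pattern = r"(?P<name>.*):(?P<start>\d+)-(?P<end>\d+)\((?P<strand>[+-])\)"

new_df = df['columnA'].str.extract(pattern)

# str.extract returns strings; convert the numeric columns explicitly.
new_df = new_df.astype({'start': int, 'end': int})
```

Rows that do not match the pattern would come back as NaN, in which case the integer conversion fails, so this assumes every row follows the format shown.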

Answer 2

re.split can be used to split on any substring that matches a pattern. For the example you have given, the following should work:

re.split(r'[\:\-\(\)]+', your_string)

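It splits the string on every run of colons, hyphens and parentheses (":", "-", "(" and ")"). Because the hyphen is itself a separator, a trailing "(-)" is consumed entirely and the minus sign is lost; only a "+" sign survives the split.

Since the string ends with a separator, the resulting list ends with an empty string. You can either slice this off: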

re.split(r'[\:\-\(\)]+', your_string)[:-1]

Or filter out empty values (in Python 3, filter returns an iterator, so wrap it in list):

list(filter(None, re.split(r'[\:\-\(\)]+', your_string)))
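To see how the split behaves on each sample string, here is a small sketch that compares the character-class split with a stricter variant. The helper name split_record and the lookaround pattern are assumptions for illustration; it presumes the fields always look like name:start-end(sign).

```python
import re

samples = ['apple:50-100(+)', 'peach:75-125(-)', 'banana:100-150(+)']

# Character-class split: every ':', '-', '(' and ')' is a separator,
# so the '-' inside '(-)' is consumed together with the parentheses.
naive = [list(filter(None, re.split(r'[:\-()]+', s))) for s in samples]


def split_record(text):
    """Hypothetical helper: treat '-' as a separator only between digits."""
    parts = re.split(r':|(?<=\d)-(?=\d)|[()]', text)
    # Drop the empty string left by the trailing ')'.
    return [p for p in parts if p]


careful = [split_record(s) for s in samples]
```

With the stricter pattern the sign inside the parentheses is kept for every sample, at the cost of a less readable regex; extracting with capture groups avoids the problem altogether.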

Answer 3

Here is an alternative:

Python 3.7.5 (default, Oct 17 2019, 12:16:48) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> import pandas as pd
>>> split_it = re.compile(r'(\w+):(\d+)[-](\d+)\((.)\)')
>>> df = pd.DataFrame(split_it.findall('apple:50-100(+)'))
>>> df
       0   1    2  3
0  apple  50  100  +
>>>
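The session above handles a single string. A rough sketch of extending it to every value in the column might look like the following; parse_rows is a hypothetical helper, and raising on a non-matching string is an assumed design choice rather than part of the original answer.

```python
import re

# Same compiled pattern as in the session above.
split_it = re.compile(r'(\w+):(\d+)[-](\d+)\((.)\)')


def parse_rows(strings):
    """Hypothetical helper: return one tuple of captured fields per string."""
    rows = []
    for s in strings:
        match = split_it.fullmatch(s)
        if match is None:
            # Fail loudly rather than silently dropping a malformed row.
            raise ValueError(f"unexpected format: {s!r}")
        rows.append(match.groups())
    return rows


rows = parse_rows(['apple:50-100(+)', 'peach:75-125(-)', 'banana:100-150(+)'])
```

The resulting list of tuples can be passed straight to pd.DataFrame(rows), optionally with a columns argument to label the fields.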

3 answers

solution1
4 ACCEPTED 2019-12-10 03:18:09

solution2
0 2019-12-10 03:00:08

solution3
0 2019-12-10 03:03:45
