In python or R I want a more efficient way to string split a text in a column into four columns

Question

I have a column named BREADS with 5 rows. I want to split each value into 4 columns, namely B, REA, D and S.


BREADS
>2319-22-<21
>1513-16-<19
>1319-25-<22
>1617-21-<25
>1011-15-<17

Desired outcome


B, REA , D, S    ### column names
>23 , 19-22 , - , <21
>15 , 13-16 , - , <19
>13 , 19-25 , - , <22
>16 , 17-21 , - , <25
>10 , 11-15 , - , <17

# Key: > greater than and < less than, - hyphen in the column 'D'

My attempt

###### in python
# for column 'B'
df['B'] = df['BREADS'].astype(str).str[0:3]   # returns '>23', '>15', ..., '>10'


#### in R 

library(stringr)
str_split_fixed(df$BREADS, "", 2)

Answer 1

An option with extract from tidyr in R

library(dplyr)
library(tidyr)
df1 %>% 
 extract(BREADS, into = c('B', 'REA', 'D', 'S'),
        '^(>..)(\\d{2}-\\d{2})(-)(.*)')

-output

#    B   REA D   S
#1 >23 19-22 - <21
#2 >15 13-16 - <19
#3 >13 19-25 - <22
#4 >16 17-21 - <25
#5 >10 11-15 - <17

data

df1 <- structure(list(BREADS = c(">2319-22-<21", ">1513-16-<19", ">1319-25-<22", 
">1617-21-<25", ">1011-15-<17")), class = "data.frame", row.names = c(NA, 
-5L))

Answer 2

For Python:

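# (start, stop) slice offsets for each new column; assumes every value is 12 characters wide
d = {'B': (0, 3), 'REA': (3, 8), 'D': (8, 9), 'S': (9, 12)}
for i in d:
    df[i] = df['BREADS'].apply(lambda x: x[d[i][0]:d[i][1]])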
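
The same fixed-width idea can also be written without a per-row lambda. The following is only a sketch: the `offsets` name and its values are assumptions based on the five sample rows in the question, and it will give wrong results if any row is not laid out exactly like '>2319-22-<21'.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"BREADS": [">2319-22-<21", ">1513-16-<19", ">1319-25-<22",
                              ">1617-21-<25", ">1011-15-<17"]})

# Hypothetical (start, stop) offsets; they assume every value is
# exactly 12 characters wide.
offsets = {"B": (0, 3), "REA": (3, 8), "D": (8, 9), "S": (9, 12)}

for name, (start, stop) in offsets.items():
    # Series.str.slice works on the whole column at once.
    df[name] = df["BREADS"].str.slice(start, stop)

print(df)
```

If the widths can vary between rows, a regular expression is likely the safer choice.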

Answer 3

You can use pandas str.extract to pull the data into separate columns; the assumption here is that the data is uniform for each row:

pattern = r"(?P<B>>.{2})(?P<REA>.{2}-.{2})(?P<D>-)(?P<S><.{2})"

df.BREADS.str.extract(pattern)

     B    REA  D    S
0  >23  19-22  -  <21
1  >15  13-16  -  <19
2  >13  19-25  -  <22
3  >16  17-21  -  <25
4  >10  11-15  -  <17
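
str.extract returns a new DataFrame rather than modifying df, so the result still has to be attached to the original column. A possible way to do that is sketched below. The stricter, anchored, digits-only pattern and the `parts`, `unmatched` and `result` names are assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"BREADS": [">2319-22-<21", ">1513-16-<19", ">1319-25-<22",
                              ">1617-21-<25", ">1011-15-<17"]})

# Same layout as the pattern above, but anchored and limited to digits
# (an assumption about the data that the question does not state).
pattern = r"^(?P<B>>\d{2})(?P<REA>\d{2}-\d{2})(?P<D>-)(?P<S><\d{2})$"

# One column per named group; rows that do not match become NaN.
parts = df["BREADS"].str.extract(pattern)

# Flag rows that did not fit the expected layout.
unmatched = parts["B"].isna()

# Put the four new columns next to the original one.
result = df.join(parts)

print(result)
print("rows that did not match:", int(unmatched.sum()))
```

Checking `unmatched` before relying on the new columns makes it easier to spot rows where the uniformity assumption fails.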

3 answers

Answer 1  2  2020-12-02 20:30:59
Answer 2  2  ACCEPTED  2020-12-02 20:44:06
Answer 3  1  2020-12-02 21:00:59
