Splitting a single column in R into 5 separate columns

Question

I need to split a single data frame column (ID) into five separate columns (A, B, C, D, E). The columns should be split as follows:

A - First Letter
B - All numbers until the second letter
C - All letters until the last number
D - Last number
E - Last letter

Here's an example:

Before

        ID Conc
1 A01HGF1a  132
2 D02SDV2b  453

After

  A  B   C D E Conc
1 A 01 HGF 1 a  132
2 D 02 SDV 2 b  453

I have attempted to use separate() from tidyr but cannot figure out how to utilize regex properly. Any help is much appreciated!

Here is what I have attempted so far:

`separate(df, ID, into = c("A", "B", "C", "D","E"), sep = "(^.)(\\d+)(\\S+)(\\d+)(\\S+)")`

Answer 1

You could use sub here for a base R option:

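# A: the first character
df$A <- sub("^(\\w).*", "\\1", df$ID)
# B: the digits right after the first character
df$B <- sub("^\\w(\\d+).*", "\\1", df$ID)
# C: the non-digits that follow those digits
df$C <- sub("^\\w\\d+(\\D+).*", "\\1", df$ID)
# D: the last run of digits, before the trailing non-digits
df$D <- sub(".*?(\\d+)\\D+$", "\\1", df$ID)
# E: the trailing non-digits
df$E <- sub(".*?(\\D+)$", "\\1", df$ID)
df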

        ID Conc A  B   C D E
1 A01HGF1a  132 A 01 HGF 1 a
2 D02SDV2b  453 D 02 SDV 2 b
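
If you would rather do the split in one pass, a single regex with five capture groups may also work. The following is only a sketch under the assumption that every ID has the shape letter, digits, letters, digits, letters, as in the two sample rows. It uses `strcapture()` from base R's utils package (R 3.4.0 or later); the names `pattern`, `proto`, `parts`, and `result` are placeholders chosen for this example.

```r
# Example data copied from the question
df <- data.frame(
  ID = c("A01HGF1a", "D02SDV2b"),
  Conc = c(132, 453),
  stringsAsFactors = FALSE
)

# One capture group per target column, in order:
# single letter, digits, letters, digits, trailing letters
pattern <- "^([A-Za-z])([0-9]+)([A-Za-z]+)([0-9]+)([A-Za-z]+)$"

# Prototype data frame: supplies the names and types of the new columns.
# B and D are kept as character so a leading zero such as "01" survives.
proto <- data.frame(
  A = character(), B = character(), C = character(),
  D = character(), E = character(),
  stringsAsFactors = FALSE
)

# strcapture() returns one row per input string, one column per group
parts <- strcapture(pattern, df$ID, proto)

# Put the untouched Conc column back next to the new columns
result <- cbind(parts, Conc = df$Conc)
result
```

An ID that does not match the assumed shape should come back as a row of `NA` values rather than raising an error, so it is worth checking for `NA` afterwards. Since the question mentions tidyr, the same `pattern` can probably be passed to `tidyr::extract(df, ID, into = c("A", "B", "C", "D", "E"), regex = pattern)`, which takes capture groups; the `sep` argument of `separate()` describes the delimiter between pieces, not the pieces themselves.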

solution1
0 ACCEPTED 2018-01-17 04:57:09
