Extracting parts of character string with gsub

Question

I am quite new to R and am working with a script written by my supervisor and me. Unfortunately, I am unable to reuse one of its gsub() calls for the names of my samples. The previous sample names looked like this (Anterior and Posterior varied throughout the df):

"1: Anterior LN_60_026.fcs"

and was taken apart using

cell.counts$EH_ID <- gsub("\\d+: (Anterior|Posterior) LN_(\\d{2})_\\d{3}.fcs", "LM02\\2", cell.counts$Sample)
cell.counts$Position <- gsub("\\d+: (Anterior|Posterior) LN_(\\d{2})_\\d{3}.fcs", "\\1", cell.counts$Sample)

Now I am faced with a similar problem that requires a minor adjustment. Because I don't know how the gsub() syntax works, I am stuck with:

"1: mLN_681_030.fcs"

for which mLN and spleen vary throughout the df, and the code that I tried to adapt no longer works:

cells$Mouse_ID <- gsub("\\d+: (mLN|spleen)(_\\d{2})_\\d{3}_\\.fcs", "AA_0\\2", cells$Sample)
cells$tissue <- gsub("\\d+: (mLN|spleen)_(\\d{3})_\\d{3}.fcs", "\\1", cells$Sample)

I should add that the "tissue" separation works; it's the sample number extraction that doesn't. If anyone could explain to me what I am doing wrong and what the characters in this code do specifically, I'd be very grateful. PS: Yes, I have used ?gsub, but I find the help files in R quite beginner-unfriendly and didn't understand much.

Answer 1

Your mouse ID pattern expects exactly 2 digits in the second capture group (\\d{2}), but the sample number has 3. It also has a stray underscore before \\.fcs that the file names do not contain.

Also, in the second regex you have not escaped the dot. It still works because an unescaped . matches any character, but it should be \\. as below.

# > str <- "1: mLN_681_030.fcs"
# > gsub(str, pattern="\\d+: (mLN|spleen)(_\\d{3})_\\d{3}\\.fcs", replacement = "AA_0\\2")
# [1] "AA_0_681"
# > gsub(str, pattern = "\\d+: (mLN|spleen)_(\\d{3})_\\d{3}\\.fcs", replacement = "\\1")
# [1] "mLN"
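
Since the question also asks what the individual characters do, here is a rough annotated sketch. Note that the first result above is "AA_0_681" because the underscore sits inside the second capture group. The version below moves it outside, on the assumption that an ID like "AA_0681" is what is wanted. The second sample name, the ^ and $ anchors, and the use of sub() instead of gsub() are my own additions for illustration, not part of the original script.

```r
# Hypothetical sample names modelled on the one in the question.
samples <- c("1: mLN_681_030.fcs", "2: spleen_682_031.fcs")

# The pattern, piece by piece (backslashes are doubled inside an R string):
#   ^             start of the string (added here so only whole names match)
#   \\d+          one or more digits                    ("1")
#   ": "          a literal colon followed by a space
#   (mLN|spleen)  capture group 1: either tissue name
#   _             a literal underscore, outside any group
#   (\\d{3})      capture group 2: exactly three digits ("681")
#   _\\d{3}       underscore plus three digits ("_030"), not captured
#   \\.fcs        a literal dot followed by "fcs"
#   $             end of the string
pattern <- "^\\d+: (mLN|spleen)_(\\d{3})_\\d{3}\\.fcs$"

# In the replacement, \\1 and \\2 refer back to capture groups 1 and 2.
# sub() replaces the first match only, which is enough for a whole-string match.
tissue   <- sub(pattern, "\\1", samples)
mouse_id <- sub(pattern, "AA_0\\2", samples)

print(tissue)    # "mLN" "spleen"
print(mouse_id)  # "AA_0681" "AA_0682"

# If a pattern does not match, the input comes back unchanged. That is what
# happens with \\d{2}: after "68" the pattern needs "_" but finds "1".
unchanged <- sub("^\\d+: (mLN|spleen)_(\\d{2})_\\d{3}\\.fcs$", "AA_0\\2", samples)
print(identical(unchanged, samples))  # TRUE

# An unescaped dot matches any single character; an escaped one only a dot.
loose  <- grepl("\\d{3}.fcs$", "030Xfcs")    # TRUE
strict <- grepl("\\d{3}\\.fcs$", "030Xfcs")  # FALSE
```

With the data frame from the question, the same calls would be applied to cells$Sample in place of the samples vector.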
