I am quite new to R and am working now with a script that was done by me and my supervisor. Unfortunately I am unable to reuse one instance of gsub() for names of my samples. The previous version looked like this (Anterior and Posterior varied throughout the df):
"1: Anterior LN_60_026.fcs"
and was taken apart using
cell.counts$EH_ID <- gsub("\\d+: (Anterior|Posterior) LN_(\\d{2})_\\d{3}.fcs", "LM02\\2", cell.counts$Sample)
cell.counts$Position <- gsub("\\d+: (Anterior|Posterior) LN_(\\d{2})_\\d{3}.fcs", "\\1", cell.counts$Sample)
Now I am faced with a similar problem which requires some minor adjustment. Because I don't know how gsub() syntax works I am stuck with:
"1: mLN_681_030.fcs"
for which mLN and spleen vary throughout the df and the code that I tried to adapt doesn't work anymore:
cells$Mouse_ID <- gsub("\\d+: (mLN|spleen)(_\\d{2})_\\d{3}_\\.fcs", "AA_0\\2", cells$Sample)
cells$tissue <- gsub("\\d+: (mLN|spleen)_(\\d{3})_\\d{3}.fcs", "\\1", cells$Sample)
I should add that the "tissue" separation works; it's the sample number extraction that doesn't. If anyone could explain what I am doing wrong and what the characters in this code do specifically, I'd be very grateful. PS: Yes, I have used ?gsub, but I find the help files in R quite beginner-unfriendly and didn't understand much.
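In your mouse ID line, the second capture group expects exactly 2 digits (\\d{2}), but your sample number has 3, and there is a stray underscore before the file extension (_\\.fcs) that does not occur in your sample names. Because the pattern never matches, gsub() returns the input unchanged. Also, in the second regex you have not escaped the dot. That still works, because an unescaped . matches any character, but it should be \\. to match a literal dot, as below.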
str <- "1: mLN_681_030.fcs"
gsub(str, pattern = "\\d+: (mLN|spleen)(_\\d{3})_\\d{3}\\.fcs", replacement = "AA_0\\2")
# [1] "AA_0_681"
gsub(str, pattern = "\\d+: (mLN|spleen)_(\\d{3})_\\d{3}\\.fcs", replacement = "\\1")
# [1] "mLN"
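Since you asked what the individual characters do, here is a rough piece-by-piece sketch of the pattern. The sample names in `samples` are made up for illustration, modelled on the one in your question. Note that this variant puts the underscore outside the second capture group, so the result is "AA_0681" rather than "AA_0_681"; which of the two you want is an assumption on my part.

```r
# Hypothetical sample names, modelled on the one in the question
samples <- c("1: mLN_681_030.fcs", "2: spleen_682_031.fcs")

# The pattern, piece by piece (backslashes are doubled inside R strings):
#   \\d+          one or more digits ("1", "2", ...)
#   ": "          a literal colon followed by a space
#   (mLN|spleen)  capture group 1: either "mLN" or "spleen"
#   _             a literal underscore
#   (\\d{3})      capture group 2: exactly three digits
#   _\\d{3}       an underscore and three more digits (not captured)
#   \\.fcs        a literal dot followed by "fcs"
pattern <- "\\d+: (mLN|spleen)_(\\d{3})_\\d{3}\\.fcs"

# \\1 and \\2 in the replacement refer back to the capture groups
mouse_id <- gsub(pattern, "AA_0\\2", samples)
tissue   <- gsub(pattern, "\\1", samples)

print(mouse_id)  # "AA_0681" "AA_0682"
print(tissue)    # "mLN"     "spleen"

# The pattern from the question never matches, so gsub() silently
# returns the input unchanged instead of raising an error
old_pattern <- "\\d+: (mLN|spleen)(_\\d{2})_\\d{3}_\\.fcs"
old_matches <- grepl(old_pattern, samples)
old_result  <- gsub(old_pattern, "AA_0\\2", samples)

print(old_matches)  # FALSE FALSE
print(old_result)   # identical to samples
```

Checking a pattern with grepl() first is a quick way to tell "the regex does not match" apart from "the replacement is wrong", which is otherwise easy to confuse because gsub() does not warn on a non-match.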