Removing a character from within a vector element

I have a vector of strings:

str.vect<-c ("abcR.1", "abcL.1", "abcR.2", "abcL.2")

 str.vect
[1] "abcR.1" "abcL.1" "abcR.2" "abcL.2"

How can I remove the third character from the right in each vector element?

Here is the desired result:

"abc.1" "abc.1" "abc.2" "abc.2"

Thank you very much in advance

You can use nchar to find the length of each element of the vector

> nchar(str.vect)
[1] 6 6 6 6

Then you combine this with strtrim to get the beginning of each string

> strtrim(str.vect, nchar(str.vect)-3)
[1] "abc" "abc" "abc" "abc"

To get the end of the word you can then use substr (actually, you could use substr to get the beginning too...)

> substr(str.vect, nchar(str.vect)-1, nchar(str.vect))
[1] ".1" ".1" ".2" ".2"

And finally you use paste0 (which is paste with sep="" ) to stick them together

> paste0(strtrim(str.vect, nchar(str.vect)-3), # Beginning
         substr(str.vect, nchar(str.vect)-1, nchar(str.vect))) # End
[1] "abc.1" "abc.1" "abc.2" "abc.2"

There are easier ways if you know your strings have some special characteristics

For instance, if the length is always 6 you can directly substitute the nchar calls with the appropriate value.
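As a minimal sketch of that shortcut, assuming every element really is exactly 6 characters long (the constants 3, 5 and 6 follow only from that assumption, and fixed.result is just an illustrative name):

```r
str.vect <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")

# Assumes every string has exactly 6 characters:
# keep characters 1-3, skip character 4, keep characters 5-6.
fixed.result <- paste0(strtrim(str.vect, 3), substr(str.vect, 5, 6))
fixed.result
```

If the lengths can differ, the nchar-based version is the safer choice.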


EDIT: Alternatively, R also supports regular expressions, which make this task much easier.

> gsub(".(..)$", "\\1", str.vect)
[1] "abc.1" "abc.1" "abc.2" "abc.2"

The syntax is a bit more obscure, but not that difficult once you know what you are looking at.

The first parameter ( ".(..)$" ) is what you want to match

. matches any character, $ denotes the end of the string. So ...$ indicates the last 3 characters in the string.

We put the last two in parentheses so that they are captured and can be reused in the replacement.

The second parameter tells R what to substitute for the matched substring. In our case we put \\1, which means "whatever was in the first pair of parentheses".

So essentially this command means: "find the last three characters in the string and replace them with the last two".
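Because the pattern is anchored to the end of the string, it should not depend on the strings having equal length. A small sketch to illustrate this, where mixed.vect is a made-up vector that is not part of the question:

```r
# Hypothetical input with elements of different lengths
mixed.vect <- c("abcR.1", "abcdefL.2", "xyR.3")

# Drop the third character from the right, whatever the length
mixed.result <- gsub(".(..)$", "\\1", mixed.vect)
mixed.result
```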

The solution provided by @nico seems fine, but a simpler alternative might be to use sub :

sub('.(.{2})$', '\\1', str.vect)

This searches for the pattern: "any character (represented by . ) followed by 2 of any character (represented by .{2} ), followed by the end of the string (represented by $ )". By wrapping the .{2} in parentheses, R captures whatever those last two characters were. The second argument is the string to replace the matched substrings with. In this case, we refer to the first string captured in the matched pattern. This is represented by \\1 . (If you captured multiple parts of the pattern, with multiple sets of parentheses, you would refer to subsequent captured regions with, e.g., \\2 , \\3 , etc.)
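To illustrate multiple capture groups, here is a sketch of a variant that captures both the part before the removed character and the last two characters. It is only one possible way to write this, and two.groups is an illustrative name:

```r
str.vect <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")

# \\1 = everything before the character to drop
# \\2 = the last two characters
two.groups <- sub("^(.*).(.{2})$", "\\1\\2", str.vect)
two.groups
```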

str.vect<-c ("abcR.1", "abcL.1", "abcR.2", "abcL.2")

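# Split each string into a vector of single characters
a <- strsplit(str.vect, split = "")
# Blank out the 4th character of each element, then glue the characters back together
b <- unlist(lapply(a, FUN = function(x) {
  x[4] <- ""
  paste(x, collapse = "")
}))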

If you want to parameterize it further, replace the 4 with a variable holding the index of the character you want to remove.
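A sketch of what that parameterized version might look like. The helper name remove.char.at and its idx argument are hypothetical and not part of base R:

```r
# Hypothetical helper: drop the character at position idx from every string
remove.char.at <- function(x, idx) {
  chars <- strsplit(x, split = "")
  # ch[-idx] keeps every character except the one at position idx
  vapply(chars, function(ch) paste(ch[-idx], collapse = ""), character(1))
}

str.vect <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")
param.result <- remove.char.at(str.vect, 4)
param.result
```

Here idx counts from the left, so it matches "third from the right" only because all the example strings have 6 characters.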

Not sure how general or efficient this is, but it seems to work with your example string:

(This seems very similar to nico's answer although I am not using the strtrim function.)

my.string <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")

n.char <- nchar(my.string)
the.beginning <- substr(my.string, 1, n.char-3)    # everything before the character to drop
the.end <- substr(my.string, n.char-1, n.char)     # the last two characters

new.string <- paste0(the.beginning, the.end)
new.string

# [1] "abc.1" "abc.1" "abc.2" "abc.2"

Remove the third character from the right of each element.

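# Note: this removes every occurrence of that character, not only the one at that position
sapply(str.vect, function(x) gsub(substr(x, nchar(x)-2, nchar(x)-2), "", x, fixed = TRUE), USE.NAMES = FALSE)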

This is a very quick and dirty answer, but that's what is needed sometimes:

 #Define vector
 str.vect  <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")

 #Use gsub to remove both 'R' and 'L' independently.  
 str.vect2 <- gsub("R", '', str.vect )
 str.vect_final <- gsub("L", '', str.vect2 )

 > str.vect_final
 [1] "abc.1" "abc.1" "abc.2" "abc.2"
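The two gsub calls could probably be folded into one by using a character class. A sketch, with the same caveat as above that it removes every R or L anywhere in the string (one.pass is an illustrative name):

```r
str.vect <- c("abcR.1", "abcL.1", "abcR.2", "abcL.2")

# [RL] matches either an "R" or an "L"; every match is removed
one.pass <- gsub("[RL]", "", str.vect)
one.pass
```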
