R - gsub a specific character in a specific position

Question

I would like to delete the last character of a variable. I was wondering whether it is possible to select a position with gsub and delete the character at that particular position.

In this example, I want to delete the last digit, the one after the E, for my 4 variables.

variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
gsub(pattern = '[[:xdigit:]]{8}.', replacement = '', x = variables)

I thought we could use {} in order to select a specific position.

Answer 1

You can do it by capturing all the characters but the last:

variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
gsub('^(.*).$', '\\1', variables)

Explanation:

^ - Start of the string
(.*) - All characters but a newline, up to the last character (captured in group 1)
.$ - The last character (matched with . ) before the end of string ( $ ).

Thus, this regex is good to use if you plan to remove the final character and the string does not contain a newline.

Output:

[1] "B10243E" "B10243E" "B10243E" "B10243E"
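
As a rough check of the explanation above, the following sketch compares the pattern from the question with the capture-group pattern. The variable names attempt_matches, attempt_result, trimmed and edge are illustrative only. The question's pattern asks for 8 hex digits followed by one more character, so it needs at least 9 characters and does not match the 8-character inputs.

```r
variables <- c("B10243E1", "B10243E2", "B10243E3", "B10243E4")

# The pattern from the question needs 8 hex digits plus one more character,
# i.e. at least 9 characters, so it matches none of the 8-character strings.
attempt_matches <- grepl("[[:xdigit:]]{8}.", variables)
attempt_result <- gsub("[[:xdigit:]]{8}.", "", variables)

# The capture-group pattern keeps everything except the final character.
trimmed <- gsub("^(.*).$", "\\1", variables)

# Edge cases: an empty string has no last character, so it is left unchanged.
edge <- gsub("^(.*).$", "\\1", c("", "a", "ab"))

print(attempt_matches)
print(attempt_result)
print(trimmed)
print(edge)
```

Here attempt_result is identical to the input, which is why the original call appeared to do nothing.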

To remove only the 8th character (here is a sample where I added T at the end of each item):

variables = c('B10247E1T', 'B10243E2T', 'B10243E3T', 'B10243E4T')
gsub('^(.{7}).', '\\1', variables)

Output of the sample program (note the ET at the end of each item; the digit was removed):

[1] "B10247ET" "B10243ET" "B10243ET" "B10243ET"
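
The same idea can be wrapped in a small function that takes the position as an argument. This is only a sketch: remove_char_at is a hypothetical helper name, not a base R function, and it builds the pattern with sprintf so that the first n - 1 characters are kept and the next one is dropped.

```r
# Hypothetical helper (not part of base R): drop the n-th character of each string.
remove_char_at <- function(x, n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 1)
  # Keep the first n - 1 characters (group 1) and drop the one that follows.
  pattern <- sprintf("^(.{%d}).", as.integer(n) - 1L)
  sub(pattern, "\\1", x)
}

variables <- c("B10247E1T", "B10243E2T", "B10243E3T", "B10243E4T")
result <- remove_char_at(variables, 8)
print(result)

# Strings shorter than n do not match the pattern and are returned unchanged.
short_result <- remove_char_at("abc", 8)
print(short_result)
```

sub is used instead of gsub because the pattern is anchored at the start of the string and can match at most once.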

Answer 2

Try any of these. The first removes the last character, the second replaces E and anything after it with E, the third removes the 8th character (which leaves the first 7 characters if there are 8), the fourth, fifth and sixth each return the first 7 characters, and the last strips trailing digits. All are vectorized, i.e. variables may be a vector of character strings as in the question.

sub(".$", "", variables)

sub("E.*", "E", variables)

sub("^(.{7}).", "\\1", variables)

sub("^(.{7}).*", "\\1", variables)

substr(variables, 1, 7)

substring(variables, 1, 7)

trimws("abc333", "right", "\\d") # the whitespace argument requires R >= 3.6.0
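
The alternatives above agree on the 8-character strings from the question but differ on longer input. As a sketch, here they are applied to 9-character strings with a trailing T; the vector x and the list names are illustrative only.

```r
x <- c("B10247E1T", "B10243E2T")

results <- list(
  drop_last      = sub(".$", "", x),             # removes the final T
  after_E        = sub("E.*", "E", x),           # keeps everything up to the first E
  drop_8th       = sub("^(.{7}).", "\\1", x),    # removes only the 8th character
  first_7_sub    = sub("^(.{7}).*", "\\1", x),   # keeps the first 7 characters
  first_7_substr = substr(x, 1, 7)               # same, without a regex
)

str(results)
```

Which one fits depends on whether the rule is "the last character", "everything after E", or "a fixed position".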

Answer 3

If you always want to remove what follows E, you can match E together with everything after it and replace the match with E:

sub("E(.*)", 'E', variables)
## [1] "B10243E" "B10243E" "B10243E" "B10243E"

Alternatively, you can count 7 characters using a positive lookbehind and remove the character that follows them (the 8th):

sub("(?<=.{7})(.)", "", variables, perl = TRUE)
## [1] "B10243E" "B10243E" "B10243E" "B10243E"
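
To make the lookbehind behaviour concrete, here is a small sketch on strings of different lengths (the vector x is illustrative). With sub, only the first character that has 7 characters before it is removed, and strings shorter than 8 characters are left alone.

```r
x <- c("B10243E1", "B10243E1T", "short")

# (?<=.{7}) asserts that 7 characters precede the match without consuming them,
# so only the 8th character itself is replaced. Lookbehind needs perl = TRUE.
out <- sub("(?<=.{7}).", "", x, perl = TRUE)
print(out)
```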

Answer 4

library(stringr)
str_sub("your String", 1, -2)

This may be slower than the other approaches, but it is a lot easier to read.

Answer 5

You can also use str_sub from the stringr package.

library(stringr)
variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
variables = str_sub(variables, start = 1, end = -2)

Output:

> variables
[1] "B10243E" "B10243E" "B10243E" "B10243E"
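
If stringr is not available, a similar result can be sketched in base R with substr and nchar. The helper name drop_last_char is hypothetical. Unlike a fixed end position such as 7, it adapts to the length of each string.

```r
# Hypothetical helper: keep characters 1 .. (length - 1) of each string.
drop_last_char <- function(x) {
  substr(x, 1, nchar(x) - 1)
}

x <- c("B10243E1", "AB", "")
shortened <- drop_last_char(x)
print(shortened)
```

substr is vectorized over its stop argument, so each element is cut at its own length minus one.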

5 answers

Answer 1: 4, accepted, 2015-05-10 12:40:21
Answer 2: 4, 2015-05-10 12:40:54
Answer 3: 1, 2015-05-10 12:39:43
Answer 4: 1, 2016-09-10 14:39:48
Answer 5: 0, 2021-02-26 18:26:19