How do I remove a fixed number of digits from an R column that also contains words?

Question

I have a vector that contains a series of numbers and words.

df <- as.character(c(1234, "Other", 5678, "Abstain"))

I would like to remove the last two digits of the numbers without affecting the words in the string, so that the result is equivalent to:

df <- as.character(c(12, "Other", 56, "Abstain"))

Answer 1

Probably a bit more robust/versatile/safe than the solution suggested by @r2evans in the comments.

gsub("(^\\d{2,})\\d{2}$", "\\1", df)

What it does:

pattern = "(^\\d{2,})\\d{2}$"

^ matches the start of the string
\\d{2,} matches any substring of at least two digits (delete the comma if you only want to match strings that are exactly 4 digits long)
(^\\d{2,}) the round brackets capture the start of the string plus the following run of at least two digits as a group
\\d{2} matches exactly two digits
$ matches the end of the string

In short: it matches any string that consists solely of digits, starting with at least two digits and ending with two more (so the minimum length of the digit string is 4).
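To see which elements the pattern actually matches, a quick check with grepl() can help. This is only a sketch: the vector `x` is a hypothetical test input, not part of the original question.

```r
# Hypothetical test vector (an assumption for illustration only)
x <- c("1234", "Other", "123", "b12345", "123456")
pattern <- "(^\\d{2,})\\d{2}$"

# grepl() returns TRUE for each element the pattern matches
matches <- grepl(pattern, x)
print(setNames(matches, x))
```

Only the all-digit strings with four or more digits are flagged; "123" is too short and "b12345" does not start with a digit.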

replacement = "\\1"

Replaces the entire matched string with the first defined group ( (^\\d{2,}) ) from the pattern described above.
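The difference between {2,} and {2} in the captured group can be sketched as follows. The vector `x` is a hypothetical example; sub() is used here instead of gsub() because the pattern is anchored at both ends and can therefore match at most once per string.

```r
# Hypothetical input (an assumption for illustration only)
x <- c("1234", "Other", "123456", "Abstain")

# {2,}: two or more leading digits, so longer numbers are trimmed too
any_length <- sub("(^\\d{2,})\\d{2}$", "\\1", x)

# {2}: exactly two leading digits, so only 4-digit strings change
exactly_four <- sub("(^\\d{2})\\d{2}$", "\\1", x)

print(any_length)
print(exactly_four)
```

With the comma removed, "123456" is left untouched because it is not exactly four digits long.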

Sample data:

df <- c(123, "Other", 5678, "Abstain", "b12345", 123456, "123aa345")

gsub("(^\\d{2,})\\d{2}$", "\\1", df)
#[1] "123"      "Other"    "56"       "Abstain"  "b12345"   "1234"     "123aa345"
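Since the title asks about a column, the same call can presumably be applied to a data frame column. A minimal sketch, assuming a hypothetical data frame `votes` with a character column `code` (neither name comes from the question):

```r
# Hypothetical data frame and column names (assumptions for illustration)
votes <- data.frame(
  code = c("1234", "Other", "5678", "Abstain"),
  stringsAsFactors = FALSE
)

# Overwrite the column with the trimmed values; words are left unchanged
votes$code <- gsub("(^\\d{2,})\\d{2}$", "\\1", votes$code)
print(votes$code)
```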

Answer 1
1 Accepted 2018-10-08 17:54:35
