Remove spaces unless they are followed by an uppercase letter

Question

I have data that looks like this:

*first*               *last*
M a rk                Twain
Hun ter               Stockt on Thompson

The data continues like this for n rows. I want it to look like this:

*first*               *last*
Mark                  Twain
Hunter                Stockton Thompson

I know I can use gsub to remove all spaces like this:

gsub(" ", "", x, fixed = TRUE)

And I can identify the pattern with a regex like this:

( [A-Z])

But how can I combine the two to tell gsub to remove all spaces except where they match the regex?

Answer 1

Simplest way:

txt <- c("M a rk", "Twain", "Hun ter", "Stockt on Thompson")
gsub("\\s([a-z])", "\\1", txt)
## [1] "Mark"              "Twain"             "Hunter"            "Stockton Thompson"
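To see why the capture group matters, here is a small sketch (the variable names kept and dropped are illustrative only and do not come from the answer). The pattern consumes both the whitespace character and the lowercase letter after it, so the replacement "\\1" has to put the letter back; without it, the letter is deleted together with the space.

```r
x <- "M a rk"

# With the backreference: the space is dropped, the captured letter is kept.
kept <- gsub("\\s([a-z])", "\\1", x)

# Without it: the space and the letter that follows are both removed.
dropped <- gsub("\\s[a-z]", "", x)

print(c(kept = kept, dropped = dropped))
```

Here kept is "Mark" and dropped is "Mk".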

If you want to apply this to more than one variable in a data.frame, you can do it using lapply and list-style replacement on the data.frame. (Note: You really should not use asterisks in the names of data.frame columns.)

df <- data.frame("*first*" = c("M a rk", "Hun ter"),
                 "*last*" = c("Twain", "Stockt on Thompson"),
                 check.names = FALSE, stringsAsFactors = FALSE)

# names of the text columns you want to clean up
varsToModify <- c("*first*", "*last*")

df[varsToModify] <- lapply(df[varsToModify], 
                           function(x) gsub("\\s([a-z])", "\\1", x))
df
##   *first*            *last*
## 1    Mark             Twain
## 2  Hunter Stockton Thompson

Answer 2

df <- data.frame(`*first*` = c("M a rk", "Hun ter"),
                 `*last*` = c("Twain", "Stockt on Thompson"),
                 check.names = FALSE, stringsAsFactors = FALSE)
df
##   *first*             *last*
## 1  M a rk              Twain
## 2 Hun ter Stockt on Thompson

I would use a Perl negative lookahead assertion:

for (ci in seq_along(df)) df[[ci]] <- gsub(" (?![A-Z])", "", df[[ci]], perl = TRUE)
df
##   *first*            *last*
## 1    Mark             Twain
## 2  Hunter Stockton Thompson

See Regular Expressions as used in R. The discussion of Perl assertions is given near the bottom of the page.
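The two answers give the same result on the sample data, but the patterns are not equivalent. The following sketch compares them on a few extra inputs that I made up for illustration ("Room 1 0 1" and a name with a trailing space); the variable names lower_only and not_upper are likewise hypothetical. The first pattern removes whitespace only when a lowercase letter follows, while the lookahead removes a space whenever the next character is anything other than an uppercase letter, including a digit or the end of the string.

```r
x <- c("Hun ter", "Stockt on Thompson", "Room 1 0 1", "Mark ")

# Answer 1 style: remove whitespace only when a lowercase letter follows.
lower_only <- gsub("\\s([a-z])", "\\1", x)

# Answer 2 style: remove a space unless an uppercase letter follows.
not_upper <- gsub(" (?![A-Z])", "", x, perl = TRUE)

print(data.frame(input = x, lower_only, not_upper, stringsAsFactors = FALSE))
```

Both approaches agree on the first two inputs. On the last two, lower_only leaves the string unchanged, whereas not_upper yields "Room101" and "Mark". Which behaviour is wanted depends on the data, so it may be worth checking a sample before applying either pattern to a whole column.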

Answer 1
1 2016-04-19 16:23:41

Answer 2
0 accepted 2016-04-19 16:21:16