Insert a space into words longer than n characters

Question

I have a character vector.

x <- c('This is a simple text', 'this is a veryyyyyyyyyy long word', 'Replacethis andalsothis')

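I want to insert a space into every word that is longer than n characters, right after its n-th character. For this example, we can take n = 10. I would prefer a regex solution, but I am open to other options if you think there are any.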

The output I am looking for:

c('This is a simple text', 'this is a veryyyyyyy yyy long word', 'Replacethi s andalsothi s')

I tried adapting the solution from this post to my data, but it does not give the desired output.

sub('(.{10})(?=\\S)//g', '\\1 ', x, perl = TRUE)
#[1] "This is a simple text"           "this is a veryyyyyyyy long word" "Replacethis andalsothis"

Answer 1

You can use

gsub("\\b(\\w{10})\\B", "\\1 ", x) # If your words only consist of letters/digits/_
gsub("(?<!\\S)(\\S{10})(?=\\S)", "\\1 ", x, perl=TRUE) # If the "words" are non-whitespace char chunks

See the regex demo and this regex demo, as well as the R demo:

x <- c('This is a simple text', 'this is a veryyyyyyyyyy long word', 'Replacethis andalsothis')
gsub("\\b(\\w{10})\\B", "\\1 ", x)
# => [1] "This is a simple text" "this is a veryyyyyyy yyy long word" "Replacethi s andalsothi s"

x <- c("this is a veryyyyyyy|yyy long word")
gsub("(?<!\\S)(\\S{10})(?=\\S)", "\\1 ", x, perl=TRUE)
# => [1] "this is a veryyyyyyy |yyy long word"
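To make the difference between the two patterns concrete, here is a minimal sketch that runs both on the same input. The variable names `word_based` and `chunk_based` are only illustrative, and `perl = TRUE` is passed to both calls here purely for consistency (the first call above uses R's default regex engine).

```r
# Same input for both patterns: a "word" that contains a non-word character (|)
x <- "this is a veryyyyyyy|yyy long word"

# Word-based pattern: "veryyyyyyy" is exactly ten word characters and is
# followed by "|", which is not a word character, so \B fails and nothing changes.
word_based <- gsub("\\b(\\w{10})\\B", "\\1 ", x, perl = TRUE)

# Chunk-based pattern: "veryyyyyyy|yyy" is a single non-whitespace chunk of
# fourteen characters, so a space is inserted after its tenth character.
chunk_based <- gsub("(?<!\\S)(\\S{10})(?=\\S)", "\\1 ", x, perl = TRUE)

print(word_based)
print(chunk_based)
```

Which variant fits depends on whether punctuation inside a token should count as part of the "word".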

The regexes match...

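\\b - a word boundary
(\\w{10}) - Group 1: ten word characters (letters, digits or _)
\\B - a non-word boundary: matches only if another word character follows immediately on the right (so the tenth word character is not the last character of the word).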

and

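(?<!\\S) - a position at the start of the string or right after a whitespace character
(\\S{10}) - Group 1: ten non-whitespace characters
(?=\\S) - immediately to the right, there must be a non-whitespace character.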
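
Since the question asks about a general n, the second pattern can be built from n rather than hard-coding 10. The following is only a sketch: `split_long_words` is a hypothetical helper name, not part of the original answer or of any package.

```r
# Hypothetical helper (not from the original answer): builds the pattern for any n.
split_long_words <- function(x, n = 10) {
  # (?<!\S)  start of string or a position right after whitespace
  # (\S{n})  group 1: the first n non-whitespace characters of the chunk
  # (?=\S)   at least one more non-whitespace character must follow
  pattern <- sprintf("(?<!\\S)(\\S{%d})(?=\\S)", n)
  gsub(pattern, "\\1 ", x, perl = TRUE)
}

x <- c('This is a simple text', 'this is a veryyyyyyyyyy long word', 'Replacethis andalsothis')
print(split_long_words(x, 10))
print(split_long_words("This is a simple text", 4))
```

Like the patterns above, this inserts at most one space per word, after its n-th character; a word longer than 2n characters is not split a second time.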

Solution 1
2 Accepted 2021-10-27 08:14:49
