Having difficulty using rle within a mutate step in R to count the maximum number of consecutive characters in a word
I created this function to count the maximum number of consecutive identical characters in a word.
max(rle(unlist(strsplit("happy", split = "")))$lengths)
The function works on a single word, but it does not work when I try to use it inside a mutate step. Here is the code involving the mutate step.
text3 <- "The most pressing of those issues, considering the franchise's
stated goal of competing for championships above all else, is an apparent
disconnect between Lakers vice president of basketball operations and general manager"
text3_df <- tibble(line = 1:1, text3)
text3_df %>%
  unnest_tokens(word, text3) %>%
  mutate(
    num_letters = nchar(word),
    num_vowels = get_count(word),
    num_consec_char = max(rle(unlist(strsplit(word, split = "")))$lengths)
  )
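The variables num_letters and num_vowels work fine, but I get 2 for every value of num_consec_char. I can't figure out what I'm doing wrong.
The expression rle(unlist(strsplit(word, split = "")))$lengths is not vectorized. Inside mutate, word is the whole column, so strsplit splits every word, unlist pools all the characters into one vector, and max returns a single value that is recycled to every row. That is why every row shows the same result.
You will need some kind of loop (e.g. for, apply, purrr::map) to apply it to each word separately.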
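To see why every row ends up with the same value, here is a small base R sketch. The three-word vector is an illustrative stand-in for the word column, not the full data from the question.

```r
# Illustrative stand-in for the `word` column inside mutate()
words <- c("the", "pressing", "of")

# strsplit() returns one character vector per word; unlist() pools them all
chars <- unlist(strsplit(words, split = ""))

# rle() now sees one long sequence: t h e p r e s s i n g o f
pooled <- max(rle(chars)$lengths)

# A single number for the whole column, which mutate() recycles to every row
print(pooled)
```

The only run longer than one character here is the "ss" in "pressing", so the pooled maximum is 2 regardless of which row is being filled.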
library(dplyr)
library(tidytext)
text3 <- "The most pressing of those issues, considering the franchise's
stated goal of competing for championships above all else, is an apparent
disconnect between Lakers vice president of basketball operations and general manager"
text3_df <- tibble(line = 1:1, text3)
output <- text3_df %>%
  unnest_tokens(word, text3) %>%
  mutate(
    num_letters = nchar(word),
    # num_vowels = get_count(word),
  )
output$num_consec_char <- sapply(output$word, function(word) {
  max(rle(unlist(strsplit(word, split = "")))$lengths)
})
output
# A tibble: 32 × 4
   line word        num_letters num_consec_char
  <int> <chr>             <int>           <int>
1     1 the                   3               1
2     1 most                  4               1
3     1 pressing              8               2
4     1 of                    2               1
5     1 those                 5               1
6     1 issues                6               2
7     1 considering          11               1
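As an alternative sketch, the per-word logic can be wrapped in a small helper and applied with vapply, which also checks that each result is a single integer. The helper name max_run is made up for this example, and it assumes every word is a non-empty string.

```r
# Hypothetical helper: longest run of identical characters in ONE word.
# Assumes `w` is a single non-empty string.
max_run <- function(w) {
  chars <- strsplit(w, split = "")[[1]]  # characters of this word only
  max(rle(chars)$lengths)                # longest run of repeated characters
}

# Illustrative words taken from the output above
words <- c("the", "pressing", "issues", "considering")

# vapply() calls max_run() once per word and enforces an integer result
runs <- vapply(words, max_run, integer(1), USE.NAMES = FALSE)
print(runs)
```

Because the result has one element per word, the same call should also work directly inside the pipeline, for example mutate(num_consec_char = vapply(word, max_run, integer(1), USE.NAMES = FALSE)) or, with purrr loaded, purrr::map_int(word, max_run).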