
How do I replace numeric interval in string with a mean of interval endpoints in R?

Sample data:

df <- data.frame(A = c("bought, 2.500-2.700,- bar, 1000",
                       "545,-kc, barista 3600-4600kc sells",
                       "about  3-4 thousands",
                       "sold 2.000-3.000,-, table"))

df
      A
[,1]  bought, 2.500-2.700,- bar, 1000
[,2]  545,-kc, barista 3600-4600kc sells
[,3]  about 3-4 thousands
[,4]  sold 2.000-3.000,-, table

I want to replace each interval with the mean of its endpoints. The desired output looks like this:

      A
[,1]  bought, 2.600,- bar, 1000
[,2]  545,-kc, barista 4100kc sells
[,3]  about 3,5 thousands
[,4]  sold 2.500,-, table

How would you do it?

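library(dplyr)
library(stringr)
library(magrittr)  # provides the %<>% assignment pipe

# Compute one replacement per row (assumes exactly one interval per string)
repl <- df$A %>%
  str_extract_all("\\d*\\.?\\d+-\\d*\\.?\\d+") %>%  # extract "a-b"
  str_split("-") %>%                                 # split into the two endpoints
  as.data.frame() %>%                                # one column per row of df
  mutate_all(as.character) %>%
  mutate_all(as.numeric) %>%                         # "." is read as a decimal point
  summarise_all(mean) %>%                            # mean of the endpoints
  mutate_all(as.character) %>%
  unlist()

# Replace the first interval in each string with its mean
df$A %<>% str_replace("\\d*\\.?\\d+-\\d*\\.?\\d+", repl)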

df

Output:

                              A
1       bought, 2.6,- bar, 1000
2 545,-kc, barista 4100kc sells
3          about  3.5 thousands
4             sold 2.5,-, table
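The pipeline above relies on every string containing exactly one interval. As a minimal sketch, assuming the same regular expression and the same reading of "." as a decimal point, the replacement can also be done match by match in base R, which tolerates strings with no interval or with several. The helper name `interval_mean` is hypothetical and not part of the original answer.

```r
# Hypothetical helper: replace every "a-b" interval in each string of x
# with the mean of a and b. "." is treated as a decimal point (assumption).
interval_mean <- function(x) {
  pattern <- "\\d*\\.?\\d+-\\d*\\.?\\d+"
  m <- gregexpr(pattern, x, perl = TRUE)  # all matches in every string
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    # hits holds the matched intervals of one string (possibly none)
    vapply(strsplit(hits, "-", fixed = TRUE), function(ends) {
      as.character(mean(as.numeric(ends)))
    }, character(1))
  })
  x
}

interval_mean("1-3 and 10-20")
# [1] "2 and 15"
```

Applied as `df$A <- interval_mean(df$A)`, this should give the same four strings as the output above. Like the original answer, it prints 2.6 rather than 2.600, so restoring the thousands-style formatting from the question would need an extra formatting step.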

