
Replace entire line with string and replace part of a line with a string

I am trying to clean up the following data set so that the "Changes" field stays reasonably consistent.

Input:

test_data <- data.frame(ID=c('john@xxx.com', 'sally@xxx.com'),
                        Changes=c('3 max cost changes
  productxyz > pb100  > a : Max cost decreased from $0.98 to $0.83
  productxyz > pb2  > a : Max cost decreased from $1.07 to $0.91
  productxyz > pb2  > b : Max cost decreased from $0.65 to $0.55', 
                                  '2 max cost changes
  productabc > Everything else in "auto & truck maintenance" : Max CPC increased from $0.81 to $0.97
  productabc > pb1000  > x : Max cost decreased from $1.44 to $1.22
  productabc > pb10000  > Everything else in pb10000 : Max CPC increased from $0.63 to $0.76'), stringsAsFactors=FALSE)
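  1. I want to remove every line in the given field where the first ">" is followed by "Everything". The entire line should be removed.

  2. Where "Everything" appears after the second ">", I want to replace the text from "Everything" up to the ":" with "q".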

Output:

out_data <- data.frame(ID=c('john@xxx.com', 'sally@xxx.com'),
                        Changes=c('3 max cost changes
  productxyz > pb100  > a : Max cost decreased from $0.98 to $0.83
  productxyz > pb2  > a : Max cost decreased from $1.07 to $0.91
  productxyz > pb2  > b : Max cost decreased from $0.65 to $0.55', 
                                  '2 max cost changes
  productabc > pb1000  > x : Max cost decreased from $1.44 to $1.22
  productabc > pb10000  > q : Max CPC increased from $0.63 to $0.76'), stringsAsFactors=FALSE)

Thanks.

This may not be the best solution, but it works on test_data:

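clean_text <- function(x){
  # Rule 1: delete whole lines where the first ">" is followed by "Everything else in".
  # Only the leading newline is consumed, so the neighbouring lines stay separated.
  x <- gsub("\n[^>\n]*> Everything else in [^\n]*", "", x, perl = TRUE)
  # Rule 2: after the second ">", replace "Everything else in ... :" with "q :".
  x <- gsub("(>[^>\n]*> )Everything else in [^:\n]* :", "\\1q :", x, perl = TRUE)
  x
}
out_data <- test_data
out_data[,2] <- clean_text(test_data[,2])
# cat() prints the embedded newlines instead of escaping them as "\n"
cat(out_data[,2], sep = "\n\n")
3 max cost changes
  productxyz > pb100  > a : Max cost decreased from $0.98 to $0.83
  productxyz > pb2  > a : Max cost decreased from $1.07 to $0.91
  productxyz > pb2  > b : Max cost decreased from $0.65 to $0.55

2 max cost changes
  productabc > pb1000  > x : Max cost decreased from $1.44 to $1.22
  productabc > pb10000  > q : Max CPC increased from $0.63 to $0.76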
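
As an alternative to matching across newlines with a single regex, the two rules can also be applied line by line, which may be easier to reason about. The following is only a sketch under assumptions: the helper name clean_lines is hypothetical (it is not part of the answer above), and it assumes every record in "Changes" uses "\n" as its line separator, as in test_data.

```r
# Hypothetical helper (not from the original answer): clean one multi-line string.
clean_lines <- function(s) {
  lines <- strsplit(s, "\n", fixed = TRUE)[[1]]
  # Rule 1: drop lines where the first ">" is directly followed by "Everything".
  drop <- grepl("^[^>]*> *Everything", lines)
  lines <- lines[!drop]
  # Rule 2: after the second ">", replace "Everything ... :" with "q :".
  lines <- sub("^([^>]*>[^>]*> *)Everything[^:]*:", "\\1q :", lines)
  paste(lines, collapse = "\n")
}

# Sample input copied from the second record of test_data.
changes <- paste(
  "2 max cost changes",
  "  productabc > Everything else in \"auto & truck maintenance\" : Max CPC increased from $0.81 to $0.97",
  "  productabc > pb1000  > x : Max cost decreased from $1.44 to $1.22",
  "  productabc > pb10000  > Everything else in pb10000 : Max CPC increased from $0.63 to $0.76",
  sep = "\n"
)

# vapply() applies the helper to each element, so a whole column can be passed.
cleaned <- vapply(changes, clean_lines, character(1), USE.NAMES = FALSE)
cat(cleaned, sep = "\n")
```

Applied to the data frame, this would be used as out_data$Changes <- vapply(test_data$Changes, clean_lines, character(1), USE.NAMES = FALSE). Splitting first means neither pattern has to deal with newlines, at the cost of an extra pass over each string.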

