R - Multiple search and replace based on partial matches within a data frame column

Question

I have a list of publishers that looks like this:

+--------------+
|  Site Name   |
+--------------+
| Radium One   |
| Euronews     |
| EUROSPORT    |
| WIRED        |
| RadiumOne    |
| Eurosport FR |
| Wired US     |
| Eurosport    |
| EuroNews     |
| Wired        |
+--------------+

I would like to create the following result:

+--------------+----------------+
|  Site Name   | Publisher Name |
+--------------+----------------+
| Radium One   | RadiumOne      |
| Euronews     | Euronews       |
| EUROSPORT    | Eurosport      |
| WIRED        | Wired          |
| RadiumOne    | RadiumOne      |
| Eurosport FR | Eurosport      |
| Wired US     | Wired          |
| Eurosport    | Eurosport      |
| EuroNews     | Euronews       |
| Wired        | Wired          |
+--------------+----------------+

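I would like to understand how to replicate this code that I used in Power Query:

Search the first 4 characters

if Text.Start([Site Name],4) = "WIRE" then "Wired" else

Search the last 3 characters

if Text.End([Site Name],3) = "One" then "RadiumOne" else

If no match is found, add "Rest"

The matching does not have to be case-sensitive.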
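
As a rough illustration, the two Power Query rules quoted above can be read almost literally in base R. This is only a sketch: the data frame name `sites` and the helper vectors `nm`, `first4` and `last3` are made up for the example, and upper-casing before comparing is one assumed way to ignore case. Because only the two rules from the snippet are encoded, every other site falls through to "Rest".

```r
# Site names from the question, held in a data frame column
sites <- data.frame(
  `Site Name` = c("Radium One", "Euronews", "EUROSPORT", "WIRED", "RadiumOne",
                  "Eurosport FR", "Wired US", "Eurosport", "EuroNews", "Wired"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Upper-case once so that the comparisons below ignore case
nm <- toupper(sites[["Site Name"]])

# substr(x, 1, 4) plays the role of Text.Start(x, 4);
# substr(x, nchar(x) - 2, nchar(x)) plays the role of Text.End(x, 3)
first4 <- substr(nm, 1, 4)
last3  <- substr(nm, nchar(nm) - 2, nchar(nm))

# Nested ifelse() mirrors the if / then / else chain, with "Rest" as fallback
sites[["Publisher Name"]] <- ifelse(first4 == "WIRE", "Wired",
                             ifelse(last3 == "ONE", "RadiumOne", "Rest"))

print(sites)
```

Further publishers would need one more `ifelse()` branch each, which is where a lookup table starts to pay off.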

Answer 1

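Using properCase from the ifultools package together with gsub, we replace everything after the first word with "", i.e. delete it, and handle the special case of Radium separately. If you have more exceptions like the Radium case, please update your post with them so that we can find a neater solution :)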

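library("ifultools")  # provides properCase()

siteName=c("Radium One","Euronews","EUROSPORT","WIRED","RadiumOne","Eurosport FR","Wired US","Eurosport","EuroNews","Wired")

# properCase() normalises the case (e.g. "EUROSPORT" becomes "Eurosport"), the inner gsub()
# deletes everything from the first whitespace on, and the outer gsub() fixes the lone "Radium"
publisherName = gsub("^Radium$","Radiumone",gsub("\\s+.*","",properCase(siteName)))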

 # [1] "Radiumone" "Euronews"  "Eurosport" "Wired"     "Radiumone" "Eurosport" "Wired"    
 # [8] "Eurosport" "Euronews"  "Wired"
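
For comparison, here is a minimal base R sketch of a pattern-driven variant that needs no extra package and keeps the "RadiumOne" capitalisation from the desired table (the output above gives "Radiumone"). The `rules` vector is hypothetical: its "^wire" and "one$" entries follow the Power Query snippet in the question, while the "^euron" and "^euros" patterns are assumptions added to cover the remaining rows.

```r
siteName <- c("Radium One", "Euronews", "EUROSPORT", "WIRED", "RadiumOne",
              "Eurosport FR", "Wired US", "Eurosport", "EuroNews", "Wired")

# Hypothetical lookup table: names are regular expressions, values are the
# publisher names to assign; earlier entries take priority over later ones
rules <- c(
  "^wire"  = "Wired",      # first four characters are "wire"
  "one$"   = "RadiumOne",  # last three characters are "one"
  "^euron" = "Euronews",   # assumed pattern, not in the question
  "^euros" = "Eurosport"   # assumed pattern, not in the question
)

# Start from the fallback value, then apply the rules from last to first
# so that an earlier rule overwrites a later one when both match
publisherName <- rep("Rest", length(siteName))
for (i in rev(seq_along(rules))) {
  hit <- grepl(names(rules)[i], siteName, ignore.case = TRUE)
  publisherName[hit] <- rules[[i]]
}

result <- data.frame(`Site Name` = siteName,
                     `Publisher Name` = publisherName,
                     check.names = FALSE, stringsAsFactors = FALSE)
print(result)
```

Adding a publisher then means adding one entry to `rules`, and any site name that matches no pattern stays "Rest".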

Solution 1
0 2016-11-04 12:26:36