
Creating new categories from a text string in R

I have a column with the following rows:

Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access

I want to create a new column based on the values found, i.e.:

if Telephone line rental is found, then in the new column, I want to code as V
if Fixed broadband, then code as B
if Mobile phone = M
if Paid for TV service / TV and/or sport, code as T

So if we take the first row as an example:

Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access

The category will be: V, B, M. The NET: XXX|NET: XXXX parts of the string need to be ignored.

The full portfolio can be any combination of those 4, but they must be in the following order: V, B, M, T.
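The rules above amount to a fixed lookup from service label to letter, applied in a fixed order. Below is a minimal base R sketch of that idea. The names `codes` and `encode_services` are hypothetical, and the label "TV and/or sport" is an assumption read off the rule list, since it does not appear in the sample rows.

```r
# Lookup from service label to letter; the vector order is the required output order.
# "TV and/or sport" is assumed from the question's wording.
codes <- c("Telephone line rental" = "V",
           "Fixed broadband"       = "B",
           "Mobile phone"          = "M",
           "Paid for TV service"   = "T",
           "TV and/or sport"       = "T")

encode_services <- function(x) {
  parts <- strsplit(x, "|", fixed = TRUE)   # split each string on the literal pipe
  vapply(parts, function(p) {
    hit <- names(codes) %in% p              # NET: entries are not in the lookup, so they drop out
    paste(unique(codes[hit]), collapse = ", ")
  }, character(1))
}

encode_services("Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access")
# "V, B, M"
```

Because the letters are read from `codes` rather than from the input, the output order stays V, B, M, T even if the services appear in a different order in the string.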

I have been googling and reading through library(stringr).

I tried splitting up the string with sep = "\\\\|", but it is not working.
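In a regular expression the pipe is the alternation operator, so it has to be escaped or treated as a fixed string before it can act as a separator. If the four backslashes above are what was actually typed, the pattern reads as an escaped backslash followed by an alternation, so it probably does not target the pipe character itself. A short base R check, assuming the goal is simply to break one row into its parts (`x`, `parts_fixed` and `parts_regex` are illustrative names):

```r
x <- "Telephone line rental|Fixed broadband|NET: Internet access"

# Treat the pipe as a plain character
parts_fixed <- strsplit(x, "|", fixed = TRUE)[[1]]

# Or escape it once inside a regular expression
parts_regex <- strsplit(x, "\\|")[[1]]

identical(parts_fixed, parts_regex)  # TRUE

# Drop the NET: summary entries
parts_fixed[!startsWith(parts_fixed, "NET:")]
```

The same `"\\|"` pattern should also work wherever a function expects a regular expression as its separator.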

Any other ideas?

Regards,

DPUT:

c("Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Fixed broadband|Mobile phone|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Mobile phone|NET: No home phone calls|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Mobile phone|NET: No home phone calls|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access", 
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
)

You could use grepl like this:

df <- read.table(text='"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access"',
header=FALSE,stringsAsFactors=FALSE)

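# Labels that count as a TV service. The question lists "Paid for TV service"
# and "TV and/or sport"; "code" there is the verb, not part of a label.
tv <- c("Paid for TV service", "TV and/or sport")

# One letter per service found, pasted in the required order V, B, M, T
df$new_col <- paste(ifelse(grepl("Telephone line rental", df$V1), "V", ""),
                    ifelse(grepl("Fixed broadband", df$V1), "B", ""),
                    ifelse(grepl("Mobile phone", df$V1), "M", ""),
                    ifelse(grepl(paste(tv, collapse = "|"), df$V1), "T", ""))

# paste() leaves blanks where a service is missing; trim them and join with ", "
df$new_col <- gsub(" +", ", ", trimws(df$new_col))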
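The same grepl idea can also be driven by a named pattern vector, which avoids empty strings in the result and yields the comma-separated form the question asks for. This is only a sketch on three rows taken from the question's data. The names `services`, `patterns`, `found` and `portfolio` are hypothetical, and the "TV and/or sport" alternative is an assumption based on the question's wording.

```r
services <- c(
  "Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access",
  "Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access",
  "Telephone line rental|Mobile phone|NET: No home phone calls|NET: No internet access"
)

# Letter -> pattern, listed in the required order V, B, M, T
patterns <- c(V = "Telephone line rental",
              B = "Fixed broadband",
              M = "Mobile phone",
              T = "Paid for TV service|TV and/or sport")

# Logical matrix: one row per string, one column per letter
found <- do.call(cbind, lapply(patterns, grepl, x = services))

# Keep the letters whose column is TRUE, row by row
portfolio <- apply(found, 1, function(r) paste(names(patterns)[r], collapse = ", "))
portfolio
# "V, B, M" "B, M, T" "V, M"
```

Note that "NET: No home phone calls or line rental" does not trigger V, because the pattern requires the full phrase "Telephone line rental".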
