How to manipulate values in a factor column of a data frame
I need to clean up a factor column named phone_number in my data frame. The values must be numeric with length 5 and must not contain special characters. I want to change values in the format AO-11111 or VQ-11111 to 11111, that is, remove the leading characters, and finally convert all remaining values to NA.
My data.frame is derived from a .csv file. Initially, phone_number is a factor with values such as: VQ-40773 VQ-43685 VQ-44986 40270 41694 42623.
The strsplit function will help you extract the value from the string.
str <- "VQ-40773"
strsplit(str, "-")[[1]][2]  # returns "40773" (a character string)
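The same idea can be applied to the whole column rather than a single string. The following is only a sketch: the sample values are hypothetical stand-ins for the question's data, and it assumes each value contains at most one dash.

```r
# Hypothetical sample values mirroring the question's data.
phone_number <- factor(c("VQ-40773", "VQ-43685", "40270"))

# strsplit() needs a character vector, so convert the factor first.
parts <- strsplit(as.character(phone_number), "-", fixed = TRUE)

# Keep the last piece of each element: the part after the dash,
# or the whole value when there is no dash.
digits <- vapply(parts, function(p) p[length(p)], character(1))
```

Taking the last piece instead of the second one means values without a prefix, such as 40270, are kept rather than turned into NA.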
If you want to remove anything that precedes a dash, then:
sub("^([^-]+[-])(.+)", "\\2", phone_number)
> phone_number <- scan(what="")
1: VQ-40773
2: VQ-43685
3: VQ-44986
4: 40270
5: 41694
6: 42623
7:
Read 6 items
> sub("^([^-]+[-])(.+)", "\\2", phone_number)
[1] "40773" "43685" "44986" "40270" "41694" "42623"
> as.numeric(sub("^([^-]+[-])(.+)", "\\2", phone_number))
[1] 40773 43685 44986 40270 41694 42623
The nchar function lets you check the length of each element of a character vector. Post an adequate example and, please, do make a greater effort to get punctuation and capitalization correct.
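Putting the pieces together, one possible way to meet all of the stated requirements (strip the prefix, keep only five-digit values, set everything else to NA) is sketched below. The data frame `df` and the helper `clean_phone` are hypothetical names, and the pattern assumes the prefix is always two uppercase letters followed by a dash, as in the AO- and VQ- examples from the question.

```r
# Hypothetical data frame standing in for the one read from the .csv file.
df <- data.frame(
  phone_number = factor(c("VQ-40773", "AO-11111", "40270",
                          "4027", "12-34567", "abc"))
)

# Hypothetical helper: returns a numeric vector, NA where the value is invalid.
clean_phone <- function(x) {
  # Convert the factor to character first; as.numeric() on a factor
  # would return the internal level codes, not the displayed values.
  x <- as.character(x)
  # Drop a two-letter prefix such as "VQ-" or "AO-" (assumed format).
  x <- sub("^[A-Z]{2}-", "", x)
  # Keep only values that consist of exactly five digits.
  valid <- grepl("^[0-9]{5}$", x)
  x[!valid] <- NA
  as.numeric(x)
}

df$phone_number <- clean_phone(df$phone_number)
```

Here the regular expression does the length check that nchar would otherwise do, and it rejects special characters at the same time. Note that as.numeric drops leading zeros, so if a value like 01234 can occur it may be safer to keep the column as character.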