How to manipulate variables in a factor of a data frame

I need to clean up a factor column named phone_number in my data frame. Valid values must be numeric, have length 5, and contain no special characters. I want to change values in the format AO-11111 or VQ-11111 to 11111, that is, erase the leading prefix, and finally convert all remaining values to NA.

My data.frame is derived from a .csv file. Initially, phone_number is a factor with values such as VQ-40773 VQ-43685 VQ-44986 40270 41694 42623.

The strsplit function will help you get the value out of the string.

 str <- "VQ-40773"
 strsplit(str, "-")[[1]][2]  # returns "40773" (a character string)
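
The single-string call above can be extended to a whole vector. The following is a minimal sketch, assuming the values have already been converted to character; the sample vector and the names `parts` and `last_piece` are illustrative and not part of the original post. Taking the last piece of each split means values without a dash pass through unchanged.

```r
# Hypothetical sample values, taken from the question's example
phone_number <- c("VQ-40773", "VQ-43685", "40270")

# Split each element on a literal "-"; the result is a list with one
# character vector per input element
parts <- strsplit(phone_number, "-", fixed = TRUE)

# Keep the last piece of each split, so values without a dash
# are returned as they are
last_piece <- vapply(parts, function(p) p[length(p)], character(1))
print(last_piece)
```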

If you want to remove anything that precedes a dash, then:

 sub("^([^-]+[-])(.+)", "\\2", phone_number)

> phone_number <- scan(what="")
1:     VQ-40773
2:     VQ-43685
3:     VQ-44986
4:     40270
5:     41694
6:     42623
7: 
Read 6 items
> sub("^([^-]+[-])(.+)", "\\2", phone_number)
[1] "40773" "43685" "44986" "40270" "41694" "42623"
> as.numeric(sub("^([^-]+[-])(.+)", "\\2", phone_number))
[1] 40773 43685 44986 40270 41694 42623
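
Since the question says phone_number is a factor inside a data frame, it may help to convert it to character explicitly before the substitution. This is a sketch under that assumption; the data frame `df` and the column `phone_clean` are hypothetical names introduced for illustration.

```r
# Assumed stand-in for the column as read from the .csv file
df <- data.frame(
  phone_number = factor(c("VQ-40773", "VQ-43685", "40270", "41694"))
)

# Convert the factor to character first; calling as.numeric() directly
# on a factor returns the internal level codes, not the labels
chr <- as.character(df$phone_number)

# Strip everything up to and including the first dash
stripped <- sub("^([^-]+[-])(.+)", "\\2", chr)

# Store the cleaned values as numbers in a new column
df$phone_clean <- as.numeric(stripped)
print(df)
```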

The nchar function allows checking the length of each element of a character vector. Post an adequate example and, please, do make a greater effort to get punctuation and capitalization correct.
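
One possible way to combine nchar with the substitution, so that anything that is not exactly five digits becomes NA, is sketched below. The invalid entries in the sample vector are made up for illustration, and treating "valid" as "exactly five digits after the prefix is removed" is an assumption based on the question's wording.

```r
# Hypothetical input with some invalid entries added for illustration
phone_number <- c("VQ-40773", "40270", "AO-1234", "12a45", "123456")

# Remove anything up to and including the first dash
stripped <- sub("^([^-]+[-])(.+)", "\\2", phone_number)

# Valid means: exactly 5 characters, all of them digits
is_valid <- nchar(stripped) == 5 & grepl("^[0-9]+$", stripped)

# Keep valid values, replace everything else with NA, then convert
cleaned <- ifelse(is_valid, stripped, NA_character_)
cleaned <- as.numeric(cleaned)
print(cleaned)
```

Note that converting to numeric drops any leading zeros; if those matter, the values could be kept as character instead.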
