How to select only numbers from a dataframe in R using which()
I have a large dataframe in R and am trying to run some statistical tests on certain columns, but the non-programmers who made the csv file added a bunch of text notes that I need to ignore.
For example, a column might have the values: 12, 20, 40, missing, 64, 32, no input, 45, 10
How do I select only the numbers using which()? I failed miserably trying:
my_data_frame$Column.Title[which(is.numeric(my_data_frame$Column.Title))]
What do I change in the which() call to select only the numbers and ignore the text? Thanks!
You can use the built-in as.numeric() converter to do something like this:
x <- my_data_frame$Column.Title
xn <- as.numeric(x)
which(!is.na(xn))
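For context on why the original attempt returns nothing: is.numeric() tests the type of the whole vector and returns a single TRUE or FALSE rather than one value per element, so on a text column it yields which(FALSE), which is empty. Below is a minimal sketch of the as.numeric() approach on the example values from the question. The vector x is made-up stand-in data for my_data_frame$Column.Title, and the as.character() and suppressWarnings() wrappers are optional additions, not part of the snippet above.

```r
# Stand-in for my_data_frame$Column.Title (values taken from the question)
x <- c("12", "20", "40", "missing", "64", "32", "no input", "45", "10")

# is.numeric() checks the vector's type once, so this is a single FALSE
is.numeric(x)

# as.character() guards against x being a factor, where as.numeric()
# would return the internal level codes instead of the printed values.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
xn <- suppressWarnings(as.numeric(as.character(x)))

idx  <- which(!is.na(xn))  # positions of entries that parsed as numbers
nums <- xn[idx]            # the numeric values themselves
```

Here idx holds the positions to keep, and nums can be passed straight to the statistical tests.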
Note that this as.numeric() approach won't distinguish between NAs created by failed coercion and pre-existing (numeric) NA values.
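If that distinction matters, one possible way to recover it is to compare the missingness of the original vector with that of the converted one. This is only a sketch: the input vector is hypothetical, and the names pre_existing and failed are illustrative.

```r
# Hypothetical input with one genuine NA among the text notes
x  <- c("12", NA, "missing", "64", "no input", "10")
xn <- suppressWarnings(as.numeric(x))

pre_existing <- is.na(x)               # already NA before conversion
failed       <- is.na(xn) & !is.na(x)  # became NA only through coercion

which(pre_existing)  # positions of the genuine NAs
which(failed)        # positions of the text notes
x[failed]            # the notes themselves, useful for auditing the file
```

Looking at unique(x[failed]) on the real column is also a quick way to see which strings the file's authors used for missing data.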
If the variety of "missing" values is small enough, you could instead read the data in with read.csv(..., na.strings=c("NA","missing","no input")), so that those strings are treated as NA on import.
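A small sketch of the na.strings route, assuming the notes are limited to the two strings from the question. The CSV content is made up and passed through the text argument so the example runs without a file. With a real file you would pass its path instead.

```r
# Made-up CSV content standing in for the real file
csv_text <- "Column.Title\n12\n20\nmissing\n64\nno input\n10"

df <- read.csv(text = csv_text,
               na.strings = c("NA", "missing", "no input"))

is.numeric(df$Column.Title)     # TRUE once every note is mapped to NA
which(!is.na(df$Column.Title))  # positions holding actual numbers
```

If any note is left out of na.strings, the column is still read as text (or as a factor in R versions before 4.0), so checking is.numeric() after the import is a cheap sanity test.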