
R: remove NA from factor vector and convert it to numeric

I have a vector like this:

 [1] "72.82947"  NA          NA          NA          NA          NA          "66.00949"  NA         
  [9] NA          "0.133434"  NA          NA          "2.265083"  NA          NA          NA         
 [17] " 0"        NA          NA          NA          NA          NA          NA          NA         
 [25] "0.311346"  NA          NA          " 0"        NA          NA          NA          NA         
 [33] NA          NA          NA          NA          NA          "0.7024582" NA          NA         
 [41] NA          NA          NA          NA          NA          NA          "3.543211"  NA         
 [49] NA          "5.779669"  NA          "4.617021"  NA          "1.682751"  NA          NA         
 [57] NA          NA          NA          "0.214977"  NA          NA          NA          "1.573152" 

Following many previous questions (How to remove all the NA from a Vector?, R script - removing NA values from a vector, R: removing NAs in numerical vectors) and the manuals, I used

vector.test[!is.na(exo.1.4.mad)]

and

vector.test[na.omit(exo.1.4.mad)]

But neither of them works: I always get back the same vector, NAs included. Then I tried to subset the vector manually, indicating the positions where I have values, and tried to convert the result to numeric values:

as.numeric(as.character(exo.1.4.mad.values))

But this does not work either: NAs are introduced by coercion. At this point I think I'm missing something about the format/class of my original vector.

Any suggestions?


I add some more information about my object:

typeof(exo.1.4.mad)
[1] "integer"

dput(exo.1.4.mad)
structure(c(33L, 37L, 37L, 37L, 37L, 37L, 31L, 37L, 37L, 4L, 37L, 37L, 20L, 37L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 8L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 11L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 24L, 37L, 37L, 29L, 37L, 26L, 37L, 19L, 37L, 37L, 37L, 37L, 37L, 6L, 37L, 37L, 37L, 18L, 37L, 2L, 37L, 1L, 37L, 14L, 37L, 25L, 37L, 27L, 37L, 10L, 37L, 3L, 37L, 37L, 35L, 37L, 37L, 28L, 37L, 37L, 37L, 32L, 37L, 12L, 37L, 30L, 37L, 37L, 37L, 37L, 37L, 36L, 37L, 37L, 7L, 37L, 13L, 37L, 37L, 37L, 37L, 9L, 37L, 37L, 37L, 21L, 37L, 37L, 37L, 37L, 37L, 37L, 15L, 37L, 37L, 37L, 34L, 37L, 23L, 37L, 37L, 37L, 37L, 37L, 22L, 37L, 37L, 37L, 16L, 37L, 37L, 17L, 37L, 5L, 37L), .Label = c("\" 0\"", "\"0.044478\"", "\"0.1103672\"", "\"0.133434\"", "\"0.1893487\"", "\"0.214977\"", "\"0.2506812\"", "\"0.311346\"", "\"0.3219932\"", "\"0.409485\"", "\"0.7024582\"", "\"0.7029872\"", "\"0.7983231\"", "\"1.104537\"", "\"1.170474\"", "\"1.2355\"", "\"1.255681\"", "\"1.573152\"", "\"1.682751\"", "\"2.265083\"", "\"2.491765\"", "\"2.566038\"", "\"2.731105\"", "\"3.543211\"", "\"4.42271\"", "\"4.617021\"", "\"5.235322\"", "\"5.340412\"", "\"5.779669\"", "\"5.847934\"", "\"66.00949\"", "\"67.9525\"", "\"72.82947\"", "\"75.2123\"", "\"8.347973\"", "\"9.832462\"", "NA"), class = "factor")

This confuses me even more!

Try:

exo1 <- as.numeric(gsub("[^.0-9]+","",exo.1.4.mad))
exo1[!is.na(exo1)]
 #[1] 72.8294700 66.0094900  0.1334340  2.2650830  0.0000000  0.3113460
 #[7]  0.0000000  0.7024582  3.5432110  5.7796690  4.6170210  1.6827510
 #[13]  0.2149770  1.5731520  0.0444780  0.0000000  1.1045370  4.4227100
 #[19]  5.2353220  0.4094850  0.1103672  8.3479730  5.3404120 67.9525000
 #[25]  0.7029872  5.8479340  9.8324620  0.2506812  0.7983231  0.3219932
 #[31]  2.4917650  1.1704740 75.2123000  2.7311050  2.5660380  1.2355000
 #[37]  1.2556810  0.1893487

Explanation

 [^.0-9]+ ## matches every run of characters other than digits and the dot; gsub replaces it with "", i.e. removes it.
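
To see what the pattern does step by step, here is a minimal sketch on a hypothetical sample `x` that reuses a few of the labels from the `dput` output (embedded quote characters, a leading space, and the literal "NA" level). The names `x`, `cleaned` and `nums` are made up for illustration.

```r
# Hypothetical sample: a few labels in the same shape as the question's factor.
x <- factor(c("\"72.82947\"", "NA", "\" 0\"", "\"0.133434\""))

# gsub() coerces the factor to character, then deletes every run of
# characters that is not a digit or a dot (quotes, spaces, the letters "NA").
cleaned <- gsub("[^.0-9]+", "", x)
print(cleaned)

# The former "NA" labels are now empty strings, which as.numeric() turns
# into real NA values that is.na() can detect.
nums <- as.numeric(cleaned)
print(nums[!is.na(nums)])
```

Note that this pattern would also delete a minus sign or an exponent marker such as `e`, so it is only safe when, as here, all values are plain non-negative decimals.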

Here is something that works for me:

> myVec <- c(NA, "1", "2", NA)
> myVec
[1] NA  "1" "2" NA 
> as.numeric(myVec[!is.na(myVec)])
[1] 1 2

Does this help you?
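
The example above works because `myVec` is a character vector with real `NA` values. The object in the question is a factor, which is why `typeof()` reports "integer". A minimal sketch with made-up values (the name `f` is hypothetical) shows how a factor is stored:

```r
# Made-up values, for illustration only.
f <- factor(c("10.5", "NA", "2.25"))

print(typeof(f))        # "integer": a factor is stored as integer codes
print(levels(f))        # the labels live in the levels attribute
print(as.numeric(f))    # returns the integer codes, not the values
print(as.character(f))  # returns the labels as strings
```

This is why the usual idiom for a factor is `as.numeric(as.character(f))` rather than `as.numeric(f)`, and why that idiom still fails when the labels contain extra characters.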

The problem with your data is that your "NA"s are not really NAs as R defines them, but just character strings. Thus is.na won't work here. Simply do

exo.1.4.mad[exo.1.4.mad != "NA"]
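
As a rough sketch of how this subsetting could be carried through to numeric values, assuming the labels look like those in the `dput` output (with embedded quote characters and spaces), the following uses a hypothetical sample `x` rather than the real object:

```r
# Hypothetical sample mirroring the structure of exo.1.4.mad.
x <- factor(c("\"72.82947\"", "NA", "NA", "\" 0\""))

# "NA" is an ordinary level here, so is.na() finds nothing to remove.
print(any(is.na(x)))

# Drop the "NA" entries by comparing against the string.
kept <- x[x != "NA"]

# The remaining labels still carry quote characters and spaces; strip them
# before converting, otherwise as.numeric() yields NA with a warning.
vals <- as.numeric(gsub("[\" ]", "", as.character(kept)))
print(vals)
```

The extra `gsub()` step is an assumption based on the labels shown in the question; if the labels were clean numbers, `as.numeric(as.character(kept))` alone would be enough.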
