How to convert variables into quantitative ones?

Question

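I have a data matrix (900 columns and 5000 rows) and I want to run a PCA on it.

The matrix looks fine in Excel (meaning all the values are quantitative), but after reading the file into R and trying to run the PCA code, I get an error message saying "The following variables are not quantitative", followed by a list of the non-quantitative variables.

So, in general, some variables are quantitative and some are not. See the example below. When I check variable 1, it is correct and quantitative (seemingly at random, some variables in the file are quantitative), whereas variable 2 comes back as a factor.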

> data$variable1[1:5]
[1] -0.7617504 -0.9740939 -0.5089303 -0.1032487 -0.1245882

> data$variable2[1:5]
[1] -0.183546332959017 -0.179283451229594 -0.191165669598284 -0.187060515423038
[5] -0.184409474669824
731 Levels: -0.001841783473108 -0.001855956210119 ... -1,97E+05

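So my question is: how can I change all the non-quantitative variables into quantitative ones?

Making the file shorter does not help, because the values then become quantitative by themselves. I do not know what is going on. So here is the link to my original file <- https://docs.google.com/file/d/0BzP-YLnUNCdwakc4dnhYdEpudjQ/edit

I also tried the answers given below, but they still did not help.

So let me show exactly what I did: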

> data <- read.delim("file.txt", header=T)
> res.pca = PCA(data, quali.sup=1, graph=T)
Error in PCA(data, quali.sup = 1, graph = T) :
The following variables are not quantitative:  batch
The following variables are not quantitative:  target79
The following variables are not quantitative:  target148
The following variables are not quantitative:  target151
The following variables are not quantitative:  target217
The following variables are not quantitative:  target266
The following variables are not quantitative:  target515
The following variables are not quantitative:  target530
The following variables are not quantitative:  target587
The following variables are not quantitative:  target620
The following variables are not quantitative:  target730
The following variables are not quantitative:  target739
The following variables are not quantitative:  target801
The following variables are not quantitative:  target803
The following variables are not quantitative:  target809
The following variables are not quantitative:  target819
The following variables are not quantitative:  target868
The following variables a
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Answer 1

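As Arun mentioned, R is treating your variables as factors, so it builds a data.frame (which is really a list). There are many ways to fix this; one is to convert it into a numeric matrix as follows: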

matrix <- as.numeric(as.matrix(data))
dim(matrix) <- dim(data)

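Now you can run your PCA on the matrix.

Edit:

To extend the example a bit: the second part of what Charlie suggested will not work. Copy the following session and see how it behaves: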
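A minimal sketch of a column-wise alternative, assuming you would rather keep the data frame and its column names than build a bare matrix. The data frame `toy` and its column names are made up for illustration. Only factor columns are converted, and they go through `as.character` so that the printed values, not the internal level codes, are recovered.

```r
# Toy data standing in for the real file (hypothetical names and values)
toy <- data.frame(
  target1 = c(-0.76, -0.97, -0.51),
  target2 = factor(c("-0.18", "-0.17", "-0.19"))
)

# Find the factor columns and convert each one via its character labels
is_fac <- sapply(toy, is.factor)
toy[is_fac] <- lapply(toy[is_fac], function(x) as.numeric(as.character(x)))

str(toy)  # both columns are now numeric
```

A genuinely categorical column (such as batch in the question) would turn into NA values under this conversion, so it is probably better left out of the conversion and passed to the PCA as a supplementary qualitative variable.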

d <- data.frame(
 a = factor(runif(2000)),
 b = factor(runif(2000)),
 c = factor(runif(2000)))

try(as.numeric(d)) # fails with an error: a data frame is a list, not a vector

as.numeric(d$a) # runs, because d$a is a vector, but this is not what you are
# after. R converts the factor levels to their integer codes instead of the actual values.

(m <- as.numeric(as.matrix(d))) # this does the right thing
dim(m)                        # but m loses the dimensions and is now a vector

dim(m) <- dim(d)              # assign the dimensions of d to m

svd(m)                        # you can do the PCA function of your liking on m

Answer 2

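By default, R (before version 4.0.0) coerces strings to factors when reading data. This can lead to unexpected behaviour. Turn this default off with: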

      read.csv(x, stringsAsFactors=F)
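As a rough illustration of what this option does and does not do: with `stringsAsFactors = FALSE` an unparseable column is read as character instead of factor, but it is still not numeric. The sketch below uses inline text in place of the real file (the contents and column names are invented) and lists the columns that still need attention.

```r
# Inline text standing in for the real file (hypothetical contents);
# "1,5" uses a decimal comma, so column b cannot be parsed as numeric
txt <- "a\tb\n0.1\t0.2\n0.3\t1,5\n"
df <- read.delim(text = txt, stringsAsFactors = FALSE)

# Names of the columns that are still not numeric after reading
not_numeric <- names(df)[!sapply(df, is.numeric)]
print(not_numeric)
```

Inspecting the listed columns for stray text or odd number formats is usually the quickest way to see why R refused to treat them as numbers.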

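Alternatively, you can coerce the factor to numeric

      newVar<-as.numeric(oldVar)  # caution: on a factor this returns the level codes, not the values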

Answer 3

as.numeric(as.character(data$variable2[1:5])): first use as.character to get the string representation of the factor's labels, then convert with as.numeric.
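The level list printed in the question ends with -1,97E+05, which suggests (an assumption, not confirmed anywhere in the thread) that some cells use a decimal comma. If so, that would explain both why the column was read as a factor and why the conversion above can still produce NA. A small sketch with an invented factor `f`:

```r
# Hypothetical factor mixing decimal points and one decimal comma
f <- factor(c("-0.18", "-0.17", "-1,97E+05"))

as.numeric(f)                # level codes, not the values
as.numeric(as.character(f))  # values, but the comma entry becomes NA (with a warning)

# Replace the decimal comma with a point first, then convert
fixed <- as.numeric(sub(",", ".", as.character(f), fixed = TRUE))
print(fixed)
```

Swapping the comma only makes sense if it really is a decimal mark in your data; if it were a thousands separator the result would be wrong, so check a few of the original cells first.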


Answer 1
0 2013-02-28 11:07:21

Answer 2
0 2013-02-28 11:18:56

Answer 3
0 2021-04-04 18:06:11
