Ordinal logistic regression with polr(..) in R

Question

I am having some trouble with the polr function.

Here is a subset of my data:

# response variable
rep = factor(c(0.00, 0.04, 0.06, 0.13, 0.15, 0.05, 0.07, 0.00, 0.06, 0.04, 0.05, 0.00, 0.92, 0.95, 0.95, 1, 0.97, 0.06, 0.06, 0.03, 0.03, 0.08, 0.07, 0.04, 0.08, 0.03, 0.07, 0.05, 0.05, 0.06, 0.04, 0.04, 0.08, 0.04, 0.04, 0.04, 0.97, 0.03, 0.04, 0.02, 0.04, 0.01, 0.06, 0.06, 0.07, 0.08, 0.05, 0.03, 0.06,0.03))
# "rep" is a discrete variable that represents a proportion, so it varies between 0 and 1
# The proportions are discrete because each one is the share of TRUE values in a finite list of TRUE/FALSE. Example: if the list has 3 elements, the proportion can only be 0, 1/3, 2/3 or 1

# predictor variables
set.seed(10)
pred.1 = sample(x=rep(1:5,10),size=50)
pred.2 = sample(x=rep(c('a','b','c','d','e'),10),size=50)
# "pred.1" and "pred.2" are discrete variables

# polr
library(MASS)  # provides polr
polr(rep~pred.1+pred.2)

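The subset I gave you works fine! But my whole dataset, and some subsets of it, do not! I cannot find anything in my data that differs from this subset other than its size. So here is my question: is there any limit, for example on the number of levels, that would cause the following error message: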

Error in optim(s0, fmin, gmin, method = "BFGS", ...) : 
  the initial value in 'vmin' is not finite

and the warning message:

   glm.fit: fitted probabilities numerically 0 or 1 occurred

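(I had to translate these two messages into English, so they may not be 100% correct.)

Sometimes I get only the warning message, and sometimes everything works fine, depending on which subset of the data I use.

My rep variable has 101 levels in total (and it contains nothing other than the kind of data I described).

So this is an awful question to ask, because I cannot provide my full dataset and I do not know where the problem lies. From this information, can you guess where my problem comes from?

Thanks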

Answer 1

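Following @joran's suggestion that your problem is probably the 100-level factor, I am going to recommend something that probably isn't statistically valid but will probably still be effective in your particular situation: don't use logistic regression at all. Just drop it. Fit a simple linear regression, then discretize the output as needed with a specialized rounding procedure. Give it a try and see how well it works for you.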

rep.v = c(0.00, 0.04, 0.06, 0.13, 0.15, 0.05, 0.07, 0.00, 0.06, 0.04, 0.05, 0.00, 0.92, 0.95, 0.95, 1, 0.97, 0.06, 0.06, 0.03, 0.03, 0.08, 0.07, 0.04, 0.08, 0.03, 0.07, 0.05, 0.05, 0.06, 0.04, 0.04, 0.08, 0.04, 0.04, 0.04, 0.97, 0.03, 0.04, 0.02, 0.04, 0.01, 0.06, 0.06, 0.07, 0.08, 0.05, 0.03, 0.06,0.03)

set.seed(10)
pred.1 = factor(sample(x=rep(1:5,10),size=50))
pred.2 = factor(sample(x=rep(c('a','b','c','d','e'),10),size=50))

# pred.1 and pred.2 are already factors, so no as.factor() is needed
model = lm(rep.v ~ pred.1 + pred.2)
output = predict(model, newdata=data.frame(pred.1, pred.2))

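# Here's one way you could accomplish the discretization/rounding:
# snap each prediction to the closest value observed in rep.v
f.levels = unique(rep.v)
rounded = sapply(output, function(x){
  d = abs(f.levels-x)
  f.levels[which.min(d)]  # on ties, the first closest level is used
})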

>rounded

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24 
0.06 0.07 0.00 0.06 0.15 0.00 0.07 0.00 0.13 0.06 0.06 0.15 0.15 0.92 0.15 0.92 0.15 0.15 0.06 0.06 0.00 0.07 0.15 0.15 
  25   26   27   28   29   30   31   32   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
0.15 0.15 0.00 0.00 0.15 0.00 0.15 0.15 0.07 0.15 0.00 0.07 0.15 0.00 0.15 0.15 0.00 0.15 0.15 0.15 0.92 0.15 0.15 0.00 
  49   50 
0.13 0.15
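
To judge how well this works, one option is to compare the rounded predictions with the observed values. The sketch below is only an illustration: the toy data frame `d` and the helper name `round_to_levels` are made up for the example and are not part of the original data or of the answer above.

# Toy data (hypothetical, for illustration only)
d <- data.frame(
  y  = c(0.00, 0.05, 0.05, 0.10, 0.90, 0.95, 1.00, 0.10),
  x1 = factor(c(1, 1, 2, 2, 3, 3, 3, 2)),
  x2 = factor(c("a", "b", "a", "b", "a", "b", "a", "a"))
)

# Snap each prediction to the closest observed response value
round_to_levels <- function(pred, levels) {
  sapply(pred, function(p) levels[which.min(abs(levels - p))])
}

fit     <- lm(y ~ x1 + x2, data = d)
rounded <- round_to_levels(predict(fit, newdata = d), sort(unique(d$y)))

# Simple agreement summaries
mae        <- mean(abs(rounded - d$y))  # mean absolute error
exact_rate <- mean(rounded == d$y)      # share of exact matches
print(c(mae = mae, exact_rate = exact_rate))

These are in-sample numbers, so they will look better than performance on new data; holding out part of the data would give a fairer picture.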

Answer 2

orm from the rms package can handle ordered outcomes with a large number of categories.

library(rms)
orm(rep ~ pred.1 + pred.2)
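
Both answers point at the number of response levels, so it may be worth checking how thinly the observations are spread over those levels before fitting anything. A rough sketch on made-up toy data (the vector `y` and the threshold of 2 are illustrative assumptions, not values taken from the question):

# Toy response (hypothetical): proportions on a 0.01 grid
y <- c(0.00, 0.04, 0.06, 0.04, 0.95, 1.00, 0.06, 0.04, 0.97, 0.00)

# Ordered factor with levels in numeric order
y_ord <- factor(y, levels = sort(unique(y)), ordered = TRUE)

counts <- table(y_ord)                 # observations per level
sparse <- names(counts)[counts < 2]    # levels seen fewer than 2 times

print(counts)
cat("levels:", nlevels(y_ord), " sparse levels:", length(sparse), "\n")

Levels that occur only once or twice give an ordinal model very little information about the corresponding cut-points, which is one plausible reason why some subsets fit and others do not.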

Answer 1
0 2013-07-24 16:35:53

Answer 2
0 2016-03-14 08:51:35
