Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): NA/NaN/Inf in 'y', tried every possible approach

Question

My dataset is pd, which I have split into training and test data, pd_train1 and pd_train2 respectively. The first rows look like this:

    sku national_inv lead_time in_transit_qty forecast_3_month forecast_6_month
1 3921548            8        12              0                0                0
2 3191009           83         2             33              157              377
3 2935810            8         4              0                0                0
4 2205847           31         4             63               70              160
5 4953497            3        12              0                0                0
6 2286884            0         8              0                0                0
  forecast_9_month sales_1_month sales_3_month sales_6_month sales_9_month min_bank
1                0             1             1             2             5        2
2              603            44            98           148           156       53
3                0             0             0             1             1        0
4              223            27            90           164           219        0
5                0             0             0             0             0        0
6                0             0             0             0             0        0
  potential_issue pieces_past_due perf_6_month_avg perf_12_month_avg local_bo_qty
1               0               0             0.63              0.75            0
2               0               0             0.68              0.66            0
3               0               0             0.73              0.78            0
4               0               0             0.73              0.78            0
5               0               0             0.81              0.74            0
6               0               0             0.91              0.96            0
  deck_risk oe_constraint ppap_risk stop_auto_buy rev_stop went_on_backorder  data
1         0             0         0             1        0                No train
2         0             0         0             1        0                No train
3         0             0         0             1        0                No train
4         0             0         1             1        0                No train
5         0             0         0             1        0                No train
6         0             0         0             1        0                No train

I want to fit an lm model on my training data pd_train1, but I get the following error:

> fit=lm(went_on_backorder~.,data=pd_train1)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion

I tried searching for infinite values:

sapply(pd_train1, function(x) sum(is.infinite(x)))
             sku      national_inv         lead_time    in_transit_qty  forecast_3_month 
                0                 0                 0                 0                 0 
 forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month 
                0                 0                 0                 0                 0 
    sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg 
                0                 0                 0                 0                 0 
perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk 
                0                 0                 0                 0                 0 
    stop_auto_buy          rev_stop went_on_backorder              data 
                0                 0                 0                 0

I also checked for NA/NaN values in the training data I want to fit the linear model on:

     sku      national_inv         lead_time    in_transit_qty  forecast_3_month 
                0                 0                 0                 0                 0 
 forecast_6_month  forecast_9_month     sales_1_month     sales_3_month     sales_6_month 
                0                 0                 0                 0                 0 
    sales_9_month          min_bank   potential_issue   pieces_past_due  perf_6_month_avg 
                0                 0                 0                 0                 0 
perf_12_month_avg      local_bo_qty         deck_risk     oe_constraint         ppap_risk 
                0                 0                 0                 0                 0 
    stop_auto_buy          rev_stop went_on_backorder 
                0                 0                 0 


Inf %in% pd_train1$went_on_backorder
[1] FALSE

NaN %in% pd_test$went_on_backorder
[1] FALSE

So I cannot find any NA/NaN/Inf values in my dataset. Can someone help me understand why this error is raised? Here went_on_backorder is my target variable.

Answer 1

went_on_backorder is not a numeric variable, and lm cannot handle a non-numeric dependent variable. Look into logistic regression instead.
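A minimal sketch of why the checks in the question come back clean while lm still fails, assuming the response is stored as text labels such as "No" and "Yes" (the vector y below is a hypothetical stand-in, not the asker's data). The coercion warning in the question is consistent with such labels being converted to double before fitting:

```r
# Hypothetical stand-in for pd_train1$went_on_backorder
y <- c("No", "Yes", "No")

# The labels themselves are neither missing nor infinite,
# so NA/Inf checks on the raw column report nothing.
class(y)
sum(is.na(y))
sum(is.infinite(y))

# Converting text labels to double cannot parse them as numbers,
# so every value becomes NA (R also emits a coercion warning,
# suppressed here to keep the output clean).
y_num <- suppressWarnings(as.numeric(y))
y_num
```

Running str(pd_train1$went_on_backorder) on the real data shows how the column is actually stored.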

Answer 2

The went_on_backorder column is a factor. Linear regression requires a numeric response variable.

To use logistic regression, use glm from base R or a package such as VGAM. Here is a short example:

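# Toy data: store the response explicitly as a factor with levels "No" and "Yes"
pd_train1 <- data.frame(went_on_backorder = factor(c('No', 'Yes', 'Yes')), lead_time = 1:3)
# family = 'binomial' fits a logistic regression on the factor response
model <- glm(went_on_backorder ~ ., data = pd_train1, family = 'binomial')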

You can then predict for new observations (type = "response" returns probabilities rather than class labels):

predict(model, newdata = data.frame('lead_time' = c(0,1,2.5,3.5)), type = "response")
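Since type = "response" returns probabilities, one more step is needed to get class labels. The following is a sketch only: the data frame train, the new lead_time values, and the 0.5 cutoff are illustrative assumptions, not part of the original answer. For a factor response, glm treats the first level ("No") as failure, so the probabilities refer to "Yes".

```r
# Hypothetical toy data (not the asker's); the classes overlap in lead_time
train <- data.frame(
  went_on_backorder = factor(c("No", "Yes", "No", "Yes", "Yes", "No")),
  lead_time = c(1, 2, 3, 4, 5, 6)
)

# Logistic regression: models P(went_on_backorder == "Yes")
fit <- glm(went_on_backorder ~ lead_time, data = train, family = "binomial")

# Predicted probabilities of "Yes" for new observations
new_obs <- data.frame(lead_time = c(0, 1, 2.5, 3.5))
prob_yes <- predict(fit, newdata = new_obs, type = "response")

# Turn probabilities into labels with an assumed 0.5 cutoff
pred_class <- ifelse(prob_yes > 0.5, "Yes", "No")
pred_class
```

The cutoff is a modelling choice; with an imbalanced target such as backorders, a value other than 0.5 may be more appropriate.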
