
How to do a regression with variables with the same number?

I created a data set with 8 variables and 400 rows like the ones shown below. My dependent variable is the sum of all the freight across the 20 regions. `w_o`, `v_o` and `u_d` are the population, GDP and km of highway of the region.

    fulldata = cbind(matrix(a,400,1),orig, dest, matrix(distanz,400,1))
    fulldata
               dep   u_o      v_o w_o   u_d      v_d w_d distanz
    [1,]  46101718 27253  4392526 821 27253  4392526 821      89
    [2,]    204380 32141   126883 114 27253  4392526 821     113
    [3,]   5789359 28238  1565307 375 27253  4392526 821     170
    [4,]  11449059 33745 10019166 679 27253  4392526 821     138
    [5,]    389580 35525  1062860 212 27253  4392526 821     289
    [6,]   2642751 29003  4907529 576 27253  4392526 821     405
    [7,]    231159 27532  1217872 210 27253  4392526 821     541
    [8,]   2844613 31539  4448841 568 27253  4392526 821     327
    [9,]   1481309 27821  3742437 448 27253  4392526 821     400
    [10,]    399624 22396   888908  59 27253  4392526 821     551
    [11,]    262570 24726  1538055 168 27253  4392526 821     544
    [12,]    499115 29624  5898124 485 27253  4392526 821     669
    [13,]    249596 22945  1322247 352 27253  4392526 821     720
    [14,]     42501 18447   310449  36 27253  4392526 821     857
    [15,]    273450 16219  5839084 442 27253  4392526 821     869
    [16,]    306917 16512  4063888 313 27253  4392526 821     998
    [17,]    167326 19663   570365  29 27253  4392526 821     995
    [18,]     26384 15514  1965128 295 27253  4392526 821    1275
    [19,]     20189 16289  5056641 662 27253  4392526 821    1584
    [20,]         0 18539  1653135  23 27253  4392526 821     933

Now I have to run a regression on these 20 rows, where my y should be the `dep` column. I tried this code:

    lm <- lm(fulldata[1:19]~fulldata[1:19,2]+fulldata[1:19,3]+fulldata[1:19,4]+fulldata[1:19,5]+fulldata[1:19,6]+fulldata[1:19,7]+fulldata[1:19,8])

and the result was:

    summary(lm)
    Call:
    lm(formula = fulldata[1:19] ~ fulldata[1:19, 2] + fulldata[1:19, 
    3] + fulldata[1:19, 4] + fulldata[1:19, 5] + fulldata[1:19, 
    6] + fulldata[1:19, 7] + fulldata[1:19, 8])

    Residuals:
         Min       1Q   Median       3Q      Max 
    -7970288 -6278944    31922  3227442 15159011 

    Coefficients: (3 not defined because of singularities)
                         Estimate Std. Error t value Pr(>|t|)   
    (Intercept)        3.805e+07  1.668e+07   2.282  0.03866 * 
    fulldata[1:19, 2] -1.185e+03  5.006e+02  -2.368  0.03283 * 
    fulldata[1:19, 3] -1.727e+00  1.076e+00  -1.605  0.13089   
    fulldata[1:19, 4]  4.252e+04  1.195e+04   3.558  0.00315 **
    fulldata[1:19, 5]         NA         NA      NA       NA   
    fulldata[1:19, 6]         NA         NA      NA       NA   
    fulldata[1:19, 7]         NA         NA      NA       NA   
    fulldata[1:19, 8] -2.390e+04  7.779e+03  -3.072  0.00828 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 6894000 on 14 degrees of freedom
    Multiple R-squared:  0.6714,    Adjusted R-squared:  0.5775 
    F-statistic: 7.151 on 4 and 14 DF,  p-value: 0.002359

Is the regression code right? Because 3 columns hold the same number in every row, their coefficients come out as `NA`, and I don't know how to avoid it. I hope I was clear. Thanks to all.

You get `NA` for these columns because they are constants. Your regression model already has a constant in the form of its intercept, so a column that never varies is perfectly collinear with the intercept and adds no information. Because these columns don't vary, they can't explain any variation in your dependent variable.
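To see the mechanism in isolation, here is a minimal sketch on made-up numbers (the names `toy`, `x`, `const` and `y` are illustrative and not from the question). A column that never varies is an exact multiple of the intercept column, so the design matrix has fewer independent columns than coefficients, and `lm()` leaves the redundant coefficient as `NA`.

```r
# Toy data (hypothetical values): x varies, const is identical in every row.
set.seed(1)
toy <- data.frame(
  x     = c(1, 2, 3, 4, 5, 6),
  const = rep(821, 6)
)
toy$y <- 2 + 3 * toy$x + rnorm(6)

fit <- lm(y ~ x + const, data = toy)

# The design matrix has three columns (intercept, x, const) ...
X <- model.matrix(fit)
ncol(X)

# ... but only two of them are linearly independent,
# because const is 821 times the intercept column.
qr(X)$rank

# lm() keeps the model estimable by reporting NA for the redundant term.
coef(fit)
```

The coefficient for `x` is still estimated normally; only the constant column is dropped from the fit.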

Just drop them from the regression equation.
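One way to do that, sketched here on the first 19 rows printed in the question (the rows the original call used), is to turn the matrix into a data frame, detect the columns with a single distinct value, and fit on what remains. The names `dat`, `is_constant`, `dat_var` and `fit` are my own choices, not part of the original code.

```r
# First 19 rows of the matrix shown in the question.
fulldata <- cbind(
  dep = c(46101718, 204380, 5789359, 11449059, 389580, 2642751, 231159,
          2844613, 1481309, 399624, 262570, 499115, 249596, 42501,
          273450, 306917, 167326, 26384, 20189),
  u_o = c(27253, 32141, 28238, 33745, 35525, 29003, 27532, 31539, 27821,
          22396, 24726, 29624, 22945, 18447, 16219, 16512, 19663, 15514,
          16289),
  v_o = c(4392526, 126883, 1565307, 10019166, 1062860, 4907529, 1217872,
          4448841, 3742437, 888908, 1538055, 5898124, 1322247, 310449,
          5839084, 4063888, 570365, 1965128, 5056641),
  w_o = c(821, 114, 375, 679, 212, 576, 210, 568, 448, 59, 168, 485, 352,
          36, 442, 313, 29, 295, 662),
  # Destination columns: the same value in every one of these rows.
  u_d = 27253, v_d = 4392526, w_d = 821,
  distanz = c(89, 113, 170, 138, 289, 405, 541, 327, 400, 551, 544, 669,
              720, 857, 869, 998, 995, 1275, 1584)
)

dat <- as.data.frame(fulldata)

# A column is constant when it has exactly one distinct value.
is_constant <- vapply(dat, function(col) length(unique(col)) == 1, logical(1))
names(dat)[is_constant]

# Keep only the columns that vary, then regress dep on all of them.
dat_var <- dat[, !is_constant, drop = FALSE]
fit <- lm(dep ~ ., data = dat_var)
summary(fit)
```

Using column names in a formula with `data =` is easier to read than indexing the matrix in every term, and storing the result as `fit` rather than `lm` avoids masking the `lm()` function. The estimates for the four remaining predictors should be the same as in the question's output, since the `NA` terms never contributed to that fit.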


