Simple logistic regression in MATLAB - beginner help required

I'm trying to do a simple logistic regression analysis in MATLAB.

X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658];
Y = [120.9189 107.3617 122.5506 96.9701 101.9798 118.3035];
B = mnrfit(X,Y)

I keep getting this error:

If Y is a column vector, it must contain positive integer category numbers.

I'm not sure why. Can someone please help?! Thanks!

Please read the documentation for mnrfit:

https://www.mathworks.com/help/stats/mnrfit.html#btmaowv-Y

Try using a table, and make Y a categorical array.

For example, my code:

%% Multinomial Logistic Regression

% read csv file and create table
% header = {'Year','Abortion','DowJones','Incarceration','Crime_Rate'};

data = csvread("E:\code\project\regression.csv",1,0);
year = data(:,1);
abortion = data(:,2);
dowjones = data(:,3);
incarceration = data(:,4);
crime_rate = data(:,5);
T = table(year,abortion,dowjones,incarceration,crime_rate);

% multinomial logistic regression
X = [abortion,dowjones,incarceration];
Y = categorical(crime_rate);

% B: coefficient estimates
% dev: deviance of the fit
% stats: model statistics

[B,dev,stats] = mnrfit(X,Y,'Model','ordinal','link','logit'); 

Hope this helps.

Logistic regression is used when the dependent variable y is binary (0 or 1). Nominal (multinomial) logistic regression is more general: the dependent variable can take more than two values, but they must be discrete category labels. If Y is numeric, mnrfit expects positive integer category numbers (1, 2, 3, ...); other label sets, such as 0, 1, 2, should be converted with categorical first. X, the independent variable, has no such restriction; it can be any real number.

To use mnrfit, proceed as follows:
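The error in the question comes from this requirement on Y. A minimal sketch of the check, using only base MATLAB (the helper name isCategoryNumbers is illustrative and is not part of mnrfit):

```matlab
% Helper (illustrative): true only if every entry is a positive integer.
isCategoryNumbers = @(v) all(v > 0 & v == floor(v));

Yraw    = [120.9189 107.3617 122.5506 96.9701 101.9798 118.3035];  % Y from the question
Ylabels = [2 2 0 1 1 1];                                           % labels that include 0

okRaw     = isCategoryNumbers(Yraw);         % false: values are not integers
okLabels  = isCategoryNumbers(Ylabels);      % false: 0 is not a positive integer
okShifted = isCategoryNumbers(Ylabels + 1);  % true: labels 1, 2, 3
```

Continuous measurements fail the check, and so does a numeric label set that contains 0; shifting the labels to start at 1, or wrapping them in categorical, avoids the problem.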

X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658];
if X > 103 --> X large --> translated to Y = 2
if 101 < X < 103 --> X medium --> translated to Y = 1
if X < 101 --> X small--> translated to Y = 0 

There are 3 categories: 0 small, 1 medium, 2 large. Following the logic above:

Y = [2 2 0 1 1 1]
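The thresholds above can also be applied programmatically rather than by hand. A minimal sketch, assuming the base MATLAB function discretize and the cut points 101 and 103 from the rule above (the variable name edges is illustrative). Note that discretize treats each bin as closed on the left, so a value of exactly 101 or 103 would land in the higher bin, a case the rule above leaves undefined:

```matlab
X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658];

% Bin edges: (-Inf,101) small, [101,103) medium, [103,Inf] large.
edges = [-Inf 101 103 Inf];

% discretize numbers the bins 1, 2, 3; subtract 1 to get the labels 0, 1, 2.
Y = discretize(X, edges) - 1;

disp(Y)   % displays 2 2 0 1 1 1
```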

Type the following code in MATLAB and check:

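X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658]';  % n-by-1 column of predictor values
Y = [2 2 0 1 1 1]';     % category label for each observation
Y = categorical(Y);     % categorical labels are accepted even though they include 0
B = mnrfit(X,Y);        % nominal model by default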
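Once B is available, the fitted category probabilities can be inspected with mnrval from the same toolbox. This is only a sketch under stated assumptions: it reuses the six points above, the name pihat is illustrative, and because the labels were derived directly from X (so the categories are perfectly separated), mnrfit may emit convergence warnings and the coefficients should not be read as meaningful estimates.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658]';  % n-by-1 predictor
Y = categorical([2 2 0 1 1 1]');                               % n-by-1 labels

B = mnrfit(X, Y);       % nominal model; the last category is the reference

% pihat is n-by-3: one row per observation, one column per category
% in the order returned by categories(Y), that is '0', '1', '2'.
pihat = mnrval(B, X);

disp(categories(Y)')
disp(pihat)
```

Each row of pihat holds the estimated probabilities of the three categories for one observation.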

Given the format of your Y data, I suggest using a polynomial (linear) regression model instead of logistic regression, since your Y values are continuous rather than discrete.

Polynomial linear regression

X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658];
Y = [120.9189 107.3617 122.5506 96.9701 101.9798 118.3035];

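% Degree 1 fits a straight line. A degree of length(X)-1 would try to pass
% through every point and is ill-conditioned here because X repeats a value.
B = polyfit(X,Y,1);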
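To judge how well a polynomial describes the data, the coefficients can be evaluated with polyval and compared with the observed Y. A minimal sketch, assuming a straight-line (degree 1) fit; the names p, Yhat and residuals are illustrative:

```matlab
X = [103.4843 103.4843 100.3871 101.8535 101.7658 101.9658];
Y = [120.9189 107.3617 122.5506 96.9701 101.9798 118.3035];

% Least-squares straight line: p(1) is the slope, p(2) the intercept.
p = polyfit(X, Y, 1);

% Fitted values at the observed X and the corresponding residuals.
Yhat = polyval(p, X);
residuals = Y - Yhat;

disp(p)
disp(residuals)
```

Large residuals relative to the spread of Y would suggest that X alone explains little of the variation in Y.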

