Ordinal Logistic Regression or Proportional Odds Logistic Regression with R

Dr. Bharatendra Rai

53,899 views

Added 21 July 2024
R file: drive.google.com/file/d/1B8lp...
TIMESTAMPS
00:00 Ordinal Logistic Regression with R
00:06 Read Data
02:36 Partition Datasets
03:28 Ordinal Logistic Regression Model
05:47 Calculating p-values
07:25 Prediction
09:08 Equations for Calculating the Probabilities
12:29 Model Building with all Variables
16:18 Confusion Matrix for Training Dataset
17:12 Confusion Matrix for Test Dataset
Time-Series videos: goo.gl/FLztxt
Machine Learning videos: goo.gl/WHHqWP
Becoming Data Scientist: goo.gl/JWyyQc
Introductory R Videos: goo.gl/NZ55SJ
Deep Learning with TensorFlow: goo.gl/5VtSuC
Image Analysis & Classification: goo.gl/Md3fMi
Text mining: goo.gl/7FJGmd
Data Visualization: goo.gl/Q7Q2A8
Playlist: goo.gl/iwbhnE
R is a free software environment for statistical computing and graphics, and is widely used by both academia and industry. R software works on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a user-friendly environment for R that has become popular.
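The steps listed in the timestamps above can be sketched roughly as follows. This is a minimal sketch and not the code from the linked R file: the data are simulated, and the names `x1`, `x2`, `y`, `train` and `test` are assumptions made for illustration. It uses `polr()` from the MASS package.

```r
library(MASS)  # provides polr()

set.seed(1234)
n <- 500

# Simulated stand-in for the video's data set (hypothetical columns x1, x2, y)
x1 <- rnorm(n)
x2 <- rnorm(n)
latent <- 1.5 * x1 - 1.0 * x2 + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("1", "2", "3"), ordered_result = TRUE)
dat <- data.frame(y, x1, x2)

# Partition: roughly 80% training, 20% test
ind <- sample(2, n, replace = TRUE, prob = c(0.8, 0.2))
train <- dat[ind == 1, ]
test  <- dat[ind == 2, ]

# Proportional odds model; Hess = TRUE keeps the Hessian for standard errors
model <- polr(y ~ x1 + x2, data = train, Hess = TRUE)

# p-values from the Wald statistics (normal approximation)
ctable <- coef(summary(model))
p <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
ctable <- cbind(ctable, "p value" = p)

# Predicted classes and confusion matrices for training and test data
pred_train <- predict(model, train)
tab_train  <- table(Predicted = pred_train, Actual = train$y)
pred_test  <- predict(model, test)
tab_test   <- table(Predicted = pred_test, Actual = test$y)

# Misclassification error on the test data
err_test <- 1 - sum(diag(tab_test)) / sum(tab_test)
```

With real data, the simulated block would be replaced by something like `read.csv()` on the data file, and the response would be converted to an ordered factor before fitting.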

Comments • 131

@gnomzb5070 5 years ago ⁺²
While I was looking for an example project on the ordered logit model in R, I came across this superb video. Thanks a lot, Bharatendra!
@bkrai 5 years ago
Thanks for comments!
@flamboyantperson5936 6 years ago ⁺²
Really great tutorial. Thank you Sir.
@gabriellamartinez7985 2 years ago
Hello, thank you for this video, it's been super helpful!
I have a question regarding the dependent variables. How would you interpret the polr function output for dependent variables that are factors? For example, Tendency (levels: -1,0,1) was used as a dependent variable, how would you interpret each of the coefficients?
@hermanhyde7000 7 years ago ⁺⁴
Absolute genius. I would pay a million bucks to be your student.
@datascience1274 2 years ago
Hello Professor. Great lesson. Quick question. I was wondering if we could have used as.ordered(data$Tendency) instead of as.factor. Can you please shed some light on this? Thanks a lot in advance.
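On the as.ordered question, a small sketch may help show what changes. The values below are made up and only mirror the three Tendency levels mentioned above. For a predictor, R by default codes an unordered factor with treatment (dummy) contrasts and an ordered factor with polynomial contrasts, so the coefficients get different names and meanings.

```r
# Made-up values mirroring the Tendency levels -1, 0, 1
tendency <- c(-1, 0, 1, 1, 0, -1)

f <- as.factor(tendency)    # unordered factor
o <- as.ordered(tendency)   # ordered factor, levels sorted as -1 < 0 < 1

is.ordered(f)   # FALSE
is.ordered(o)   # TRUE
levels(o)       # "-1" "0" "1"

# Design matrix columns a regression would use for each version
cols_f <- colnames(model.matrix(~ f))   # "(Intercept)" "f0" "f1"
cols_o <- colnames(model.matrix(~ o))   # "(Intercept)" "o.L" "o.Q"
```

Here `f0` and `f1` compare each level with the reference level -1, while `o.L` and `o.Q` are linear and quadratic trend terms across the ordered levels.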
@euphorockz 4 years ago ⁺³
This video really helps a lot for my project! Thank you!!!!!
@bkrai 4 years ago ⁺¹
Thanks for the feedback!
@jc.nogueira 2 years ago ⁺¹
Great video! Many thanks for sharing this wonderful material. I will subscribe to your channel.
Greetings from Uruguay, South America!
All the best,
jc
@bkrai 2 years ago ⁺¹
Thanks and welcome!
@victorhenostroza1871 4 years ago ⁺¹
Thank you so much for this contribution... congratulations from Peru
@bkrai 4 years ago
Thanks for comments!
@yubarsubedi2781 3 years ago
Hello Sir, thank you so much for this tutorial. I learned a lot. However, I encountered a problem. When I ran the summary command, I got the message "Error in svd(X) : infinite or missing values in 'x'". How can I fix this problem?
@dr.bheemsainik4316 2 years ago
Hi Sir... can you please explain the Ordered Probit model for the same data, with Tendency with 3 levels as the dependent variable?
@hayonimengi4171 5 years ago
How would you interpret the predicted probabilities from a reference category of a categorical predictor? In other words, I'm trying to present the probabilities which I get in my model, however I'm confronted with my reference category, and hence what would be the best way to derive these? Thanks
@aadvikpanda3339 7 years ago
Hello Sir,
Great video.
I did not understand how you calculated the probability from the t-stat using this formula:
pnorm(abs(ctable[ ,"t value"]),lower.tail=FALSE)*2. Could you please explain each term you have used in this formula and why?
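The formula quoted above can be unpacked term by term. In the sketch below the two statistics are taken from a model summary posted elsewhere in this thread (rows H and NP2); the vector name `t_value` is an assumption for illustration. Although the column is labelled "t value", the formula treats it as a Wald statistic and applies a standard normal approximation, which is a common large-sample choice rather than an exact t-test.

```r
# Wald statistics (estimate / standard error) for two coefficients
t_value <- c(H = 1.6380799, NP2 = -3.3186107)

# abs():              use the magnitude, so the sign does not matter
# lower.tail = FALSE: upper-tail area P(Z > |t|) under the standard normal
# * 2:                add the matching lower tail for a two-sided test
p_value <- pnorm(abs(t_value), lower.tail = FALSE) * 2

# Equivalent form written with the lower tail
p_alt <- 2 * (1 - pnorm(abs(t_value)))

round(p_value, 4)
```

Rounded to four decimals, these agree with the p-values 0.1014 and 0.0009 shown for the same rows in that posted output.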
@user-mo4gb2xb2h 2 years ago ⁺¹
Thank you so much!! This video is extremely helpful and clear!!!
@bkrai 2 years ago
You're so welcome!
@wasafisafi612 2 years ago ⁺¹
Big thanks for your video. It helps a lot
@bkrai 2 years ago
You are welcome!
@1612kanika 6 years ago
How to calculate bias and variance for ordinal?
@MKmadhurima 2 years ago
Is there any way to do an ordinal logistic regression for panel data?
@alainataylor4181 1 year ago
Is there a way to add nested effects into the model???
@abhishekbansal5182 4 years ago
Thanks for making this video, it's very helpful for us.
Please sir, can you explain how we get the alpha values for the categories? Is there any formula to calculate the alpha (@)? Please explain it.
@Sandra-tq6yb 2 years ago ⁺¹
Very helpful video. Thank you very much!
@bkrai 2 years ago ⁺¹
You're welcome!
@DAMGood73 5 years ago ⁺¹
Perfect, thanks for sharing!
@bkrai 5 years ago
Thanks for comments!
@internetjunkie247 3 years ago
Thanks for the video. To calculate probabilities, why did you use alpha-b1x1+.... and not the conventional alpha+b1x1+...? It seems different software uses different forms of the equation (?). I believe it is the former in R, perhaps SPSS too.
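On the sign question: `polr()` in MASS is documented as parameterising the model as logit P(Y <= k | x) = zeta_k - eta, where eta is the linear predictor, which is why the slope terms are subtracted from the cutpoint. The sketch below checks this on simulated data (the variable names and the value `new_x` are assumptions) by comparing a manual calculation with `predict()`.

```r
library(MASS)

set.seed(42)
n <- 300
x <- rnorm(n)
y <- cut(1.2 * x + rlogis(n), breaks = c(-Inf, -0.5, 1, Inf),
         labels = c("1", "2", "3"), ordered_result = TRUE)
d <- data.frame(x, y)

fit <- polr(y ~ x, data = d, Hess = TRUE)

new_x <- 0.7                              # hypothetical new observation
eta <- unname(coef(fit)["x"]) * new_x     # linear predictor b1 * x1

# Cumulative probabilities: cutpoint MINUS linear predictor
cum <- plogis(fit$zeta - eta)             # P(Y <= 1), P(Y <= 2)
manual <- diff(c(0, cum, 1))              # P(Y = 1), P(Y = 2), P(Y = 3)

auto <- predict(fit, newdata = data.frame(x = new_x), type = "probs")
```

If the manual and predicted probabilities agree, the alpha - b1x1 form is the one R is using here. Other software may report the same model with the opposite sign on the slopes.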
@88MSRobby 7 years ago ⁺²
Very good video!
@WillIsGoodAtStatistics 4 years ago ⁺¹
Excellent video. Thank you
@bkrai 4 years ago
You are welcome!
@leliaglass1568 5 years ago ⁺¹
Thanks for the video! Very helpful
@bkrai 5 years ago
Thanks for comments!
@shoumicshahid9315 4 years ago ⁺¹
Hello Professor, how can I rank the significant variables from an ordinal logit model? I previously performed dominance analysis on the binary logit model, but in the case of an ordinal logit model that seems inappropriate.
@bkrai 3 years ago
One way could be to use p-value.
@fileniaantoniou8649 5 years ago ⁺¹
Hello and great video!
Would you suggest this model for modelling the results of a football game where the points earned in the end are 0, 1 or 3?
@bkrai 5 years ago
Yes, it should work for such data.
@mangaikalai82 4 years ago ⁺¹
Sir, this video was helpful. Can you make a video on the Brant test for the proportional odds assumption?
@bkrai 4 years ago ⁺¹
Will try
@ganneesh 6 years ago ⁺¹
Indeed, it's a great video on ordinal logistic regression. Thanks professor. I am trying to create a model for my data set and I am facing an issue. When I ran the predict command for my training data set, I got very small probability values (the sum of the probabilities is not equal to one). What could be the reason?
@bkrai 3 years ago
Seeing this today. Probably resolved by now.
@miccoligno1 5 years ago ⁺¹
Hi Bharatendra, my response variable is the score of a Likert scale from 0, the worst condition, to 4, the best. Should I use the function as.ordered? If yes, should I keep the 4 as the best condition and the zero as the worst? Thanks
@bkrai 5 years ago
Yes, that would work fine.
@smitagupta1771 5 years ago ⁺¹
What should be the change in the input file if the independent variables have 3-4 levels of ordinal category? Should the independent variable be marked as 1, 2, 3, 4 and then converted to an ordinal factor like you did for NSP?
@bkrai 5 years ago
You can use ordered() for an independent ordinal variable. Some researchers also recommend changing them to a numeric variable, as it leads to a much simpler model.
@nageshgoud4266 6 years ago
Hi Sir, it's a nice video. I always follow your other videos, they are very good.
I am running the ordinal LR on my own data, i.e., insurance, to find the EMlevel, and this dependent variable contains 6 levels, i.e., 1,2,3....6. So as per your instructions I converted the EMlevel variable to ordered and str is appearing as "EMLevel : Ord.factor w/ 6 levels "1"
@parthshah9451 4 years ago ⁺¹
Great video Dr. Rai. Could you also help with the Partial Proportional Odds Model?
@bkrai 4 years ago
Thanks, I've added it to my list.
@lauualb 7 years ago
Hi sir, how do you know the variable Max is causing the warning?
@bkrai 7 years ago
+lauualb it was based on trial and error.
@taniamendoza9247 4 years ago ⁺¹
Dr, thanks a lot for your example, but could you help me with a question: what is the difference between clm and polr? I was trying to use polr on financial rates to estimate ratings, but when I use it I get this warning:
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
But if I use clm that doesn't happen.
Could you help me to understand these 2 functions?
Thanks a lot
Regards from Ecuador
@bkrai 4 years ago
Note that warning messages in R are ok. It's not an error.
@khadijabenmoussa8064 4 years ago ⁺¹
Hello, thanks a lot for your video, it is very helpful. Could you please explain what the meaning of the confusion matrix error is? Also please, how can we compute the R square of our model?
@bkrai 4 years ago ⁺¹
For confusion matrix you may refer to:
czcams.com/play/PL34t5iLfZddvv-L5iFFpd_P1jy_7ElWMG.html
Also note that when response is a factor variable, we do not use R-square.
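As a rough illustration of the confusion matrix error asked about above: it is usually taken as the share of observations whose predicted class differs from the actual class. The sketch below uses made-up labels for ten observations; `actual` and `predicted` are hypothetical names and not objects from the video.

```r
# Made-up actual and predicted classes for ten observations
actual    <- factor(c(1, 1, 1, 2, 2, 3, 3, 3, 1, 2), levels = 1:3)
predicted <- factor(c(1, 1, 2, 2, 1, 3, 3, 2, 1, 2), levels = 1:3)

# Rows are predictions, columns are actual classes
tab <- table(Predicted = predicted, Actual = actual)

# Correct predictions sit on the diagonal
accuracy <- sum(diag(tab)) / sum(tab)   # 0.7
error    <- 1 - accuracy                # 0.3

# Share of each actual class that was predicted correctly
per_class <- diag(tab) / colSums(tab)
```

Here 7 of the 10 predictions match, so the misclassification error is 0.3.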
@AddisuYohannes-h8p 13 days ago
Consider that my dependent variable is anaemia status, for a thesis on "mixed effect ordinal logistic regression".
1. How can I obtain a table on percentage of anaemia status by region in R software?
2. How can I obtain a table on prevalence of anaemia status by predictors for anaemia among women of reproductive age in R software?
3. How can I obtain a table on adjusted odds ratios (AOR) and 95% CI of the adjusted odds ratios (AOR) for mixed effect ordinal logistic regression in R software?
@sunilbobb 6 years ago ⁺¹
Sir, can you show how we interpret the abalone data from Kaggle or UCI?
@bkrai 4 years ago
I saw this today, hope it's taken care of.
@Astronoom 4 years ago ⁺¹
Is this approach equal to the CatReg function in SPSS with ranking?
@bkrai 4 years ago ⁺¹
I've not checked it in SPSS. But I guess results should be same.
@subashghimire1604 7 years ago ⁺²
Do you have any tutorial for goodness of fit test for ordinal logistic regression?
@tariqawanish 6 years ago ⁺¹
Is the goodness of fit test applied in Stata?
@bkrai 4 years ago
It already includes test of significance.
@Pinky-pb6od 6 years ago ⁺¹
Hi sir. Can you please code support vector learning with ordinal regression?
@bkrai 6 years ago
Thanks for the suggestion, I'm adding it to my list for future.
@AymanTurkistani 2 years ago ⁺¹
Thank you!
@bkrai 2 years ago
You are welcome!
@drkim2 6 years ago ⁺¹
excellent
@mdtanimhasan3312 3 years ago ⁺¹
The video is really helpful.
I am struggling to understand the dependent variable's factor outcomes combined by the or symbol |
Could anyone please explain?
TIA
@bkrai 3 years ago
1 | 2 is the cutpoint (threshold) between level-1 and level-2, and 2 | 3 is the cutpoint between level-2 and level-3.
@mdtanimhasan3312 3 years ago
Dr. Bharatendra
Could you please explain how to interpret the outcome of the dependent variable combined with |
For example, here is the summary and p-value of my model. I am struggling to interpret the dependent variable outcome, TIA.

Coefficients:
       Value  Std. Error  t value
H    0.10955     0.06687   1.6381
AGR  0.05929     0.06825   0.8687
NP2 -1.00909     0.30407  -3.3186
NP3 -1.69956     0.40289  -4.2184
NP4 -0.28106     0.44589  -0.6303

Intercepts:
       Value  Std. Error  t value
1|2  -1.1571      0.6301  -1.8363
2|3  -0.0505      0.6090  -0.0829
3|4   0.9036      0.6022   1.5005
4|5   2.2627      0.7164   3.1584
5|6   5.1148      1.5859   3.2253
6|7  16.5213      9.1049   1.8145

Residual Deviance: 631.3888
AIC: 653.3888

           Value  Std. Error     t value  p-value
H     0.10954539  0.06687426   1.6380799   0.1014
AGR   0.05928751  0.06825109   0.8686676   0.3850
NP2  -1.00909459  0.30407139  -3.3186107   0.0009
NP3  -1.69956102  0.40288860  -4.2184390   0.0000
NP4  -0.28105858  0.44589078  -0.6303306   0.5285
1|2  -1.15712803  0.63014735  -1.8362817   0.0663
2|3  -0.05048673  0.60902379  -0.0828978   0.9339
3|4   0.90356996  0.60219631   1.5004575   0.1335
4|5   2.26273192  0.71641548   3.1584073   0.0016
5|6   5.11484231  1.58585762   3.2252847   0.0013
6|7  16.52126027  9.10488998   1.8145480   0.0696
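One hedged way to read the rows labelled 1|2, 2|3 and so on is as cutpoints on the logit scale between adjacent levels of the response. Assuming the output above comes from `polr()` in MASS, each cutpoint gives a cumulative probability through logit P(Y <= k) = cutpoint_k - eta. The sketch below plugs in the posted intercepts. The two scenarios (H = 0 and AGR = 0, with NP at its reference level or at level 2) are assumptions chosen only to make the arithmetic visible, and may not be realistic values for this data.

```r
# Intercepts (cutpoints) copied from the posted output
zeta <- c("1|2" = -1.1571, "2|3" = -0.0505, "3|4" = 0.9036,
          "4|5" = 2.2627, "5|6" = 5.1148, "6|7" = 16.5213)

# Assumed case 1: H = 0, AGR = 0 and NP at its reference level
eta_ref   <- 0
cum_ref   <- plogis(zeta - eta_ref)      # P(Y <= 1), ..., P(Y <= 6)
probs_ref <- diff(c(0, cum_ref, 1))      # P(Y = 1), ..., P(Y = 7)
names(probs_ref) <- 1:7

# Assumed case 2: same, but NP at level 2 (coefficient NP2 = -1.00909)
eta_np2   <- -1.00909
probs_np2 <- diff(c(0, plogis(zeta - eta_np2), 1))
names(probs_np2) <- 1:7

round(rbind(reference = probs_ref, NP2 = probs_np2), 3)
```

Under these assumptions the negative NP2 coefficient moves probability toward the lower response levels compared with the reference case.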
@nimeshcheedella8124 6 years ago ⁺¹
Sir, very nicely explained. I tried with my data by following your video step by step. But one issue: I have data where the independent variables are also ordinal in nature. I made them categorical, is that correct? Which regression do you suggest to predict an ordinal variable when the independent variables are also ordinal?
@bkrai 3 years ago ⁺¹
The method depends on the dependent variable and not much on the independent variable.
@subashghimire1604 7 years ago
Hello, could you please tell me how you got the equations for probability, at 9:31/19:21 in the above video?
@bkrai 7 years ago
It is similar to steps shown in the link below at 4:13,
czcams.com/video/fDjKa7yWk1U/video.html
@yujiaoli947 6 years ago
I have the same question. Only a z-statistic's p-value can be calculated by pnorm(), while here it is a t-statistic.
@landersebastian7886 1 year ago ⁺¹
Good day professor, how can I use ordinal logistic regression with BMI?
@bkrai 1 year ago
See if this research paper helps:
www.researchgate.net/publication/260273192_Does_Consumer_Behaviour_on_Meat_Consumption_Increase_Obesity_-_Empirical_Evidence_from_European_Countries
@nicolasaguirre8170 4 years ago ⁺¹
How can I fit a model with ordinal response without proportional odds?
@bkrai 4 years ago
You can try this:
czcams.com/video/dJclNIN-TPo/video.html
@hayonimengi4171 5 years ago ⁺¹
Superb!!!!
@bkrai 5 years ago
Thanks!
@dearcollynn3498 7 years ago
Hello, thank you for your great video. I have a question. Is AIC important here? Isn't AIC here big for the model since it is larger than 1000 already?
@bkrai 7 years ago
Yes it is high. In the same example when we made a model with three variables, it was over 1700. By adding more variables it came down to about 1038, which is a significant improvement.
@Astronoom 4 years ago
When you add more variables the AIC goes down, but then you select variables which have a significance level >0.1 and the AIC goes back up, doesn't it? Wouldn't you use the model with the lowest AIC, and if not, why use the AIC at all? Can I compare models with the AIC as well when in some models variables are log transformed and in others they are not?
@kaapiglass 4 years ago ⁺¹
I'm getting this kind of error, do you know what this means?
Warning message:
In polr(AccessOnlineRecord ~ ., trainHint, Hess = TRUE) :
design appears to be rank-deficient, so dropping some coefs..........
@bkrai 4 years ago
It is just a warning message, not an error.
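For context on the rank-deficient warning: it typically means that at least one column of the design matrix can be written exactly as a combination of the others, so that coefficient cannot be estimated separately and is dropped. The sketch below builds such a case on made-up data (the columns `a`, `b` and `c` are hypothetical) and compares the number of columns with the matrix rank.

```r
set.seed(1)
d <- data.frame(a = rnorm(20), b = rnorm(20))
d$c <- d$a + d$b   # exact linear combination of a and b

X <- model.matrix(~ a + b + c, data = d)

n_cols <- ncol(X)      # 4 columns, including the intercept
x_rank <- qr(X)$rank   # 3: one column carries no new information

n_cols > x_rank        # TRUE signals rank deficiency
```

Running the same comparison on a real training set can help locate the cause, for example a duplicated column, a factor level with no observations, or dummy variables that always sum to a constant.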
@alfredkik3675 3 years ago ⁺¹
Excellent tutorial!
@bkrai 3 years ago
Thanks!
@alfredkik3675 3 years ago
@@bkrai Hello again Dr Rai, I tried to perform an OLR but the Brant test assumption did not hold. The omnibus plus other variables were less than 0.05. What else should I do? Is there any alternative test for ordinal dependent variables? Your kind advice will be greatly appreciated.
@SandeepKumar-me6qr 5 years ago ⁺¹
Very nice explanation sir. Can you please upload the Cardiotocographic.csv file?
@bkrai 5 years ago
Here is the link: goo.gl/Xc4G7J
@seant7907 4 years ago ⁺¹
What does it mean to be 'rank deficient'?
@bkrai 3 years ago
Which part of the video are you referring to?
@nasamumusa5044 7 years ago
Thank you Bharatendra Rai. I get your explanation and have adapted my work well following the steps shown in your video.
I have one issue please. Where columns with independent categorical data have 3 or more levels, like the column "Tendency" shown in your video, the model gives a different "Value", "Std. Error", "t value" and "p value" for each level of such a variable.
This seems challenging and confusing to interpret and to write out as the equation of the model, as some of the p values of the levels may not be significant and should be removed, while the other levels, being significant, are left.
How can such a model be clearly written out and explained?
Gracias!
@bkrai 7 years ago ⁺¹
When an independent variable is categorical and takes three values, the correct way to represent it in a regression based model is with the help of 3-1=2 dummy variables. That's what you see here. When Tendency0 & Tendency1 are both zero, then Tendency = -1. When Tendency0 = 1 & Tendency1 = 0, then Tendency = 0. When Tendency0 = 0 & Tendency1 = 1, then Tendency = 1. Note that in the equation Tendency0 & Tendency1 can only be 0 or 1.
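The dummy coding described above can be reproduced with `model.matrix()`, which shows the columns R builds for a factor under its default treatment contrasts. This is a small sketch with one observation per level; the data are made up.

```r
# One made-up observation for each Tendency level
Tendency <- factor(c(-1, 0, 1))

X <- model.matrix(~ Tendency)
X
# Columns: (Intercept), Tendency0, Tendency1
# Row 1 (Tendency = -1): Tendency0 = 0, Tendency1 = 0  (reference level)
# Row 2 (Tendency =  0): Tendency0 = 1, Tendency1 = 0
# Row 3 (Tendency =  1): Tendency0 = 0, Tendency1 = 1
```

The first level, -1, becomes the reference, so each reported coefficient compares its level against Tendency = -1.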
@nasamumusa5044 7 years ago
Bharatendra Rai in my case I used dummy variables of 1,2,3 for the three levels of my independent categorical data. (Probably I should start with zero?)
I converted them to factors. With some independent variables which were continuous or categorical and the dependent variable, I ran the model using polr.
The output always gave me one coefficient value for the continuous independent variables, whereas the categorical ones had a different coefficient for each level. Like with yours, Tendency 0 had a different coefficient and p value from Tendency 1, and both were significant.
However, when I assessed the significance of my data from their p values, I observed that the p values of the various levels differ within some variables (say e.g. edu with levels 1,2,3. R chose level 1 as the reference level, and level 3 had a p value greater than 0.05 while level 2 had a p value less than 0.05). I should remove level 3 too as I remove the non-significant variables from the equation, I suppose.
How can I do so and what may be the following interpretation?
Thanks for your kind offer to help.
@bkrai 7 years ago ⁺¹
For categorical variables, even if one level is significant, do not drop the variable from the model.
@nasamumusa5044 7 years ago
Bharatendra Rai I sincerely appreciate your explanation. It is noted.
@bkrai 7 years ago
+Nasamu Bawa great 👍
@zahradidarali5804 4 years ago ⁺¹
What are your thoughts on AIC?
@bkrai 4 years ago
It estimates model related error. It is a lower-the-better type of metric and helps to assess model quality. It is used for model selection or comparison.
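As a small worked check on how AIC relates to the fit: for a model such as `polr()`, where the residual deviance equals minus twice the log-likelihood, AIC is the deviance plus twice the number of estimated parameters. The numbers below are taken from the model summary posted earlier in this thread. Counting 5 coefficients plus 6 intercepts as the parameters is an assumption read off that output.

```r
# Residual deviance reported in the posted summary
deviance_val <- 631.3888

# Estimated parameters: 5 coefficients + 6 intercepts (cutpoints)
k <- 5 + 6

# AIC = deviance + 2 * number of parameters
aic <- deviance_val + 2 * k
aic
```

The result matches the AIC of 653.3888 printed in that summary. Each extra parameter adds 2, so a new variable lowers AIC only if it reduces the deviance by more than that.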
@Nientjuh22 4 years ago ⁺¹
Does anyone know if there is a maximum number of independent factors R can handle for this model? I have 6 factors and it gives me an error. However, if I only use 5 of them, no matter which of them, R works perfectly normally.
@bkrai 4 years ago
It must be some other issue. In this example I've used 21 variables without any problem.
@Nientjuh22 4 years ago
@@bkrai Thanks! But the error I get is: attempt to find suitable starting values failed
In addition: Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
@adarsha1981 6 years ago ⁺¹
Sir, are Ordinal Regression and Ordinal Logistic Regression one and the same, or are they different?
@bkrai 6 years ago
Ordinal logistic regression is one type of ordinal regression.
@adarsha1981 6 years ago
ok.. what kind of ordinal regression would you suggest for a situation where I have 15 features, with 3 features integer, 3 numeric, 8 categorical (binary) and 1 count variable (dependent)? I followed ordinal logistic but did not get a better result. I have zero-inflated counts and tried a ZIP model too, not that great, and the cumulative link model (clm) is not fitting well either. Kindly suggest.
@bkrai 6 years ago
What is your response variable?
@adarsha1981 6 years ago
@@bkrai it's a count, and I also tried ranking it. I have more zeros.
@micheleannarumma4690 5 years ago ⁺¹
thank you :)
@bkrai 5 years ago
Thanks for your comment!
@R.K.3010 7 years ago
Hello sir,
I am getting the following error
"Error in optim(s0, fmin, gmin, method = "BFGS", ...) :
initial value in 'vmmin' is not finite
In addition: Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred"
Can you explain this?
@bkrai 7 years ago
Send the code that you used so I can take a look.
@R.K.3010 7 years ago
mod
@natasabajic7072 7 years ago
@Rahul Kadge could you find a solution for this error? I got the same and would like to know how you solved it. Thanks,
@thejuhulikal6290 3 years ago ⁺¹
Sir, which model should I use if all the variables, both dependent and independent, are categorical? Please help me with this
@bkrai 3 years ago ⁺²
Try Random Forest:
czcams.com/video/dJclNIN-TPo/video.html
@thejuhulikal6290 3 years ago ⁺¹
@@bkrai thanks again sir, grateful forever
@bkrai 3 years ago
You are welcome!
@thejuhulikal6290 3 years ago ⁺¹
@@bkrai sir I am getting many errors, can I have your mail id, please.
@bkrai 3 years ago
seemabharat@gmail.com
@kmahim82 4 years ago ⁺¹
What if the intercept is insignificant?
@bkrai 4 years ago ⁺¹
That's ok, we should still keep it.
