OLOGIT

July 19, 2010

POL 681 STATA NOTES: ORDINAL LOGIT and ORDINAL PROBIT

Here is our dependent variable (attitudes toward affirmative action):

. table aff2

----------------------
Pref.     |
Hiring    |
scale     |      Freq.
----------+-----------
        1 |      1,332
        2 |        382
        3 |        179
        4 |        253
----------------------

Suppose we estimate a linear regression model. How would we interpret the coefficients?

. reg aff2 white2 prej

      Source |       SS       df       MS              Number of obs =    2084
-------------+------------------------------           F(  2,  2081) =  234.85
       Model |  414.673135     2  207.336567           Prob > F      =  0.0000
    Residual |   1837.2309  2081  .882859633           R-squared     =  0.1841
-------------+------------------------------           Adj R-squared =  0.1834
       Total |  2251.90403  2083  1.08108691           Root MSE      =  .93961

------------------------------------------------------------------------------
        aff2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -.8906507   .0516212   -17.25   0.000    -.9918853   -.7894162
        prej |   .0391314   .0044243     8.84   0.000      .030455    .0478078
       _cons |   1.558594   .1130557    13.79   0.000      1.33688    1.780308
------------------------------------------------------------------------------
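One problem is easy to demonstrate: nothing constrains the fitted values of a linear regression to the four observed category codes. A minimal sketch (the variable name yhat_ols is mine, not part of the original notes):

* Fitted values from the linear regression above; these are continuous
* and in no way restricted to the codes 1, 2, 3, 4
predict yhat_ols, xb
summarize yhat_ols

Fitted values outside the 1-4 range, if any appear, have no interpretation on the original scale.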

Another problem: suppose we rescale y:

. table pref2

----------------------
    pref2 |      Freq.
----------+-----------
      -10 |      1,332
       -5 |        382
        0 |        179
        1 |        253
----------------------
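(A rescaling like this can be produced with recode; the mapping below is read off the frequency table above, so treat it as a sketch rather than the code actually used:)

* Map the 1-4 scale onto arbitrary ordered values, preserving only the order
recode aff2 (1 = -10) (2 = -5) (3 = 0) (4 = 1), gen(pref2)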

Now, let’s reestimate the model:

. reg pref2 white2 prej

      Source |       SS       df       MS              Number of obs =    2084
-------------+------------------------------           F(  2,  2081) =  208.05
       Model |  6206.38305     2  3103.19152           Prob > F      =  0.0000
    Residual |  31039.5479  2081  14.9156885           R-squared     =  0.1666
-------------+------------------------------           Adj R-squared =  0.1658
       Total |  37245.9309  2083  17.8809078           Root MSE      =  3.8621

------------------------------------------------------------------------------
       pref2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -3.330481   .2121796   -15.70   0.000    -3.746588   -2.914375
        prej |   .1657354   .0181851     9.11   0.000     .1300725    .2013983
       _cons |  -7.918823   .4646954   -17.04   0.000    -8.830139   -7.007507
------------------------------------------------------------------------------

Uh oh.

Just for a sneak peek, suppose I apply a model designed for ordinal categorical data (ordinal logit or ordinal probit) to this newly created variable. First, let's take the original variable:

. ologit aff2 white2 prej

Iteration 0:   log likelihood = -2208.5474
Iteration 1:   log likelihood = -2046.6793
Iteration 2:   log likelihood = -2041.0539
Iteration 3:   log likelihood = -2041.0492

Ordered logit estimates                           Number of obs   =       2084
                                                  LR chi2(2)      =     335.00
                                                  Prob > chi2     =     0.0000
Log likelihood = -2041.0492                       Pseudo R2       =     0.0758

------------------------------------------------------------------------------
        aff2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -1.527657   .1085576   -14.07   0.000    -1.740426   -1.314888
        prej |   .0886513   .0101943     8.70   0.000     .0686709    .1086318
-------------+----------------------------------------------------------------
       _cut1 |   1.200597   .2541018          (Ancillary parameters)
       _cut2 |   2.215088   .2574794
       _cut3 |   2.950627   .2613229
------------------------------------------------------------------------------

Now, let’s take the rescaled version:

. ologit pref2 white2 prej

Iteration 0:   log likelihood = -2208.5474
Iteration 1:   log likelihood = -2046.6793
Iteration 2:   log likelihood = -2041.0539
Iteration 3:   log likelihood = -2041.0492

Ordered logit estimates                           Number of obs   =       2084
                                                  LR chi2(2)      =     335.00
                                                  Prob > chi2     =     0.0000
Log likelihood = -2041.0492                       Pseudo R2       =     0.0758

------------------------------------------------------------------------------
       pref2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -1.527657   .1085576   -14.07   0.000    -1.740426   -1.314888
        prej |   .0886513   .0101943     8.70   0.000     .0686709    .1086318
-------------+----------------------------------------------------------------
       _cut1 |   1.200597   .2541018          (Ancillary parameters)
       _cut2 |   2.215088   .2574794
       _cut3 |   2.950627   .2613229
------------------------------------------------------------------------------

The estimates are identical, down to the log likelihood: ordered logit uses only the ordering of the categories, so any order-preserving relabeling of y changes nothing. Moving on to probabilities. To keep things simple, I'm only going to use a single binary covariate. Let's run the following model:

. ologit aff2 white2

Iteration 0:   log likelihood = -2256.8146
Iteration 1:   log likelihood = -2131.0014
Iteration 2:   log likelihood = -2126.6825
Iteration 3:   log likelihood = -2126.6804

Ordered logit estimates                           Number of obs   =       2128
                                                  LR chi2(1)      =     260.27
                                                  Prob > chi2     =     0.0000
Log likelihood = -2126.6804                       Pseudo R2       =     0.0577

------------------------------------------------------------------------------
        aff2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -1.688567   .1051639   -16.06   0.000    -1.894684   -1.482449
-------------+----------------------------------------------------------------
       _cut1 |   -.839396   .0938652          (Ancillary parameters)
       _cut2 |   .1540845   .0918702
       _cut3 |   .8629134   .0954182
------------------------------------------------------------------------------

As with many Stata commands, we can use predict to generate predicted probabilities. Note that I have to supply a new variable name for each of the four outcome probabilities (why?):

. predict p1 p2 p3 p4
(option p assumed; predicted probabilities)
(26 missing values generated)

For illustrative purposes, let’s tabulate them by the value of the race indicator:

. table p1 white2

----------------------
Pr(aff2== |   white2
       1) |    0     1
----------+-----------
  .301662 |  563
 .7003931 |       1,896
----------------------

. table p2 white2

------------------------
Pr(aff2== |    white2
       2) |     0      1
----------+-------------
 .1628688 |        1,896
 .2367831 |   563
------------------------

. table p3 white2

------------------------
Pr(aff2== |    white2
       3) |     0      1
----------+-------------
 .0644109 |        1,896
 .1648239 |   563
------------------------

. table p4 white2

------------------------
Pr(aff2== |    white2
       4) |     0      1
----------+-------------
 .0723271 |        1,896
  .296731 |   563
------------------------
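(A more compact alternative to the four tables, if you prefer: tabstat reports group means, and since white2 is the only covariate, the mean of each predicted probability within a group equals its fitted value.)

* Mean predicted probabilities by the race indicator
tabstat p1 p2 p3 p4, by(white2)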

Of course, I could have computed these by hand using the following expressions (for the case when white2=1):

. display 1/(1+exp(_b[white2]*1-_b[_cut1]))
.70039314

. display 1/(1+exp(_b[white2]*1-_b[_cut2]))-1/(1+exp(_b[white2]*1-_b[_cut1]))
.16286881

. display 1/(1+exp(_b[white2]*1-_b[_cut3]))-1/(1+exp(_b[white2]*1-_b[_cut2]))
.06441093

. display 1-1/(1+exp(_b[white2]*1-_b[_cut3]))
.07232712

Verify these are identical to the above calculations.
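In general terms, each display above is an instance of the ordered logit probability formula, with Λ denoting the logistic CDF and the conventions κ_0 = −∞, κ_J = +∞:

\Pr(y = j \mid x) = \Lambda(\kappa_j - x\beta) - \Lambda(\kappa_{j-1} - x\beta),
\qquad \Lambda(z) = \frac{1}{1 + e^{-z}}

The cumulative probabilities examined next are then simply \Pr(y \le j \mid x) = \Lambda(\kappa_j - x\beta).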

Now let's look at the cumulative probabilities (for the case of white2=0; display with a variable name shows its value for the first observation, which here has white2=0). The first cumulative probability is Pr(y<=1), which is:

. display p1
.301662

The second is Pr(y<=2):

. display p1+p2
.53844507

The third is Pr(y<=3):

. display p1+p2+p3
.70326896

The fourth is Pr(y<=4) (the trivial case):

. display p1+p2+p3+p4
.99999999

[Why must this one be 1?]

Now look at the following (rounding to two decimal places). What is the difference between the second cumulative probability and the first?

.54 - .30 = .24

What is the difference between the third cumulative probability and the second?

.70 - .54 = .16

And between the fourth cumulative probability and the third?

1.0 - .70 = .30

What do these numbers correspond to?

<---------------------------------------------------------------------------->
 y*                 |                   |                   |
Pr=0              cut1                cut2                cut3            Pr=1
              Pr(y<=1)=.30        Pr(y<=2)=.54        Pr(y<=3)=.70

  |--- Pr(y=1)=.30 ---|--- Pr(y=2)=.24 ---|--- Pr(y=3)=.16 ---|--- Pr(y=4)=.30 -->

These numbers should look familiar. Return to the table of probabilities: the differences between successive cumulative probabilities are exactly the individual category probabilities. That much is mechanical. The important thing to note is that in this example, since white2=0, the only component that shifts the cumulative probabilities is the set of cutpoints. This raises a more general point: because the linear predictor x'β is the same at every threshold, the cumulative probability curves differ only through the cutpoints; they are parallel shifts of one another. This is what is known as the "parallel slopes" (or proportional odds) assumption. A sketch of how to test it follows.
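The assumption can be checked empirically. Here is a minimal sketch using two user-written commands, neither of which ships with Stata (install them first, e.g. via findit omodel and Long and Freese's spost ado files; treat the exact syntax as an assumption on my part):

* Approximate likelihood-ratio test of the proportional odds assumption
omodel logit aff2 white2 prej

* Brant test of the parallel regressions assumption (run after ologit)
ologit aff2 white2 prej
brant, detail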

Let's turn now to ordinal probit. Here I estimate a model with the same covariate as before. The model now links the cumulative probabilities to the linear predictor through the inverse of the standard normal CDF, Φ⁻¹, so the coefficients are on the probit scale.
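Concretely, the fitted probabilities below follow the same template as the logit formulas above, with the standard normal CDF Φ taking the place of Λ:

\Pr(y = j \mid x) = \Phi(\kappa_j - x\beta) - \Phi(\kappa_{j-1} - x\beta)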

. oprobit aff2 white2

Iteration 0:   log likelihood = -2256.8146
Iteration 1:   log likelihood = -2118.8459
Iteration 2:   log likelihood = -2118.7591

Ordered probit estimates                          Number of obs   =       2128
                                                  LR chi2(1)      =     276.11
                                                  Prob > chi2     =     0.0000
Log likelihood = -2118.7591                       Pseudo R2       =     0.0612

------------------------------------------------------------------------------
        aff2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -1.019296   .0612999   -16.63   0.000    -1.139441   -.8991502
-------------+----------------------------------------------------------------
       _cut1 |  -.4874818   .0552574          (Ancillary parameters)
       _cut2 |   .1004991   .0548217
       _cut3 |   .4979683   .0557226
------------------------------------------------------------------------------

I can use Stata to compute the probabilities (note again that I must name the four outcome probabilities):

. predict pr1 pr2 pr3 pr4
(option p assumed; predicted probabilities)
(26 missing values generated)

For illustrative purposes, let’s tabulate them by the value of the race indicator:

. table pr1 white2

----------------------
Pr(aff2== |   white2
       1) |    0     1
----------+-----------
 .3129585 |  563
 .7025726 |       1,896
----------------------

. table pr2 white2

------------------------
Pr(aff2== |    white2
       2) |     0      1
----------+-------------
 .1660268 |        1,896
 .2270675 |   563
------------------------

. table pr3 white2

------------------------
Pr(aff2== |    white2
       3) |     0      1
----------+-------------
 .0668006 |        1,896
 .1507208 |   563
------------------------

. table pr4 white2

------------------------
Pr(aff2== |    white2
       4) |     0      1
----------+-------------
    .0646 |        1,896
 .3092532 |   563
------------------------

I could also compute these probabilities using Stata as a calculator (norm() is the older name for the standard normal CDF, now normal()). For the case when white2=1, we obtain:

. display norm(_b[_cut1]-_b[white2]*1)
.70257259

. display norm(_b[_cut2]-_b[white2]*1)-norm(_b[_cut1]-_b[white2]*1)
.16602683

. display norm(_b[_cut3]-_b[white2]*1)-norm(_b[_cut2]-_b[white2]*1)
.06680057

. display 1-norm(_b[_cut3]-_b[white2]*1)
.06460001

Verify that these agree with Stata's computations. (They do.)

One last little thing. There is an equivalence between the binary version of these models and the ordinal version. To "prove" this, return to logit. Let's apply ologit to a binary dependent variable:

. ologit hihi white2

Iteration 0:   log likelihood =   -673.904
Iteration 1:   log likelihood = -660.43261
Iteration 2:   log likelihood = -659.29047
Iteration 3:   log likelihood = -659.28661

Ordered logit estimates                           Number of obs   =       2459
                                                  LR chi2(1)      =      29.23
                                                  Prob > chi2     =     0.0000
Log likelihood = -659.28661                       Pseudo R2       =     0.0217

------------------------------------------------------------------------------
        hihi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -.8732477   .1561835    -5.59   0.000    -1.179362   -.5671336
-------------+----------------------------------------------------------------
       _cut1 |   1.857531   .1233321           (Ancillary parameter)
------------------------------------------------------------------------------

Now let’s apply logit:

. logit hihi white2

Iteration 0:   log likelihood =   -673.904
Iteration 1:   log likelihood = -660.43261
Iteration 2:   log likelihood = -659.29047
Iteration 3:   log likelihood = -659.28661

Logit estimates                                   Number of obs   =       2459
                                                  LR chi2(1)      =      29.23
                                                  Prob > chi2     =     0.0000
Log likelihood = -659.28661                       Pseudo R2       =     0.0217

------------------------------------------------------------------------------
        hihi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -.8732477   .1561835    -5.59   0.000    -1.179362   -.5671336
       _cons |  -1.857531   .1233321   -15.06   0.000    -2.099257   -1.615804
------------------------------------------------------------------------------
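(A quick numerical check of the correspondence, run after the logit fit above: the negative of the logit intercept should reproduce the ologit cutpoint, 1.857531.)

* Minus the logit intercept recovers the ordered-logit cutpoint
display -_b[_cons]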

What is the difference? The coefficient on white2 is identical; however, in the ordered logit model, _cut1 = -_cons. Why?

The answer is straightforward. Now figure it out.
