Richard Williams, Notre Dame Sociology

gologit2/oglm Troubleshooting

Richard Williams, University of Notre Dame

NOTE!!! gologit2 & oglm now support factor variables and the svy: prefix. They require Stata 11.2 or higher. The old versions have been renamed gologit29 & oglm9. Use them if you are condemned to using earlier versions of Stata.

Here are some of the main issues that I get asked about with gologit2. (Some of these points also apply to oglm.)  Feel free to email me if you have other problems, suggestions or recommendations, or to let me know what recommendations worked best for you.  Click here if you want the main gologit2 support page. Click here for the main oglm page.

Universal Recommendation.  Make sure you have the most current version of the program (and also the most up-to-date version of the Stata software you are using).  If you are lucky the problem you are encountering may have already been fixed.  From within Stata, type

ssc install gologit2, replace
ssc install oglm, replace
update all

If you have Stata 9 or higher you can also use the adoupdate command.  Also, the most up-to-date version of the documentation is gologit2.pdf.
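As a rough sketch of the adoupdate route (newer Stata releases spell the command ado update, so adjust to your version), you might first check which copy of each program Stata is finding and then update only those packages:

```stata
// Show the file and version line of the copy Stata is currently using
which gologit2
which oglm

// With no options, adoupdate only reports which packages have updates
adoupdate

// The update option actually installs the newer versions
adoupdate gologit2 oglm, update
```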

General tip for weird and inexplicable errors: Try running gologit2 with the nolabel option.  This will cause the equations to be labeled eq1, eq2, etc.  The printout may not be as aesthetically appealing but this may reduce the likelihood of having problems with gologit2 itself or with other commands that have trouble with your labels (e.g. value labels that start with a number sometimes cause problems).  Changing your value labels may also solve the problem.  I have found that value labels that work fine when svy and gsvy are NOT used can create weird errors when they are. suest can have problems with value labels too.
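A minimal illustration of the tip above. The variable names y, x1, x2 and x3 are placeholders, not names from any particular dataset:

```stata
// Label the equations eq1, eq2, ... instead of using the value labels of y
gologit2 y x1 x2 x3, nolabel

// Look at the value labels in memory to spot ones that start with a number
describe y
label list
```

If the error disappears with nolabel, the value labels are the likely culprit, and relabeling the categories may be the cleaner long-term fix.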

For security or other reasons, my computer can't access the Internet.  How can I install your programs?  This must be a real nuisance for you!  You may want to talk to your computing people to see if they can't find a way to make your life easier.  Some possible solutions are:

* If you have another computer that has Stata and can access the Internet, install the programs on it.  Then, copy c:\ado (or whatever the appropriate directory is on your machine) from one computer to the other.  Be sure you understand what you are doing, because you don't want to accidentally overwrite files that are needed on the non-Internet machine.

* Following are zipped versions of my programs and their support files.  Unzip the files and store them in c:\ado\personal or some other location where Stata can find them.  Again, if you don't understand how to do this, find somebody on your support staff who can help you out.

gologit2_ver3.2.5.zip (supports factor variables & svy prefix; requires Stata 11.2 or higher)
gologit29 version 2.1.7 (old version of gologit2; requires Stata 8.2 or higher)
oglm version 2.3.0 (supports factor variables & svy prefix; requires Stata 11.2 or higher)
oglm9_ver1.2.0.zip (old version of oglm; requires Stata 9.2 or higher)
mfx2 version 1.2.0 (requires Stata 8.2)

* Sometimes people have read/write access to some drives but not others.  If so, you may be able to modify these suggestions for Notre Dame users.  Basically, the "trick" is to get Stata to look for programs in a folder that you have control over.
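One way to apply the "trick" of pointing Stata at a folder you control is sketched below. The path U:\myado is purely hypothetical; substitute a folder you can write to. These settings last only for the current session unless you put them in a profile.do file.

```stata
// Show the directories Stata currently searches for ado-files
sysdir
adopath

// Add a folder you control to the search path (hypothetical path)
adopath + "U:\myado"

// Or make that folder the place where ssc install stores new programs
sysdir set PLUS "U:\myado"

// Confirm that Stata can find the unzipped program
which gologit2
```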

The output is hard to read and understand.  What are some good ways to interpret and present the results?  For interpreting the results -- see my Stata Journal article, especially section 3.1.  Also see my 2016 Journal of Mathematical Sociology article. (Email me if you don't have free access.) Also look at this presentation and handout. A few additional points, and some comparisons with oglm, are made in this presentation and handout.

For presenting the results -- the default gologit2 output is indeed a little hard to read, because you keep seeing the same numbers over and over for those variables that meet the parallel lines constraint.  A more parsimonious layout is achieved with the gamma option; see sections 3.2 and 3.8 of my Stata Journal article.  I also like the way Thomas Craemer formatted this table (version1, version2); he only presents multiple coefficients for a variable when necessary.  The citation for his complete paper (which also includes a nice discussion of the gologit model) is Craemer, Thomas.  2009.  Psychological 'self-other overlap' and support for slavery reparations. Social Science Research 38: 668--680. I modified his approach for Table 2 of my 2016 article in the Journal of Mathematical Sociology. You may also find that the margins command provides an effective way of presenting and interpreting results; see my discussions here and here.
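As a hedged sketch of the two presentation aids mentioned above, with placeholder variable names and a four-category y assumed: the first command requests the alternative gamma layout, and the second asks margins for average marginal effects on the probability of one outcome. The predict(outcome()) syntax assumes that predict after gologit2 accepts an outcome() option in the same way ologit does; check the help file if it is rejected.

```stata
// Fit the model, letting autofit impose parallel-lines constraints where
// they are not rejected, and display the gamma parameterization
gologit2 y x1 x2 x3, autofit gamma

// Average marginal effects on the probability of the first outcome category
margins, dydx(*) predict(outcome(1))
```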

I don't have Stata.  Is there any other way I can estimate gologit models?  In R, I believe that gologit models can be estimated with the VGAM package. I've never used it myself, but I understand that Don Hedeker's mixor program can do many of the same things that gologit2 can.  Somebody who is familiar with both programs said that "Hedeker's software does gologit2,  but with random effects. I would assume that if you don't specify a random effect you get the same results. His program doesn't do a lot of the cool things that yours does, but if you have specific non-proportionality hypotheses in mind, it will test them and produce the non-proportional results."  Hedeker's web page also includes programs or code for DOS, SPSS and SAS. There is also a commercial version of the program called SuperMix.

Can I do a random effects or multilevel model with gologit2?  No.  Instead, check out Stefan Boes's regoprob program, or else regoprob2 written by Pfarr, Schmid and Schneider. Both use code adapted from gologit2 and reoprob.  Also, I've never tried it myself, but gllamm (also downloadable from SSC) has been used by some people to estimate gologit2-type models - see these Statalist posts from Mirko Moro and Richard Williams. Or, Don Hedeker's mixor program or SuperMix may do what you want.

Note: There is (or was) a problem with the regoprob program on ssc, which caused both regoprob and regoprob2 to give errors. A patched version of regoprob can be found here. Unzip the files and place them in C:\ado\personal or somewhere else where Stata will find them.

Note: If you have a multilevel model and aren't too worried about violations of the parallel lines assumption, then consider using the xtologit, meologit, xtoprobit, or meoprobit commands. You might also consider using xtmlogit, available in newer versions of Stata. Or, you can use gsem to estimate a multilevel mlogit model.

How do I change the base category in gologit2?  You can't (although you could reverse the coding of your dependent variable if you liked that better). Suppose your dependent variable has four categories. Although the gologit2 output looks a lot like mlogit output, it doesn't make any sense to think of there being a single "base" category. Rather, the gologit results are like a series of logistic regressions. In the 1st panel, it is like a logistic regression where category 1 = 0 and categories 2, 3, 4 = 1. In the 2nd panel, it is 1 & 2 = 0 and 3, 4 = 1. In the 3rd panel, it is 1, 2, 3 = 0 and 4 = 1. There is no 4th panel because if there were it would be like 1, 2, 3, 4 = 0, with nothing equal to 1. If the assumptions of the ordered logit model are met, the coefficients should all be the same in each panel (except for the intercepts). For more, see section 3.1 of my Stata Journal article.
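The "series of logistic regressions" idea can be made concrete with a small sketch. It assumes a dependent variable y coded 1 through 4 and placeholder predictors x1, x2 and x3. The separate logits will not reproduce the gologit2 estimates exactly, because gologit2 fits all the equations simultaneously, but they show what each panel is contrasting.

```stata
// One dichotomy per panel: lower categories = 0, higher categories = 1
generate byte y_gt1 = (y > 1) if !missing(y)   // 1 vs 2, 3, 4
generate byte y_gt2 = (y > 2) if !missing(y)   // 1, 2 vs 3, 4
generate byte y_gt3 = (y > 3) if !missing(y)   // 1, 2, 3 vs 4

// Each logit corresponds roughly to one panel of the gologit2 output
logit y_gt1 x1 x2 x3
logit y_gt2 x1 x2 x3
logit y_gt3 x1 x2 x3
```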

gologit29 does not work with Long & Freese's spost9 routines.  First off, unless you are condemned to using ancient versions of Stata, you are much better off using gologit2 and Long & Freese's spost13 commands. But if you are so condemned, the fix is covered in the help file, although many people miss the advice.  Add the v1 option to gologit29, e.g.

gologit29 y x1 x2 x3, v1

Some (but not all) of Long and Freese's spost9 routines currently work with the original gologit but not gologit29.  The v1 option saves the internally-stored results in the same format that was used by gologit.  However, you can still use gologit29's other unique options, such as autofit or pl.  Note that post-estimation commands written specifically for gologit29 (including the pr option of predict) may not work correctly if you use the v1 option.  In that case just rerun the model without it.  Also, the v1 option only works with the default logit link, since that is all the original gologit supported. spost13 provides much better support for both gologit29 and gologit2; use gologit2 if at all possible.

How do I estimate marginal effects with gologit29 & oglm9?  If you have a 21st century version of Stata, use gologit2 and oglm and the margins command instead. But if you don't, Stata's mfx command will work.  However, it is generally better to use my mfx2 program, which can be downloaded and installed from SSC (ssc install mfx2). mfx2 makes it easier to compute marginal effects after multiple-outcome commands like oglm9, gologit29, ologit, oprobit, mlogit, mprobit and slogit.  In addition, the results are formatted in a way that makes them compatible with post-estimation table formatting commands like outreg2 and estout.
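A minimal sketch of the older workflow, with placeholder variable names. It assumes mfx2 is run with no arguments immediately after the estimation command; see the mfx2 help file for its options.

```stata
// Install mfx2 from SSC (needed only once)
ssc install mfx2

// Fit the model with the old version of the program
gologit29 y x1 x2 x3

// Compute marginal effects for all outcomes of the model just fitted
mfx2
```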

The predict command comes up with negative predicted probabilities (or else predicted probabilities greater than 1).  Believe it or not, negative predicted probabilities are possible.  McCullagh & Nelder discuss this in Generalized Linear Models, 2nd edition, 1989, p. 155:

The usefulness of non-parallel regression models is limited to some extent by the fact that the lines must eventually intersect.  Negative fitted values are then unavoidable for some values of x, though perhaps not in the observed range.  If such intersections occur in a sufficiently remote region of the x-space, this flaw in the model need not be serious.

So yes, it can happen, and a couple of people have written me about this.  But, they've also mentioned things like extremely high standard errors or other problems, so I suspect that in most cases a solution lies somewhere in the next couple of points. 

I do recommend computing the predicted probabilities under your models; if they seem implausible, then you may wish to modify your model or use a different statistical technique altogether. (One person wrote me that 2 cases out of 27,068 had negative predicted probabilities; I probably wouldn't worry too much in a case like that, but I would get worried if a non-trivial number of cases had negative predicted values.)  Sometimes combining categories of the response variable (especially if the Ns for some categories are small) and/or simplifying the model helps.  The imposition of parallel lines constraints (either via autofit or the pl or npl options) may also help because it reduces the likelihood of non-parallel lines intersecting. 
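One way to carry out that check is sketched below for a four-category outcome with placeholder variable names. It assumes predict after gologit2 accepts one new variable per outcome category, as it does after ologit.

```stata
// Fit the model and compute a predicted probability for each category
gologit2 y x1 x2 x3, autofit
predict p1 p2 p3 p4

// Count the cases with impossible predicted probabilities
foreach p of varlist p1 p2 p3 p4 {
    count if (`p' < 0 | `p' > 1) & !missing(`p')
}

// The minimum and maximum columns give a quick overall view
summarize p1 p2 p3 p4
```

A handful of offending cases in a large sample is probably tolerable; a non-trivial share suggests simplifying the model or combining categories.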

Click here for an example of the problem and a solution.

The standard errors are extremely high.  You may have high multicollinearity in your variables.  User-written routines like collin can check for this.  But, routines like ologit and gologit2 can also have problems when an X variable has little or no variability within a category of Y, e.g. when Y = 2, X always equals 0.  In ologit, you might get a warning message like this:

Note: 40 observations completely determined.  Standard errors questionable.

In gologit2, alas, such a warning is still on the "wish list" of things I'd like to add.  But, the high standard errors will still be a clue.  Possible diagnostic devices:

If lack of x variability or extreme multicollinearity within a category of y is the problem - you'll have to decide what to do.  You may want (or be forced) to drop the problematic variable.  Maybe y has too many categories with small Ns, and some will need to be combined.  When logit encounters such a problem it not only drops the variable, it drops the cases that were completely determined.
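A few simple checks for lack of variability within a category of y are sketched below. The variable names are placeholders, with x1 assumed to be categorical and x2 continuous. These are generic Stata commands offered as one possible approach, not a list taken from the gologit2 documentation.

```stata
// Cross-tabulate y against a categorical predictor; empty or nearly empty
// cells mean x1 barely varies within some category of y
tabulate y x1

// For a continuous predictor, compare its spread within each category of y
bysort y: summarize x2

// collin is user-written; locate and install it with: findit collin
collin x1 x2 x3
```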

If none of this seems to address the problem - then consider the next FAQ:

gologit2 is very slow and/or does not converge and/or produces implausible estimates.  A couple of people have written to me with problems like this.  Often they have a large number of variables and/or cases.  Since I don't have their data it is hard for me to tell if there actually is a problem with the program or whether they need to be more patient or whether their model is problematic.  Here are several things you can try.

gologit2 y x1 x2 x3, pl(x1) difficult

gologit2 y x1 x2 x3, pl(x1) technique(nr bhhh dfp bfgs)