
NOTE: These are the Spring 2008 course notes for the second semester of my graduate statistics courses. The notes for the first semester course, Sociology 592, are also available. These pages make extensive use of Stata and SPSS. If you are mostly interested in learning how to use Stata, the Stata Highlights page lists several of the most relevant handouts from both courses. Some pages are more "stand alone" than others, so adjacent handouts may help clear up any questions you have.
These pages will be updated whenever I complete another session of the course, and possibly sooner. Older notes and/or notes from the current semester when the course is being taught can be found here.
Feel free to email Richard Williams if you have comments or suggestions.
The following special types of files are used on this web page:
In addition, some files are in zipped (compressed) format. If you don't have an unzipping program, you can use the free PC Magazine PCDEZIP utility.
Finally, please note that the answer keys for the exams and homework differ in the amount of detail provided. Sometimes I give very detailed answers; other times the answers are much more minimal (and, given the information provided, I assume the student can figure out the rest). Students should always aim for complete answers in their homework and exams. In particular, it is hard to give partial credit when it is not clear why an error was made.
Readings Packet (You need a Notre Dame NETID to access these)
Useful sites for learning about Stata and SPSS
UCLA's Statistical Computing Resources
RW Suggestions for Using Stata at Notre Dame
UCLA's SPSS Starter Kit
Resources for learning Stata
UCLA - How does Stata compare with SAS and SPSS?
The Stata User Support Page
Ben Jann's estout/esttab support page (esttab & estout are great for formatting output from Stata)
PART I: In this section, we briefly review the basics of OLS regression. We talk about some of the most common issues (measurement error, missing data, violations of OLS assumptions) encountered in regression analysis.
Using SPSS for OLS Regression (Read on your own & ask questions in Lab as needed)
reg01.sav - Data file used in the SPSS Regression handout
Using Stata 9 for OLS Regression (Read on your own & ask questions in Lab as needed)
reg01.dta - Data file used in the Stata Regression handout
Homework # 1 (Due Jan 30)
sphrd.dta (Stata data file required for HW # 1)
Homework # 1 Answer Key
mulicoll.dta - Stata data file used in the Multicollinearity handout
md.dta - Stata data file used in the Missing Data handout
Homework # 2 (Due Feb 6)
Homework # 2 Answer Key
missing-ak.sps (adds some additional analyses to the earlier program)
hw02-III.do (Stata program for problem 3)
Measurement Error Example (Supplemental)
Scale Construction (Very Brief Overview)
anomia.dta - Stata data file used in the Scale Construction handout
anomia.sav - SPSS data file used in the Scale Construction handout
outliers.dta - Stata data file used in the Outliers handout
outliers.sav - SPSS data file used in the Outliers handout
Also Recommended: Robert Yaffee's Robust Regression Modeling with Stata (This is 93 pages long but it is basically overhead slides and hence much shorter than it at first appears to be. Nice discussions of how to deal with outliers and with heteroskedasticity.)
reg01.dta - Stata data file used in the Heteroskedasticity handout
Serial Correlation (Very Brief Overview)
Also Recommended: UCLA's Regression Diagnostics Page. Shows a lot of the techniques that are available with Stata for detecting outliers, heteroskedasticity, multicollinearity, serial correlation and other problems with regression models.
Homework # 3 (Due Feb 13)
Homework # 3 Answer Key
resales.do (Stata program for the real estate sales problem)
resales.sps (SPSS program for the real estate sales problem)
Sample first exams and answer keys
PART II: This section shows how regression can be used to properly specify a causal model. We begin by introducing "the logic of causal order," which lets us understand the different kinds of causal relationships that might be present between variables. Common model mis-specifications are then addressed (e.g. omitted variables, extraneous variables, variables with nonlinear effects). We discuss how to choose between alternative causal models. Finally, we introduce path analysis as a method for causal modeling.
tbklogic.zip - These are Toolbook presentations, which we will go over in class.
[Optional] If you also want more conventional notes for the above material, click here and here. In class, I'll only use these notes if there is a problem with the Toolbook presentation.
Logic of Causal Order, Handout 1: Variable Naming
Logic of Causal Order, Handout 2: Sample Problem, Logic of Causal Order
Logic of Causal Order, Handout 3: Suppressor Effects
Logic of Causal Order, Handout 4: Interaction Effects
Logic of Causal Order, Handout 5: Another Sample Problem for the Logic of Causal Order
The Logic of Causal Order, Closing Comments
Homework # 4 (due Feb 27)
Homework # 4 Answer Key
Imposing and Testing Equality Constraints in Models
Group Comparisons: Differences in Composition Versus Differences in Models and Effects
Group Comparisons: Using "What If" Scenarios to Decompose Differences Across Groups
blwh.dta and goodpay.dta - Stata data files used in the constraints & group comparisons handouts
Homework # 5 (Due March 12)
Homework # 5 Answer Key
Interaction Effects and Group Comparisons
Models for Group Comparisons - Summary
blwh.dta - Stata data file used in the Interaction Effects handout
Interpreting Interaction Effects; Interaction Effects and Centering
drinking.dta - Stata data file used in the Interpreting Interaction Effects handout
Discussion Questions for Group Comparisons and Interaction Effects (Cover these on your own if we don't get to them in class)
Interactions Between Continuous Variables
Homework # 6 (Due March 19)
Homework # 6 Answer Key
Introduction to Path Analysis - Highlights
Homework # 7 (Due April 2)
Homework # 7 Answer Key
Sample second exams and answer keys
PART III: Here, we develop path analysis techniques more fully. We talk about more complicated models that cannot be accurately estimated through conventional OLS regression techniques (e.g. nonrecursive models). We also talk about situations where the nature of the data makes OLS regression inappropriate (e.g. dichotomous dependent variables) or less than optimal.
Structural Coefficients in Recursive Models/ Evils of Standardization
Computing R Square/ Evils of R Square
Homework # 8 (Due April 16, but you could easily finish it much sooner than that!)
Homework # 8 Answer Key
Logistic Regression I: Problems with the Linear Probability Model (LPM)
Logistic Regression II: The Logistic Regression Model (LRM)
Logistic Regression III: Hypothesis Testing, Comparisons with OLS
Using Stata for Logistic Regression
logist.dta - Stata data file used in the Logistic Regression handout
Homework # 9 (Due April 23)
Homework # 9 Answer Key
shuttle2.dta - Stata data file used in the Ordered Logit and Multinomial Logit handout
nonrecur.dta - Stata data file used in the Nonrecursive Models handout
blwh.dta - Stata data file used in the Manova handout
Extremely Brief Overviews of Event History Analysis and Hierarchical Linear Modeling:
Read Ch. 9 of Paul Allison's Multiple Regression Primer, paying particular attention to section 9.9 (Multilevel Models) and section 9.12 (Event History Analysis)
Homework # 10 (Due April 30)
Homework # 10 Answer Key
Sample final exams and answer keys
Go to Soc 592 Stats 1 Notes
Go to Stata Highlights Page
Other materials and answer keys may be available upon request.
