Soc 593 Statistics II
Spring 2001
Assignment 2: OLS Regression

Due: Wed. Jan 31, 2001

Part I. Calculations: Bivariate Regression

Some statistics in this Stata output are missing. Use the equations you know to calculate the missing statistics numbered (1) through (4):

. sum X Y

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+----------------------------------------------------
       X |       8    12.69375   14.01918        .25         40
       Y |       8      47.625   37.84531         10        126

. regress Y X, beta

  Source |       SS       df       MS                  Number of obs =       8
---------+------------------------------               F(  1,     6) =  162.96
   Model |  9669.83254     1         (1)               Prob > F      =  0.0000
Residual |  356.042455     6  59.3404092               R-squared     =     (2)
---------+------------------------------               Adj R-squared =  0.9586
   Total |   10025.875     7  1432.26786               Root MSE      =  7.7033

------------------------------------------------------------------------------
       Y |      Coef.   Std. Err.       t     P>|t|                       Beta
---------+--------------------------------------------------------------------
       X |   2.651171   .2076843        (3)   0.000                        (4)
   _cons |   13.97169    3.79046      3.686   0.010                          .
------------------------------------------------------------------------------
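The four missing entries can be recovered from standard OLS identities: a mean square is its sum of squares divided by its degrees of freedom, R-squared is the model sum of squares over the total sum of squares, a t statistic is a coefficient divided by its standard error, and a standardized (beta) coefficient is the slope rescaled by the ratio of the standard deviations of X and Y. A minimal sketch using Stata's display command as a calculator, with every number copied from the output above:

```stata
* (1) Model MS = Model SS / Model df
display 9669.83254 / 1

* (2) R-squared = Model SS / Total SS
display 9669.83254 / 10025.875

* (3) t for the slope = Coef. / Std. Err.
display 2.651171 / .2076843

* (4) Beta = Coef. * (Std. Dev. of X) / (Std. Dev. of Y)
display 2.651171 * 14.01918 / 37.84531
```

As a consistency check for a bivariate regression, the square of (3) should match the reported F statistic and the square of (4) should match (2), up to rounding.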

After filling in the missing statistics, use all the information you have and do the following:


Part II. Multiple Regression

Data: K:\nd.edu\user22\yli\Public\593sp01\Data\hamilton\states90.dta

You are interested in examining the determinants of state median family income.  The data set given above contains the following variables: INCOME (state median family income), COLLEGE (percent of the state population over 25 years of age with a bachelor's degree or above), METRO (percent of the population living in a metropolitan area), and REGION (geographic region).  In Stata, use multiple regression to examine the effects of COLLEGE, METRO, and REGION on INCOME.

[Note: You will need to 1) eliminate Washington, DC from your analysis; and 2) dummy code REGION.]
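One way to write the model implied by this setup is sketched below. The symbols are illustrative assumptions, not notation from the course: J stands for the number of REGION categories, D_ij equals 1 if state i is in region j and 0 otherwise, and region 1 serves as the omitted reference category.

```latex
\[
\mathrm{INCOME}_i = \beta_0 + \beta_1\,\mathrm{COLLEGE}_i + \beta_2\,\mathrm{METRO}_i
  + \sum_{j=2}^{J} \gamma_j D_{ij} + \varepsilon_i
\]
```

Only J - 1 dummies enter the equation because including all J alongside the constant would make the regressors perfectly collinear. Each gamma_j is then the expected difference in INCOME between region j and the reference region, holding COLLEGE and METRO constant.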

Your write-up should include (not necessarily in this order):

Attach your Stata log file at the end.


Some Stata commands you need to know for this homework:

drop                  drop cases or variables
tab var, gen(stub)    generate dummy variables (stub1, stub2, ...) from var
regress Y X1 X2       regress variable Y on X1 and X2
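
The commands above might be combined along the following lines. This is only a sketch under stated assumptions: the lowercase variable names, the string variable state and its value "District of Columbia", the log file name hw2.log, the stub reg, and the count of four regions are all hypothetical and should be checked against the data (for example with describe and tab region) before running.

```stata
* Open a log to attach to the write-up (file name is hypothetical)
log using hw2.log, replace

* Load the data set named in Part II
use "K:\nd.edu\user22\yli\Public\593sp01\Data\hamilton\states90.dta", clear

* Drop Washington, DC (assumes a string variable state with this exact value)
drop if state == "District of Columbia"

* Create one dummy per category of region: reg1, reg2, ...
tab region, gen(reg)

* Regress income on college, metro, and all but one region dummy
* (assumes four regions; reg1 is the omitted reference category)
regress income college metro reg2 reg3 reg4

log close
```

Leaving out one dummy makes that region the baseline against which the other region coefficients are interpreted. Which category to omit is a substantive choice, not a technical one.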