Statistics with Stata 5 Data Files

Here are all the files used in Statistics with Stata 5.
They are compressed into one zip file, which you must decompress with WinZip or a similar utility: Click here to download sws5.zip.
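
If WinZip is not available, the archive can also be unpacked with a short script. The following is a minimal sketch in Python, assuming sws5.zip has already been downloaded to the current directory. The function name extract_datasets and the destination folder sws5_data are illustrative choices, not part of the course materials.

```python
import zipfile
from pathlib import Path


def extract_datasets(zip_path, dest_dir):
    """Unpack a zip archive into dest_dir.

    Returns the sorted names of the data files (.dta and .raw)
    found in the archive. All members are extracted, not only
    the data files.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    return sorted(n for n in names if n.lower().endswith((".dta", ".raw")))


# Assumption: sws5.zip sits in the current working directory.
# Nothing happens if the file is not there.
archive_path = Path("sws5.zip")
if archive_path.exists():
    for name in extract_datasets(archive_path, "sws5_data"):
        print(name)
```

The printed names can be checked against the chapter listings below to confirm that nothing is missing from the download.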

Note: The zipped file contains some data that we will not use in this course. Individual files are also available in decompressed form in my public directory: K:\nd.edu\user19\dmyers\Public\hamilton\

Datasets are listed below by the chapter where they are first introduced and documented.


Chapter 1: Stata and Stata resources

lofoten.dta
page 3.
Self-management of fishermen on Norway's Arctic Lofoten Islands.
Jentoft and Kristoffersen (1989).
10 obs., 5 vars.; (512 bytes)

Chapter 2: Data management

canada1.dta
page 22.
Data on Canada and its provinces.
13 obs., 5 vars.; (817 bytes)
canada2.dta
page 29.
Data on Canada and its provinces. Includes string variable.
13 obs., 7 vars.; (1,147 bytes)
canada.raw
page 37.
Data on Canada and its provinces in ASCII form.
(approx. 640 bytes)
nfresour.raw
page 39.
Natural resource production in Newfoundland.
ASCII data that is run together.
(approx. 120 bytes)
newf1.dta
page 41.
Newfoundland population data.
5 obs., 2 vars.; (225 bytes)
newf2.dta
page 41.
More Newfoundland population data, including the unemployed population.
6 obs., 3 vars.; (320 bytes)
newf4.dta
page 43.
Newfoundland births and divorces data.
15 obs., 3 vars.; (350 bytes)
growth1.dta
page 46.
Recent population growth in 5 eastern provinces of Canada.
5 obs., 5 vars.; (537 bytes)
growth3.dta
page 50.
Same as growth1.dta, but data in long form.
20 obs., 3 vars.; (462 bytes)
nfschool.dta
page 53.
Survey of 1,381 rural Newfoundland high-school students.
6 obs., 3 vars.; (316 bytes)

Chapter 3: Graphs

states90.dta
page 62.
Environment and education measures in 50 U.S. States and D.C.
51 obs., 21 vars.; (5,358 bytes)
whales.dta
page 67.
Time-series on worldwide catch of blue whales.
19 obs., 4 vars.; (477 bytes)
nhwater.dta
page 70.
Daily water consumption for Milford, New Hampshire, during the first half of 1983.
212 obs., 4 vars.; (1,597 bytes)
akethnic.dta
page 78.
Ethnic composition of Alaska.
3 obs., 7 vars.; (613 bytes)
micro.dta
page 80.
Measurements of microcomputer computing speed.
51 obs., 9 vars.; (3,197 bytes)
stats.dta
page 83.
Political party preference of 43 statistics students.
43 obs., 6 vars.; (978 bytes)
qual1.dta
page 90.
Quality-control dataset.
16 obs., 3 vars.; (308 bytes)
qual2.dta
page 91.
Quality-control dataset for illustrating rchart and xchart.
25 obs., 4 vars.; (725 bytes)

Chapter 4: Tables and summary statistics

vttown.dta
page 96.
Data on 153 residents of a town in Vermont concerning trace amounts of toxic chemicals discovered in the town's water supply.
153 obs., 7 vars.; (1,719 bytes)
sextab2.dta
page 111.
British survey on sexual behavior (Johnson et al. 1992).
48 obs., 4 vars.; (721 bytes)
college1.dta
page 112.
Information on 11 U.S. colleges.
11 obs., 5 vars.; (775 bytes)

Chapter 5: ANOVA and other comparison methods

writing.dta
page 116.
Data collected to evaluate a college writing course that employed microcomputers for word processing (Nash and Schwartz 1987).
24 obs., 9 vars.; (1,118 bytes)
student1.dta
page 119.
Survey of college undergraduates (Ward and Ault 1990).
243 obs., 19 vars.; (6,651 bytes)

Chapter 6: Linear regression analysis

(uses states90.dta introduced in Chapter 3, page 62.)


Chapter 7: Regression diagnostics

co2.dta
page 164.
Global warming (adapted from Brown, Kane, and Roodman 1994).
29 obs., 3 vars.; (550 bytes)

Chapter 8: Fitting curves

missile2.dta
page 187.
Cold War data from MacKenzie (1990).
48 obs., 6 vars.; (1,719 bytes)
ice.dta
page 191.
Subset of Greenland Ice Sheet Project 2 (Mayewski, Holdsworth, et al. 1993 and Mayewski, Meeker, et al. 1994) concerning 100,000 years of climate history.
272 obs., 3 vars.; (5,156 bytes)
tornado.dta
page 193.
U.S. tornados, 1916 to 1986 (Council on Environmental Quality 1988).
71 obs., 4 vars.; (1,035 bytes)
nations.dta
page 195.
Data on 109 countries.
109 obs., 15 vars.; (4,637 bytes)
nonlin2.dta
page 200.
Artificial data to demonstrate nonlinear regression.
100 obs., 5 vars.; (2,390 bytes)
lichen.dta
page 203.
Measurements of lichen growth observed on the Arctic island of Spitsbergen (from Werner 1990).
11 obs., 8 vars.; (1,113 bytes)

Chapter 10: Logistic regression

shuttle.dta
page 228.
Data on the first 25 flights of U.S. space shuttles (Report of the Presidential Commission on the Space Shuttle Challenger Accident, 1986).
25 obs., 6 vars.; (934 bytes)
shuttle2.dta
page 230 (not mentioned by name in the text).
Same as shuttle.dta but includes the additional variables described on page 230.
25 obs., 8 vars.; (1,139 bytes)
dover.dta
page 244.
Environmental-issues survey of registered voters in Dover, New Hampshire.
150 obs., 9 vars.; (2,929 bytes)

Chapter 11: Survival analysis and event-count models

aids.raw
page 253.
51 individuals diagnosed with HIV (Selvin 1995, 453).
Also see aids2.dta below for this data in Stata format.
(approx. 2,250 bytes)
diskdriv.dta
page 256.
Fictional data on test of 25 disk drives.
6 obs., 3 vars.; (284 bytes)
smoking.dta
page 259.
234 former smokers attempting to quit.
234 obs., 8 vars.; (3,425 bytes)
smoking0.dta
page 259.
234 former smokers attempting to quit.
(Same as smoking.dta, but with the Stata -stset- command already applied.)
234 obs., 8 vars.; (3,504 bytes)
aids2.dta
page 261.
51 individuals diagnosed with HIV (Selvin 1995, 453).
51 obs., 5 vars.; (771 bytes)
heart.dta
page 263.
Survival-time data on 35 patients with very high cholesterol levels (Selvin 1995, 436).
35 obs., 8 vars.; (1,085 bytes)
oakridge.dta
page 271.
Data on radiation exposure and cancer deaths among workers at Oak Ridge National Laboratory.
56 obs., 4 vars.; (799 bytes)
oakridg2.dta
page 277.
Same as oakridge.dta but with some added dummy variables.
56 obs., 13 vars.; (1,888 bytes)