Regression with Social Data

Modeling Continuous and Limited Response Variables

Regression models, in some form or another, are ubiquitous in social data analysis. Although classic linear regression assumes a continuous dependent variable, later
incarnations of the technique allowed the response to take on a variety of more
limited forms: binary, multinomial, truncated, censored, strictly integer, and others.
Increasingly, regression texts are incorporating some limited-dependent-variable
techniques—typically, binary response models—along with classic linear regression
in their coverage. However, other than in econometrics texts, it is rare to find regression models for the full spectrum of continuous and limited response variables treated
in one volume. This monograph aims to provide just such a treatment.
In particular, the first six chapters of the book parallel the coverage of the typical monograph on linear regression: an introduction to regression modeling
(Chapter 1), simple linear regression (Chapter 2), multiple linear regression
(Chapter 3), regression with categorical predictors (Chapter 4), regression with
nonlinear effects (Chapter 5), and finally, consideration of advanced topics such
as generalized least squares, omitted-variable bias, influence diagnostics, collinearity diagnostics, and alternatives to ordinary least squares for heavily collinear data
(Chapter 6). The second half, however, considers models for dependent variables
that are limited in one way or another. Examples of such data are event counts, categorical responses, truncated responses, and censored responses. The second half
of the book therefore covers binary response models (Chapter 7),
multinomial response models (Chapter 8), censored and truncated regression
(Chapter 9), regression models for count data (Chapter 10), an introduction to survival
analysis (Chapter 11), and multistate, multiepisode, and interval-censored survival
models (Chapter 12).
The book is intended both as a reference for data analysts working primarily with
social data and as a graduate-level text for students in the social and behavioral sciences. As a text, it is most suited to a two-course sequence in regression. As an example, I normally employ the material in Chapters 1 through 7 for a doctoral-level
course on regression analysis. This course focuses primarily on linear regression but
includes an introduction to binary response models. In a more advanced course on
regression with limited dependent variables, I use Chapters 2 through 4 to review the
multiple linear regression model and then use Chapters 7 through 12 for the heart
of the course. On the other hand, a survey of regression-like models using the generalized linear model as the guiding framework might conceivably employ Chapters 1
through 5, and then 7 through 10. Other chapter combinations are also possible.

By Dr. SK