CONTENTS
CHAPTER 1 INTRODUCTION: REGRESSION ANALYSIS = 1
1.1 Regression models = 3
1.2 Formal uses of regression analysis = 5
1.3 The data base = 6
References = 7
CHAPTER 2 THE SIMPLE LINEAR REGRESSION MODEL = 8
2.1 The model description = 8
2.2 Assumptions and interpretation of model parameters = 9
2.3 Least squares formulation = 12
2.4 Maximum likelihood estimation = 20
2.5 Partitioning total variability = 22
2.6 Tests of hypotheses on slope and intercept = 26
2.7 Simple regression through the origin (Fixed intercept) = 33
2.8 Quality of fitted model = 37
2.9 Confidence intervals on mean response and prediction intervals = 41
2.10 Simultaneous inference in simple linear regression = 47
2.11 A complete annotated computer printout = 56
2.12 A look at residuals = 57
2.13 Both x and y random = 66
Exercises = 72
References = 80
CHAPTER 3 THE MULTIPLE LINEAR REGRESSION MODEL = 82
3.1 Model description and assumptions = 82
3.2 The general linear model and the least squares procedure = 85
3.3 Properties of least squares estimators under ideal conditions = 91
3.4 Hypothesis testing in multiple linear regression = 95
3.5 Confidence intervals and prediction intervals in multiple regression = 112
3.6 Data with repeated observations = 116
3.7 Simultaneous inference in multiple regression = 120
3.8 Multicollinearity in multiple regression data = 123
3.9 Quality of fit, quality of prediction, and the HAT matrix = 133
3.10 Categorical or indicator variables (Regression models and ANOVA models) = 135
Exercises = 153
References = 163
CHAPTER 4 CRITERIA FOR CHOICE OF BEST MODEL = 164
4.1 Standard criteria for comparing models = 165
4.2 Cross validation for model selection and determination of model performance = 167
4.3 Conceptual predictive criteria (The <TEX>$$C_p$$</TEX> statistic) = 178
4.4 Sequential variable selection procedures = 185
4.5 Further comments and all possible regressions = 193
Exercises = 199
References = 206
CHAPTER 5 ANALYSIS OF RESIDUALS = 209
5.1 Information retrieved from residuals = 210
5.2 Plotting of residuals = 211
5.3 Studentized residuals = 217
5.4 Relation to standardized PRESS residuals = 220
5.5 Detection of outliers = 221
5.6 Diagnostic plots = 231
5.7 Normal residual plots = 242
5.8 Further comments on analysis of residuals = 244
Exercises = 244
References = 248
CHAPTER 6 INFLUENCE DIAGNOSTICS = 249
6.1 Sources of influence = 250
6.2 Diagnostics: Residuals and the HAT matrix = 251
6.3 Diagnostics that determine extent of influence = 257
6.4 Influence on performance = 267
6.5 What do we do with high influence points? = 270
Exercises = 272
References = 273
CHAPTER 7 NONSTANDARD CONDITIONS, VIOLATIONS OF ASSUMPTIONS, AND TRANSFORMATIONS = 275
7.1 Heterogeneous variance: Weighted least squares = 277
7.2 Problem with correlated errors (Autocorrelation) = 287
7.3 Transformations to improve fit and prediction = 293
7.4 Regression with a binary response = 315
7.5 Further developments in models with a discrete response (Poisson regression) = 332
7.6 Generalized linear models = 339
7.7 Failure of normality assumption: Presence of outliers = 348
7.8 Measurement errors in the regressor variables = 357
Exercises = 358
References = 365
CHAPTER 8 DETECTING AND COMBATING MULTICOLLINEARITY = 368
8.1 Multicollinearity diagnostics = 369
8.2 Variance proportions = 371
8.3 Further topics concerning multicollinearity = 379
8.4 Alternatives to least squares in cases of multicollinearity = 389
Exercises = 419
References = 422
CHAPTER 9 NONLINEAR REGRESSION = 424
9.1 Nonlinear least squares = 425
9.2 Properties of the least squares estimators = 425
9.3 The Gauss-Newton procedure for finding estimates = 426
9.4 Other modifications of the Gauss-Newton procedure = 433
9.5 Some special classes of nonlinear models = 436
9.6 Further considerations in nonlinear regression = 440
9.7 Why not transform data to linearize? = 444
Exercises = 445
References = 449
APPENDIX A SOME SPECIAL CONCEPTS IN MATRIX ALGEBRA = 452
A.1 Solutions to simultaneous linear equations = 452
A.2 Quadratic form = 454
A.3 Eigenvalues and eigenvectors = 456
A.4 The inverse of a partitioned matrix = 458
A.5 Sherman-Morrison-Woodbury theorem = 459
References = 460
APPENDIX B SOME SPECIAL MANIPULATIONS = 461
B.1 Unbiasedness of the residual mean square = 461
B.2 Expected value of residual sum of squares and mean square for an underspecified model = 462
B.3 The maximum likelihood estimator = 464
B.4 Development of the PRESS statistic = 465
B.5 Computation of <TEX>$$s_{-i}$$</TEX> = 467
B.6 Dominance of a residual by the corresponding model error = 468
B.7 Computation of influence diagnostics = 468
B.8 Maximum likelihood estimator in the nonlinear model = 470
B.9 Taylor series = 470
B.10 Development of the <TEX>$$C_k$$</TEX>-statistic = 471
References = 473
APPENDIX C STATISTICAL TABLES = 474
INDEX = 486