An introduction to multivariate statistical analysis.

*(English)* Zbl 0083.14601
Wiley Publications in Statistics. New York: John Wiley & Sons, Inc.; London: Chapman & Hall, Ltd. XII, 374 p. (1958).

This very useful book pulls together a large amount of information scattered over the literature: the bibliography (which the reviewer found not quite complete in a few subjects) contains some 450 titles. Even so, its author correctly remarks that the book does not treat certain topics in the field. Of the simpler topics that are omitted we mention: multivariate Chebyshev inequality, the theory of so-called categorical data, distribution-free methods. The author justifies the almost exclusive use of the normal distribution by the remark (which the reviewer would like to have seen corroborated) that the multivariate normal distributions have been found to be sufficiently close approximations to the populations, and by the fact that normal theory is amenable to exact mathematical treatment. The chapter headings are as follows:

I. Introduction; II. the multivariate normal distribution (theorem 2.6.3 on the derivation of the density from the characteristic function omits the condition that the latter should be absolutely integrable; pp. 25–27 give a careful discussion of the singular normal distribution);

III. estimation of the mean vector and the covariance matrix (maximum likelihood estimation, with some nice general properties of maximum likelihood estimators, and confidence regions);

IV. the distribution and uses of sample correlation coefficients (proofs are neatly done by means of conditional distributions and do not depend on the Wishart distribution; Hotelling’s very useful alternative form of the distribution of the correlation coefficient is given in some detail, and there is a neat argument leading to “Fisher’s z”, although the rapidity of its approach to normality is not explained);

V. the generalized \(T^2\)-statistics (with, amongst other things, a section on non-central chi-square and \(F\)-distributions, and on Bennett’s extension of Scheffé’s solution of the Behrens-Fisher problem);

VI. classification of observations (Bayes and minimax procedures; a large part of this chapter holds for general distributions);

VII. the distribution of the sample covariance matrix and the sample generalized variance;

VIII. testing the general linear hypothesis, analysis of variance (a multivariate analogue of the \(F\)-test, an extensive discussion of Box’s theory of the asymptotic expansion of its distribution, an illuminating account of invariant tests of the linear hypothesis);

IX. testing independence of sets of variables;

X. testing hypotheses of equality of covariance matrices and equality of mean vectors and covariance matrices (much of the distribution theory of Chs. IX and X is a further development and application of the distribution theory in Ch. VIII);

XI and XII give the definition of principal components, canonical correlations, canonical variables and their estimators, while XIII shows that the distribution theory of Chs. VIII through XII largely depends on the distribution of the roots of \(| A - \lambda B| = 0\) (\(A\) and \(B\) Wishart matrices) and a few related equations (it might be worthwhile to inquire whether the treatment of these chapters could be shortened and unified by means of an earlier presentation of the contents of Ch. XIII);

XIV, a (very short) review of some other work in multivariate analysis (amongst other things: factor analysis, stochastic equations, time series analysis).
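The determinantal equation \(| A - \lambda B| = 0\) mentioned under Ch. XIII can be explored numerically. The following is a minimal sketch, not taken from the book: it assumes NumPy is available, the helper names `wishart_sample` and `generalized_roots` are hypothetical, and the dimensions and degrees of freedom are arbitrary. It draws two Wishart matrices with identity scale and obtains the roots from a Cholesky factorization of \(B\).

```python
import numpy as np

def wishart_sample(rng, p, n):
    # X'X for an n-by-p matrix X of independent N(0, 1) entries:
    # a p-by-p Wishart matrix with identity scale and n degrees of freedom.
    x = rng.standard_normal((n, p))
    return x.T @ x

def generalized_roots(a, b):
    # Roots of |A - lambda B| = 0 for symmetric A and positive definite B.
    # With B = L L' (Cholesky), they are the eigenvalues of the symmetric
    # matrix L^{-1} A L^{-T}.
    chol = np.linalg.cholesky(b)
    half = np.linalg.solve(chol, a)          # L^{-1} A
    sym = np.linalg.solve(chol, half.T).T    # L^{-1} A L^{-T}
    sym = (sym + sym.T) / 2                  # remove rounding asymmetry
    return np.linalg.eigvalsh(sym)           # ascending order

# Illustrative sizes only (assumptions, not values from the book).
rng = np.random.default_rng(0)
p = 3
a = wishart_sample(rng, p, 10)
b = wishart_sample(rng, p, 12)
roots = generalized_roots(a, b)

for lam in roots:
    # A - lambda B should be numerically singular at each root.
    smallest = np.linalg.svd(a - lam * b, compute_uv=False)[-1]
    print(f"lambda = {lam:.4f}, smallest singular value = {smallest:.2e}")
```

Reducing the problem to a symmetric eigenvalue problem keeps the computed roots real; forming \(B^{-1}A\) directly would give the same roots in exact arithmetic but a non-symmetric matrix.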

Finally there is a useful appendix on matrix algebra. The various chapters have short introductory sections which give a lucid account of what the chapter will be about; they are concluded by several pages of problems ranging from elementary to quite advanced.

Throughout the book the author gives much attention to geometric interpretations (e.g. in connection with ordinary and multiple correlation coefficients, the \(T^2\)-statistic (p. 104), the sample generalized variance (pp. 167–170), estimation of regression coefficients (p. 181)), and to optimum properties of the tests discussed, in particular to invariance properties (e.g. in connection with ordinary and multiple correlation coefficients, regression coefficients, the \(T^2\)-test, testing equality of covariance matrices, and testing independence of sets of variates). The likelihood ratio criterion is used as an elegant, unifying principle to derive test statistics.

It is only natural that in a book of this scope some desires go unfulfilled. For instance, the reviewer feels that the characteristic function has been too much neglected as an elegant tool by which a number of proofs could have been shortened considerably. Some statements may puzzle the reader: on p. 17 the statement that the covariance matrix is positive definite is not proved, and neither is the statement (p. 22) that the implication of independence by lack of correlation depends on the assumption of normality. Some proofs could be made simpler or more direct, for example on p. 136 the maximization of the function of \(d\) which leads to Mahalanobis’ generalized distance, and on p. 164 the proof of Cochran’s theorem. There is no difference between the notations for random matrices and random vectors; this will sometimes lead to confusion.
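The remark concerning p. 22 can be illustrated by a standard counterexample, which is not taken from the book. Assume \(X\) is standard normal and put \(Y = X^2 - 1\), so that the pair \((X, Y)\) is not jointly normal. Then \(X\) and \(Y\) are uncorrelated but not independent:

```latex
\[
\operatorname{Cov}(X, Y) = E[XY] - E[X]\,E[Y] = \bigl(E[X^3] - E[X]\bigr) - 0 = 0,
\]
\[
P\bigl(|X| \le 1,\ Y > 0\bigr) = 0 \ne P\bigl(|X| \le 1\bigr)\,P(Y > 0).
\]
```

The first line uses \(E[X] = E[X^3] = 0\) and \(E[Y] = E[X^2] - 1 = 0\); the second holds because \(Y > 0\) is the same event as \(|X| > 1\).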

Finally, although there are several useful sections on numerical computations (e.g. in connection with regression coefficients, where the Doolittle method and pivotal condensation are shown to be essentially the same, the \(T^2\)-statistic, and latent roots and vectors), little attention seems to have been devoted to the effects of ill-conditioned matrices.

On the whole, however, this is a very useful and stimulating book.

Reviewer: H. R. van der Paart.

##### MSC:

62-01 | Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics |