Book Review: OpenIntro Statistics, 4th ed.
Title: OpenIntro Statistics (4th ed)
Authors: David M Diez, Mine Çetinkaya-Rundel, Christopher D Barr
Publisher: OpenIntro, Inc.
ISBN-13: 978-1943450077
Formats available: Paperback & PDF
The fourth edition of Diez et al.’s OpenIntro Statistics (OS4) was released in May 2019. As the name suggests, OS4 is an open educational resource, meaning that the source material is published under an open license and is free to use and distribute. The low cost to students was the primary reason I chose the previous edition of this text for an undergraduate introductory biostatistics course I developed in 2016.
In many respects, OS4 is a conventional introductory statistics textbook. Initial chapters discuss data and sampling, descriptive statistics, probability, and random variables. These foundations are followed by three chapters on statistical inference, introducing the ideas of sampling variation, confidence intervals, and hypothesis tests. The next two chapters focus on specific methods for categorical and numerical data, such as z-tests, chi-squared tests, t-tests, and ANOVA. The final two chapters describe regression, with one chapter devoted to correlation and simple linear regression and the second devoted to multiple and logistic regression. The regression chapters, particularly the last, are a bit cursory. For example, the section on model selection describes forward selection and backward elimination using adjusted R-squared as the criterion, without mentioning the potential limitations of these methods or any alternatives. The section on logistic regression is also brief, and the interpretation of logistic regression coefficients as adjusted log odds ratios is never introduced. The discussions of these methods lack some nuance; however, the book is aimed at introductory courses, and many courses (mine included) don’t even get to this material.
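For readers unfamiliar with the log odds ratio interpretation mentioned above, here is a minimal sketch of what it means. The coefficient values and predictor names are hypothetical, chosen only for illustration; they do not come from OS4 or from any fitted model. The point is that when two subjects differ by one unit in a single predictor and all other predictors are held fixed, the ratio of their odds equals the exponentiated coefficient.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Return P(y = 1 | x) under a logistic regression model."""
    linear = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical coefficients (assumptions for illustration only).
intercept = -2.0
coefs = [0.7, -0.3]  # e.g. an exposure indicator and age in decades

# Two subjects who differ by one unit in the first predictor,
# with the second predictor held fixed at the same value.
p_unexposed = predicted_probability(intercept, coefs, [0, 4.0])
p_exposed = predicted_probability(intercept, coefs, [1, 4.0])

odds_ratio = odds(p_exposed) / odds(p_unexposed)
log_odds_ratio = math.log(odds_ratio)

print(f"odds ratio: {odds_ratio:.4f}")
print(f"log odds ratio: {log_odds_ratio:.4f}")
```

The log odds ratio recovered here matches the first coefficient no matter what value the second predictor is fixed at, which is why the coefficient is described as "adjusted" for the other predictors.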
As with most introductory textbooks, there are exercises at the end of each section and chapter. Solutions to odd-numbered problems are in an appendix. The exercises are well written and interesting, with many drawing on recent research or news stories for context. There are enough health-related exercises that someone teaching a biostatistics course could, without much effort, assign only topic-relevant exercises (as opposed to examples from business, marketing, or engineering). The biggest drawback of the exercises is that most focus on interpreting statistical context and results and involve little applied calculation. Whether this is a problem depends on the focus of the course. In my course, students are expected to learn how to perform these methods in software, and the textbook exercises do not provide much practice in that area. OS4 does include online “lab” supplements (also free), which require more applied work.
In over ten years of teaching introductory statistics, I have not yet found a textbook that suits me perfectly. OS4 does not do so either, but it comes as close as, or closer than, any other book I have found. It is very well written and straightforward for students to follow. For these reasons, along with the low cost for students, I suggest that those searching for a new introductory textbook consider it.
The OpenIntro website offers numerous resources, including links to the PDF (hosted by Leanpub; $15 suggested price, free minimum) and softcover (Amazon; $20 B&W/$60 color list price) versions of the text; R and SAS labs; and resources for MyOpenMath, a free service for online course materials such as homework. On the site you’ll also find three similar books by the same authors: one for AP statistics, one incorporating randomization and simulation-based inference, and one (preliminary edition) for biostatistics and health sciences.