“Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks,” by Will Kurt (No Starch Press, 2019) is an excellent introduction to subjects critical to all data scientists. Will Kurt, in fact, is a data scientist! I always advise my data science classes at UCLA to engage with these important subjects in order to obtain well-rounded exposure to the disciplines upon which data science is based. I’ve already added this title to the official bibliography of learning resources I give to my students.
The book is touted as a beginner’s guide to applying statistics to real-world situations, with exercises that place the reader in fun and familiar hypothetical situations to demonstrate the practical applications of Bayes’ Theorem. I think of the book as a fun toolkit written by a data scientist for newbie data scientists. You’ll find that the tools and techniques make intuitive sense and are useful for turning abstract or limited data into practical information. The book has an associated website that includes solutions to the exercises at the end of each chapter.
“Everyone can benefit from thinking about problems in a Bayesian way,” said author Will Kurt. “I wanted this book to be something someone could study on a plane flight and land able to make solid choices that involve probabilities and uncertainty.”
I’ve seen a number of other books that endeavor to introduce these topics, but Kurt’s book is something special in the easy-to-understand approach it takes throughout. There is little mathematics, which is entirely appropriate for the intended reader (there’s plenty of time for a more mathematically rigorous treatment down the line). There is a modest amount of R programming code (starting in Chapter 4) gently integrated into the text, along with Appendix A, which contains a very brief intro to R (more on this below). Bear in mind, this is not a programming book, but rather a descriptive book.
A Shining Resource for Learning
One way the book shines is in its very lucid descriptions of rather technical topics. One of my favorite chapters was Chapter 6 on conditional probability, where Bayes’ Theorem is introduced. It is one of the best overviews I’ve read (certainly a prerequisite to reading Bayes’ original paper from 1763). I also liked Chapter 8, which expands on the use of Bayesian reasoning. The subsequent discussions of parameter estimation and hypothesis testing are also very useful for any budding data scientist.
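For readers who have not met it yet, here is the theorem in its usual textbook form. The symbols are my own choice for illustration (H for a hypothesis, D for the observed data) and are not necessarily the notation the book uses:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```

In words: the posterior probability of the hypothesis given the data equals the likelihood of the data under that hypothesis times the prior probability of the hypothesis, divided by the overall probability of the data. These are the same posterior, likelihood, and prior that give Chapter 8 its title.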
Another way the book shines is with all the simple and familiar use case examples that serve to solidify the reader’s knowledge of the Bayesian way of problem solving. Take for instance these thoroughly worked-out examples: C-3PO’s odds of successfully navigating an asteroid field in Star Wars, reasoning about LEGO bricks, the Mystic Seer in the Twilight Zone episode “Nick of Time,” and the fairness of carnival games (e.g. a pool of rubber ducks). Most readers can readily embrace these examples, and this approachability lowers the barrier to the subject.
The following chapters are included:
Part 1: Introduction to Probability
Chapter 1: What Do You Believe and How Do You Change it?
Chapter 2: Measuring Uncertainty
Chapter 3: The Logic of Uncertainty
Chapter 4: Probability Distributions 1
Chapter 5: Probability Distributions 2
Part 2: Bayesian Probability and Prior Probabilities
Chapter 6: Conditional Probability
Chapter 7: Bayes’ Theorem with LEGO
Chapter 8: Posterior, Likelihood, and Prior
Chapter 9: Working with Prior Probability Distributions
Part 3: Parameter Estimation
Chapter 10: Intro to Parameter Estimation
Chapter 11: Measuring the Spread of Data
Chapter 12: Normal Distribution and Confidence
Chapter 13: Tools of Parameter Estimation
Chapter 14: Parameter Estimation with Priors
Part 4: Hypothesis Testing: The Heart of Statistics
Chapter 15: From Parameter Estimation to Hypothesis Testing
Chapter 16: Comparing Hypotheses with Bayes Factor
Chapter 17: Bayesian Reasoning in the Twilight Zone
Chapter 18: When Data Doesn’t Convince You
Chapter 19: From Hypothesis Testing to Parameter Estimation
Appendix A: A Crash Course in R
Appendix B: Enough Calculus to Get By
A Couple of Small Caveats
There is really no downside to this book, but if I had to choose something to change, it would be to omit the two appendices. Appendix A, “A Quick Introduction to R,” is too brief to be of any real use to a reader who knows nothing of R. I’m a data scientist who uses R to a considerable extent, and I teach R in my data science classes, where it takes at least four lectures of instruction for students to get up to speed and become productive. I would have simply recommended tutorials, classes, or videos rather than trying to teach R in an 18-page chapter. Appendix B, “Enough Calculus to Get By,” is even less useful. Seriously? Calculus in 12 pages? Curiously, the reader of this book doesn’t even really need to understand calculus, so why bother?
Conclusion
Bayesian Statistics the Fun Way proves that statistics doesn’t have to be boring and restores it to its rightful place as a living science that can benefit everybody. I highly recommend this book to all, especially early-stage data scientists!