Democratizing Baseball Analytics Education
A free, comprehensive, open-source textbook teaching modern baseball analytics using R and Python. Built for everyone from passionate fans to aspiring MLB analysts.
The MLB Analytics Textbook exists to make professional-quality baseball analytics education accessible to everyone, regardless of background or budget.
Inspired by the success of nflanalytic.com (Brad Congelio's Introduction to NFL Analytics with R) and the open-source nflverse community, we aim to provide the same comprehensive, accessible learning experience for baseball.
Baseball has a rich history of analytics innovation spanning more than 40 years. From Bill James's original Baseball Abstracts in the late 1970s and 1980s, to the Moneyball revolution in Oakland, to today's Statcast era, baseball has long been at the forefront of sports analytics. This textbook synthesizes that heritage with modern data science practices.
Our Philosophy
We believe that analytics knowledge should not be locked behind expensive courses or proprietary systems. By making this education freely available, we aim to:
- Expand the pool of qualified baseball analysts
- Enable data-driven journalism and content creation
- Deepen fans' appreciation of the game
- Advance the collective understanding of baseball
This textbook covers the complete spectrum of baseball analytics, from foundational programming skills to advanced machine learning applications.
Programming Fundamentals
Build a solid foundation in both R and Python
- R with tidyverse (dplyr, ggplot2, purrr)
- Python with pandas and numpy
- Data wrangling best practices
- Visualization techniques
Traditional Sabermetrics
Master the foundational metrics
- wOBA, wRC+, and linear weights
- FIP, xFIP, and SIERA
- WAR calculation and interpretation
- Park factors and adjustments
Statcast Analytics
Harness modern tracking data
- Exit velocity and launch angle
- Spin rate and pitch movement
- Expected statistics (xwOBA, xBA, xSLG)
- Sprint speed and baserunning
Machine Learning
Build predictive models
- Player performance projections
- Pitch classification models
- Custom metric development
- Model evaluation techniques
Fielding & Catcher Analytics
The hardest problems in baseball
- Outs Above Average (OAA)
- Catch probability models
- Framing runs and catcher defense
- Sprint speed applications
Interactive Applications
Build and deploy dashboards
- R Shiny applications
- Python Streamlit dashboards
- Interactive Plotly visualizations
- Deployment to the cloud
This textbook is designed to be accessible to beginners while providing enough depth to challenge experienced analysts. Whether you're just curious about analytics or preparing for a career in baseball operations, you'll find value here.
Aspiring Front Office Analysts
Build the exact skills MLB teams look for. Learn industry-standard tools, develop portfolio projects, and understand the analytical methods used in player evaluation and team strategy.
Fantasy Baseball Enthusiasts
Gain a competitive edge through data-driven player evaluation. Learn to identify breakout candidates, understand regression patterns, and optimize your draft strategy with analytical tools.
Sports Journalists & Content Creators
Tell compelling, data-driven stories. Learn to find insights in the numbers, build clear visualizations, and communicate complex analytics concepts to general audiences.
Students & Researchers
Perfect for sports analytics courses, capstone projects, or independent study. Combines rigorous statistical methodology with practical applications using real-world data.
Passionate Baseball Fans
Deepen your understanding and appreciation of the game. See baseball through the lens of data and understand the strategy behind every managerial decision and front office move.
Learn Concepts
Each chapter explains the theory and methodology behind key analytics concepts with clear prose.
Study Code
Every concept includes working code examples in both R and Python using real MLB data.
Practice
End-of-chapter exercises reinforce your learning with hands-on problems to solve.
Following the model pioneered by the nflverse community in football analytics, this project is completely free and open source. All code is publicly available, and contributions are welcome from the community.
We believe that democratizing access to analytics education benefits everyone: fans, analysts, journalists, and the sport itself. When more people understand analytics, the conversation around baseball becomes richer and more informed.
100% Free
No subscriptions, no paywalls, no hidden costs. Free forever.
Open Source
All code is public. Fork it, improve it, make it your own.
Community Driven
Contributions welcome. Help us improve and expand.
This project stands on the shoulders of giants. We are deeply grateful to the individuals and organizations who have made baseball analytics accessible to all.
Have questions or suggestions, or want to contribute? We'd love to hear from you! Whether you've found a bug, have an idea for new content, or just want to say hello, please reach out.