Chapter 5: Traditional Sabermetrics

Sabermetrics revolutionized baseball analysis by asking a fundamental question: how do we objectively measure player value? This chapter explores the foundational metrics that changed how we evaluate players, from understanding run values to calculating comprehensive metrics like WAR (Wins Above Replacement). We'll examine the philosophy behind these measurements, implement them in code, and learn to interpret their meaning in context.

What You'll Learn
  • The Philosophy of Sabermetrics
  • Rate Statistics and Denominators
  • Linear Weights
  • Weighted On-Base Average (wOBA)

5.1 The Philosophy of Sabermetrics

5.1.1 First Principles: What Are We Measuring?

At its core, baseball is about winning games. Teams win games by scoring more runs than they allow. Therefore, any meaningful player evaluation metric must ultimately connect to run production or run prevention.

This simple chain of logic forms the foundation of sabermetrics:

The Value Chain:

  1. Ultimate Goal: Win games
  2. Proximate Goal: Outscore opponents (runs)
  3. Component Skills: Get on base, advance runners, prevent runs
  4. Observable Events: Hits, walks, outs, strikeouts, etc.

Traditional batting average (AVG) breaks this chain: it ignores walks, which are valuable, and it treats all hits equally even though a double is worth more than a single. Sabermetrics seeks to weight each event by its actual run value.

Breaking Down Runs:

Research by Pete Palmer, Bill James, Tom Tango, and others has shown that we can assign a run value to every event that occurs in baseball. A home run is worth approximately 1.4 runs, a double around 0.8 runs, a walk about 0.3 runs, and an out roughly -0.3 runs. These values vary slightly by league and era, but the principle remains constant: measure what matters for run scoring.

5.1.2 Context and Adjustment: League Environment, Park Factors, Era Adjustments

Baseball is not played in a vacuum. A .280 batting average in the pitcher-dominated 1968 season (when the league averaged .237) is far more impressive than .280 in the offensive explosion of 2000 (league average .270). Similarly, hitting at Coors Field in Denver offers significant advantages compared to Oracle Park in San Francisco.

Three Critical Adjustments:

League Environment: Offense fluctuates dramatically across eras. The 2023 MLB batting average was .248, but it was .271 in 1999 and .237 in 1968. Context-neutral metrics adjust for these league-wide differences.

Park Factors: Ballparks differ in dimensions, altitude, wall heights, and prevailing winds. Coors Field inflates run scoring by approximately 15%, while Oracle Park suppresses it by about 10%. Advanced metrics adjust for these differences to compare players on equal footing.

Era Adjustments: When comparing players across decades, we must account for rule changes, equipment differences, league expansion, and talent pool variations. A 150 wRC+ (50% better than average) represents elite performance in any era.

5.1.3 Descriptive vs. Predictive: Actual vs. Expected vs. Projections

Sabermetrics distinguishes between three types of statistics:

Descriptive (What Happened): These stats measure actual results. ERA describes the runs a pitcher actually allowed. Batting average describes the hits a batter actually collected. These are historical facts.

Predictive (What Should Happen): These stats estimate true talent by removing luck and variance. FIP (Fielding Independent Pitching) estimates what a pitcher's ERA should have been based on strikeouts, walks, and home runs—events the pitcher controls. BABIP (Batting Average on Balls in Play) helps identify lucky or unlucky hitters.

Projections (What Will Happen): Systems like Steamer, ZiPS, and THE BAT project future performance using historical data, aging curves, and regression to the mean. These forecast future statistics.

Understanding these distinctions prevents analytical errors. A pitcher with a 2.50 ERA and 4.00 FIP probably got lucky and will regress. A hitter with a .350 BABIP (league average ~.300) likely benefited from good fortune.
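
The ERA-versus-FIP comparison can be made concrete with a short sketch. The chapter has not spelled out the FIP formula, so this assumes the commonly published form, (13×HR + 3×(BB + HBP) - 2×K) / IP + constant. The constant of 3.20 is an illustrative placeholder (the real value is recalculated every season), and the pitcher's stat line is hypothetical.

```python
def fip(hr, bb, hbp, k, ip, constant=3.20):
    """Fielding Independent Pitching from the events a pitcher controls.

    `constant` is an assumed, illustrative league constant; the published
    value changes every season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: a 2.50 ERA, but peripherals that point higher
era = 2.50
estimate = fip(hr=22, bb=55, hbp=5, k=170, ip=190.0)
gap = estimate - era

print(f"ERA {era:.2f}, FIP {estimate:.2f}, gap {gap:+.2f}")
```

A large positive gap like this one is the pattern the text describes: the run prevention on record looks better than the strikeout, walk, and home run totals would suggest.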



5.2 Rate Statistics and Denominators

5.2.1 Plate Appearances vs. At-Bats: The Critical Distinction

The choice of denominator fundamentally shapes a statistic's meaning. This is nowhere more critical than the distinction between plate appearances (PA) and at-bats (AB).

At-Bats (AB): Plate appearances that resulted in a hit, out, or error. Excludes walks, hit-by-pitches, sacrifice flies, and sacrifice bunts.

Plate Appearances (PA): Every time a batter completes a turn at the plate. Includes everything.

Why This Matters:

Batting average uses at-bats as the denominator: AVG = H / AB. This creates a perverse incentive—walking doesn't help your batting average, and in fact, if you walk frequently, you get fewer opportunities to raise your average. A player who walks 150 times per season has 150 fewer chances to improve their batting average.

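On-base percentage uses nearly every plate appearance: OBP = (H + BB + HBP) / (AB + BB + HBP + SF). The denominator is PA minus sacrifice bunts and catcher's interference, so it is often abbreviated to PA. This correctly treats walks as valuable events that help your team score runs.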

The Formula Difference:

AVG = H / AB
OBP = (H + BB + HBP) / (AB + BB + HBP + SF)

Mike Trout's 2023 season illustrates this perfectly:


  • 254 AB, 67 H = .264 AVG (ordinary; only modestly above the .248 league average)
  • 326 PA, 67 H + 58 BB + 2 HBP = .390 OBP (elite)

Using AB as the denominator for AVG makes walks invisible. Using PA for OBP correctly accounts for them.
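
To see how the denominator changes the picture, here is a minimal sketch using the two formulas above. Both hitters are hypothetical and identical except for their walk totals; the function names are illustrative.

```python
def batting_average(h, ab):
    """AVG = H / AB."""
    return h / ab


def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


# Two hypothetical hitters with the same hits and at-bats, different walks
free_swinger = dict(h=130, ab=500, bb=30, hbp=2, sf=4)
patient = dict(h=130, ab=500, bb=100, hbp=2, sf=4)

for label, line in (("Free swinger", free_swinger), ("Patient hitter", patient)):
    avg = batting_average(line["h"], line["ab"])
    obp = on_base_percentage(**line)
    print(f"{label}: AVG {avg:.3f}, OBP {obp:.3f}")
```

Both hitters show the same batting average, but the extra 70 walks separate them by roughly 80 points of OBP.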

5.2.2 Common Rate Statistics

Here's a comprehensive table of essential rate statistics:

Statistic | Formula | Denominator | What It Measures | League Avg (2023)
AVG | H / AB | At-Bats | Hits per at-bat (walks, HBP, and sacrifices excluded) | .248
OBP | (H + BB + HBP) / (AB + BB + HBP + SF) | Plate Appearances (less sacrifice bunts) | Rate of reaching base safely | .320
SLG | TB / AB | At-Bats | Power; total bases per at-bat | .412
OPS | OBP + SLG | Mixed | Combined on-base and power (imperfect) | .732
ISO | SLG - AVG | At-Bats | Isolated power; extra bases per at-bat | .164
BABIP | (H - HR) / (AB - K - HR + SF) | Balls in Play | Hit rate on balls put in play | .295
K% | K / PA | Plate Appearances | Strikeout rate | 22.7%
BB% | BB / PA | Plate Appearances | Walk rate | 8.7%
HR/FB% | HR / FB | Fly Balls | Home run rate on fly balls | 13.5%

Key Insights:

  • ISO (Isolated Power) removes the singles component from slugging, measuring only extra-base power. A .200 ISO is excellent; .250+ is elite.
  • BABIP typically regresses toward .300 for most hitters. Values significantly above .340 or below .260 often indicate luck.
  • K% and BB% are among the most stable and predictive offensive metrics. They stabilize quickly and correlate strongly with future performance.
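
The formulas in the table translate directly into code. The sketch below computes several of them from counting stats; the stat line is hypothetical and the function name rate_stats is illustrative.

```python
def rate_stats(pa, ab, h, doubles, triples, hr, bb, k, sf):
    """Compute common rate statistics from counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    slg = total_bases / ab
    return {
        "AVG": avg,
        "SLG": slg,
        "ISO": slg - avg,                        # extra bases per at-bat
        "BABIP": (h - hr) / (ab - k - hr + sf),  # hits on balls in play
        "K%": k / pa,
        "BB%": bb / pa,
    }


# Hypothetical full-season line
line = rate_stats(pa=620, ab=550, h=150, doubles=30, triples=3,
                  hr=25, bb=60, k=120, sf=5)

for name, value in line.items():
    print(f"{name}: {value:.3f}")
```

Note that ISO can also be read as (2B + 2×3B + 3×HR) / AB, which is why it ignores singles entirely.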

5.2.3 Stabilization Rates: How Many PA for Each Stat to Stabilize

Not all statistics stabilize at the same rate. Some metrics reveal a player's true talent quickly; others require enormous sample sizes. Research by Russell Carleton (formerly of Baseball Prospectus) estimated the approximate number of plate appearances each statistic needs to become 50% reliable, meaning that half of the observed variance reflects true talent and half reflects luck.

Stabilization Rates Table:

Statistic | PAs to Stabilize (50% reliability) | Interpretation
K% | 60 | Stabilizes very quickly; ~2 weeks
BB% | 120 | Stabilizes quickly; ~1 month
HR/FB% | 150 | Relatively quick
GB/FB ratio | 80 | Quick stabilization
ISO | 160 | Moderately quick
BABIP | 820 | Very slow; influenced by defense/luck
AVG | 910 | Very slow; noisy metric
OBP | 460 | Moderate
SLG | 320 | Moderate
wOBA | 240 | Faster than traditional stats

Practical Implications:

After 120 PAs (~1 month), a hitter's K% and BB% are reasonably trustworthy indicators of their approach. If a career 25% strikeout hitter suddenly shows an 18% K-rate through 150 PAs, a meaningful part of that change is likely real improvement.

However, BABIP needs more than a full season of data to stabilize (820 PA against roughly 600 PA in a full season). A .380 BABIP through 200 PAs tells us almost nothing; it is probably luck. Even over a full season, BABIP remains heavily influenced by factors beyond the hitter's control (defense, park, luck).

This is why modern metrics like wOBA stabilize faster than batting average—they combine multiple components with different stabilization rates and weight them appropriately.
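
One way to act on these stabilization points is to regress an observed rate toward the league average. The sketch below assumes a simple shrinkage model in which reliability = PA / (PA + stabilization PA), so a stat is weighted 50/50 with the league mean exactly at its stabilization point. That model and the helper name regress_to_mean are illustrative assumptions rather than Carleton's exact method; the league rates and stabilization points are the ones quoted in this section.

```python
def regress_to_mean(observed, pa, league_mean, stabilization_pa):
    """Shrink an observed rate toward the league mean.

    Assumes reliability = pa / (pa + stabilization_pa), which equals 0.5
    when pa matches the stabilization point.
    """
    reliability = pa / (pa + stabilization_pa)
    estimate = reliability * observed + (1 - reliability) * league_mean
    return estimate, reliability


# 18% strikeout rate through 150 PA (K% stabilizes around 60 PA)
k_estimate, k_reliability = regress_to_mean(0.180, 150, 0.227, 60)

# .380 BABIP through 200 PA (BABIP stabilizes around 820 PA)
babip_estimate, babip_reliability = regress_to_mean(0.380, 200, 0.295, 820)

print(f"K%:    estimate {k_estimate:.3f}, reliability {k_reliability:.2f}")
print(f"BABIP: estimate {babip_estimate:.3f}, reliability {babip_reliability:.2f}")
```

Under these assumptions the strikeout sample keeps most of its signal (about 71% weight on the observed rate), while the BABIP sample is pulled most of the way back to league average (about 20% weight on the observed rate).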



5.3 Linear Weights

5.3.1 The Concept: Each Event Has a Run Value

Linear weights form the mathematical foundation of modern run estimation. The concept is elegant: determine the average number of runs each type of event (single, walk, home run, out, etc.) contributes to scoring.

The Methodology:

Researchers analyzed thousands of games to calculate run expectancy matrices—how many runs a team scores on average from each base-out state. For example:


  • Bases empty, 0 outs: 0.481 runs expected (on average) for the rest of the inning
  • Runner on first, 0 outs: 0.859 runs expected
  • Bases loaded, 1 out: 1.447 runs expected

When a single moves the state from "bases empty, 0 outs" to "runner on first, 0 outs," the change in run expectancy is:
0.859 - 0.481 = 0.378 runs

Aggregate across millions of plate appearances, and we derive the average run value for each event type.
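
The run-expectancy arithmetic above can be sketched in a few lines. The lookup table holds only the three base-out states quoted in the text (a full matrix has 24), and the state labels and the helper play_run_value are illustrative names. The run value of a play is the change in run expectancy plus any runs that scored on the play.

```python
# Run expectancy for the base-out states quoted in the text: (bases, outs)
RUN_EXPECTANCY = {
    ("empty", 0): 0.481,
    ("1st", 0): 0.859,
    ("loaded", 1): 1.447,
}


def play_run_value(before, after, runs_scored=0):
    """Change in run expectancy plus runs that scored on the play."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored


# Leadoff single: bases empty, 0 outs -> runner on first, 0 outs
single_value = play_run_value(("empty", 0), ("1st", 0))

# Leadoff home run: the base-out state is unchanged, but one run scores
homer_value = play_run_value(("empty", 0), ("empty", 0), runs_scored=1)

print(f"Leadoff single:   {single_value:.3f} runs")
print(f"Leadoff home run: {homer_value:.3f} runs")
```

Averaging values like these over every situation in which an event occurs is what produces the single per-event weights in the next section.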

Why Linear?

These weights are "linear" because we simply multiply each event by its weight and add them up. If a player hits 30 doubles, we multiply 30 × 0.78 (the run value of a double). This linearity makes calculations straightforward and transparent.

5.3.2 Run Values by Event (2023 Values)

The FanGraphs Guts table provides annually updated run values based on each season's actual data. Here are the 2023 values:

2023 Run Values Table:

Event | Run Value | Explanation
Home Run (HR) | 1.396 | Most valuable; instant run + no outs consumed
Triple (3B) | 1.070 | Very rare; runner on third with no outs typically scores
Double (2B) | 0.776 | Strong extra-base hit; runner in scoring position
Single (1B) | 0.475 | Gets on base; modest run value
Walk (BB) | 0.320 | Reaches base but doesn't advance runners as far
Hit-by-Pitch (HBP) | 0.323 | Similar value to walk
Stolen Base (SB) | 0.175 | Advances one base; moderate value
Caught Stealing (CS) | -0.467 | Loses baserunner; significant negative
Out (non-K) | -0.299 | Consumes an out; can sometimes advance runners
Strikeout (K) | -0.301 | Slightly worse than normal out; no ball in play

Important Notes:

These values change slightly each year based on the league offensive environment. In higher-scoring environments, each event becomes slightly more valuable. The FanGraphs "Guts" page publishes updated values annually.

5.3.3 Calculating Linear Weights with Code

Let's calculate linear weights-based runs created in both Python and R, using the 2023 constants.

Python Implementation:

import pybaseball as pyb
import pandas as pd
from pybaseball import batting_stats

# Enable cache to speed up repeated queries
pyb.cache.enable()

# 2023 run values, copied by hand from the table above.
# pybaseball does not appear to ship a FanGraphs Guts helper, so the
# constants are hard-coded here; update them from the Guts page as needed.
wBB = 0.320   # Walk
wHBP = 0.323  # Hit by pitch
w1B = 0.475   # Single
w2B = 0.776   # Double
w3B = 1.070   # Triple
wHR = 1.396   # Home run

print("2023 Linear Weights:")
print(f"Walk: {wBB:.3f}")
print(f"Single: {w1B:.3f}")
print(f"Double: {w2B:.3f}")
print(f"Triple: {w3B:.3f}")
print(f"Home Run: {wHR:.3f}")

# Get 2023 batting data for qualified hitters
batting_2023 = batting_stats(2023, qual=502)  # 502 PA = qualified

# Calculate linear weights runs for each player
# First, separate hits into singles
batting_2023['1B'] = (batting_2023['H'] - batting_2023['2B'] -
                      batting_2023['3B'] - batting_2023['HR'])

# Calculate linear weights runs created (above zero baseline)
batting_2023['LW_Runs'] = (
    batting_2023['BB'] * wBB +
    batting_2023['HBP'] * wHBP +
    batting_2023['1B'] * w1B +
    batting_2023['2B'] * w2B +
    batting_2023['3B'] * w3B +
    batting_2023['HR'] * wHR
)

# Show top performers by linear weights
top_lw = batting_2023.nlargest(10, 'LW_Runs')[
    ['Name', 'Team', 'PA', 'HR', '2B', 'BB', 'LW_Runs']
]
print("\nTop 10 Hitters by Linear Weights Runs (2023):")
print(top_lw.to_string(index=False))

# Compare to a single player - Ronald Acuña Jr.
# Match on a substring, since FanGraphs spells the name with "ñ" and "Jr."
acuna = batting_2023[batting_2023['Name'].str.contains('Acu', na=False)]
if not acuna.empty:
    print(f"\nRonald Acuña Jr. 2023 Linear Weights Breakdown:")
    print(f"BB: {acuna['BB'].values[0]} × {wBB:.3f} = {acuna['BB'].values[0] * wBB:.1f} runs")
    print(f"1B: {acuna['1B'].values[0]} × {w1B:.3f} = {acuna['1B'].values[0] * w1B:.1f} runs")
    print(f"2B: {acuna['2B'].values[0]} × {w2B:.3f} = {acuna['2B'].values[0] * w2B:.1f} runs")
    print(f"3B: {acuna['3B'].values[0]} × {w3B:.3f} = {acuna['3B'].values[0] * w3B:.1f} runs")
    print(f"HR: {acuna['HR'].values[0]} × {wHR:.3f} = {acuna['HR'].values[0] * wHR:.1f} runs")
    print(f"Total: {acuna['LW_Runs'].values[0]:.1f} runs")

R Implementation:

library(baseballr)
library(dplyr)

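# Get FanGraphs constants. fg_guts() takes no arguments and returns every
# season, so filter to 2023.
guts_2023 <- fg_guts() %>% filter(season == 2023)

# Guts publishes event weights on the wOBA scale. Dividing by the wOBA scale
# converts them to runs relative to an out, so they run higher than run
# values measured relative to average.
woba_scale <- guts_2023$woba_scale
wBB <- guts_2023$wBB / woba_scale
wHBP <- guts_2023$wHBP / woba_scale
w1B <- guts_2023$w1B / woba_scale
w2B <- guts_2023$w2B / woba_scale
w3B <- guts_2023$w3B / woba_scale
wHR <- guts_2023$wHR / woba_scale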

cat("2023 Linear Weights:\n")
cat(sprintf("Walk: %.3f\n", wBB))
cat(sprintf("Single: %.3f\n", w1B))
cat(sprintf("Double: %.3f\n", w2B))
cat(sprintf("Triple: %.3f\n", w3B))
cat(sprintf("Home Run: %.3f\n", wHR))

# Get 2023 qualified batting data
batting_2023 <- fg_batter_leaders(2023, 2023, qual = 502)

# Calculate singles (hits minus extra-base hits)
batting_2023 <- batting_2023 %>%
  mutate(
    Singles = H - (`2B` + `3B` + HR),
    LW_Runs = BB * wBB + HBP * wHBP + Singles * w1B +
              `2B` * w2B + `3B` * w3B + HR * wHR
  )

# Top performers
top_lw <- batting_2023 %>%
  select(Name, Team, PA, HR, `2B`, BB, LW_Runs) %>%
  arrange(desc(LW_Runs)) %>%
  head(10)

print("Top 10 Hitters by Linear Weights Runs (2023):")
print(top_lw)

# Ronald Acuña Jr. breakdown
acuna <- batting_2023 %>% filter(grepl("Acuña", Name))

if(nrow(acuna) > 0) {
  cat("\nRonald Acuña Jr. 2023 Linear Weights Breakdown:\n")
  cat(sprintf("BB: %d × %.3f = %.1f runs\n",
              acuna$BB, wBB, acuna$BB * wBB))
  cat(sprintf("1B: %d × %.3f = %.1f runs\n",
              acuna$Singles, w1B, acuna$Singles * w1B))
  cat(sprintf("2B: %d × %.3f = %.1f runs\n",
              acuna$`2B`, w2B, acuna$`2B` * w2B))
  cat(sprintf("3B: %d × %.3f = %.1f runs\n",
              acuna$`3B`, w3B, acuna$`3B` * w3B))
  cat(sprintf("HR: %d × %.3f = %.1f runs\n",
              acuna$HR, wHR, acuna$HR * wHR))
  cat(sprintf("Total: %.1f runs\n", acuna$LW_Runs))
}

This linear weights foundation directly leads to wOBA, our next topic.



5.4 Weighted On-Base Average (wOBA)

5.4.1 Why wOBA Over OPS? Proper Weighting, Scaled to OBP

For years, OPS (On-Base Plus Slugging) served as the go-to "advanced" batting statistic. It's simple: add OBP and SLG. But OPS has fatal flaws:

Problems with OPS:

  1. Wrong Weights: A walk (worth ~0.32 runs) counts the same as a single (~0.48 runs) in the OBP component
  2. Illogical Math: Adding OBP and SLG adds percentages with different denominators (PA vs. AB)
  3. Arbitrary Scale: The resulting number has no intuitive meaning
  4. Wrong Ratio: OPS weights OBP and SLG equally (50/50), but research shows OBP matters more (roughly 60/40)

wOBA's Advantages:

Weighted On-Base Average (wOBA) solves these problems:

  1. Correct Weights: Uses linear weights to properly value each event
  2. Scaled to OBP: A .350 wOBA means roughly the same as .350 OBP—intuitive
  3. Better Correlation: Correlates more strongly with runs scored than OPS
  4. Theoretically Sound: Based on actual run values, not arbitrary combinations

The wOBA Formula:

wOBA = (wBB×BB + wHBP×HBP + w1B×1B + w2B×2B + w3B×3B + wHR×HR) / PA

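Where the weights (wBB, w1B, etc.) come from linear weights, scaled to OBP-like values. PA is a simplification here: the FanGraphs version uses AB + BB - IBB + SF + HBP as the denominator and counts only unintentional walks in the numerator.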
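
A minimal sketch of how run values become wOBA weights, and how the formula is then applied. It assumes the standard construction: re-base each run value so that an out is worth zero, then multiply by the wOBA scale. The run values and the out value are the rounded figures from the 2023 table in section 5.3.2, the scale of 1.216 is the approximate figure quoted in this chapter, and the hitter is hypothetical.

```python
# Run values relative to average, copied from the 2023 table in section 5.3.2
RUN_VALUES = {"BB": 0.320, "HBP": 0.323, "1B": 0.475,
              "2B": 0.776, "3B": 1.070, "HR": 1.396}
OUT_VALUE = -0.299   # run value of a non-strikeout out (same table)
WOBA_SCALE = 1.216   # approximate 2023 wOBA scale quoted in this chapter


def woba_weights(run_values, out_value, scale):
    """Re-base each run value so an out is worth zero, then apply the scale."""
    return {event: (value - out_value) * scale
            for event, value in run_values.items()}


def woba(counts, weights, pa):
    """Weighted sum of events divided by plate appearances."""
    return sum(weights[event] * counts[event] for event in weights) / pa


weights = woba_weights(RUN_VALUES, OUT_VALUE, WOBA_SCALE)

# Hypothetical hitter with 620 plate appearances
counts = {"BB": 60, "HBP": 5, "1B": 92, "2B": 30, "3B": 3, "HR": 25}
hitter_woba = woba(counts, weights, pa=620)

print({event: round(w, 3) for event, w in weights.items()})
print(f"wOBA: {hitter_woba:.3f}")
```

Because the inputs are rounded values from the chapter's table, the derived weights will not match the published FanGraphs weights exactly; the point is the re-base-then-scale step, not the specific decimals.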

5.4.2 wOBA Scale

Understanding what constitutes good, average, or excellent performance in wOBA:

wOBA Performance Scale (2023 Context):

Rating | wOBA Range | Description | 2023 Example Players
Excellent | .400+ | Elite; MVP-caliber offense | Ronald Acuña (.435), Mookie Betts (.405)
Great | .360-.399 | All-Star level; premium hitter | Freddie Freeman (.393), Corey Seager (.366)
Above Average | .330-.359 | Solid regular; positive contributor | Bo Bichette (.338), Yandy Díaz (.351)
Average | .310-.329 | League average hitter | League Average: .321
Below Average | .290-.309 | Replacement to below-average | Byron Buxton (.295), Amed Rosario (.293)
Poor | .270-.289 | Well below average; liability | Defensive specialists
Awful | <.270 | Should not be batting regularly | Extreme defensive players only

League Context:

  • 2023 MLB Average: .321 wOBA
  • 2019 (high offense): .322 wOBA
  • 2014 (low offense): .314 wOBA

The scale remains relatively consistent because wOBA is designed to track with league OBP.

5.4.3 From wOBA to wRAA with Code

While wOBA is a rate stat (per PA), we often want to know total runs above average. This is wRAA (weighted Runs Above Average):

wRAA = ((wOBA - league_wOBA) / wOBA_scale) × PA

The wOBAscale factor converts wOBA points to runs. In 2023, wOBAscale was approximately 1.216.

Python Implementation:

import pybaseball as pyb
import pandas as pd
from pybaseball import batting_stats

pyb.cache.enable()

# Get 2023 data
batting_2023 = batting_stats(2023, qual=502)

# wOBA scale and league-average wOBA, hard-coded from the values quoted in
# this chapter (pybaseball does not appear to ship a FanGraphs Guts helper).
# Confirm them against the Guts page before relying on the output.
woba_scale = 1.216
lg_woba = 0.321

print(f"2023 wOBA Scale: {woba_scale:.3f}")
print(f"2023 League wOBA: {lg_woba:.3f}")

# wOBA is already calculated in FanGraphs data
# Calculate wRAA manually
batting_2023['wRAA_manual'] = (
    (batting_2023['wOBA'] - lg_woba) / woba_scale * batting_2023['PA']
)

# Compare to FanGraphs wRAA (should be very close)
comparison = batting_2023[['Name', 'PA', 'wOBA', 'wRAA', 'wRAA_manual']].head(10)
print("\nwRAA Calculation Comparison:")
print(comparison.to_string(index=False))

# Top performers by wRAA
top_wraa = batting_2023.nlargest(15, 'wRAA')[
    ['Name', 'Team', 'PA', 'wOBA', 'wRAA', 'WAR']
]
print("\nTop 15 Hitters by wRAA (2023):")
print(top_wraa.to_string(index=False))

# Interpret a player's wRAA
print("\nInterpreting wRAA:")
print("wRAA = 0: League average hitter")
print("wRAA = +20: 20 runs better than average over the season")
print("wRAA = -10: 10 runs worse than average")
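# Report one player's wRAA from the data rather than hard-coding a figure
acuna = batting_2023[batting_2023['Name'].str.contains('Acu', na=False)]
if not acuna.empty:
    print(f"\nRonald Acuña Jr.'s wRAA: {acuna['wRAA'].values[0]:.1f} runs above what")
    print("an average hitter would create with the same number of plate appearances.")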

R Implementation:

library(baseballr)
library(dplyr)

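# Get 2023 constants and batting data
# fg_guts() takes no arguments and returns every season, so filter to 2023
guts_2023 <- fg_guts() %>% filter(season == 2023)
batting_2023 <- fg_batter_leaders(2023, 2023, qual = 502)

# Extract scaling factors (baseballr names these columns woba_scale and lg_woba)
woba_scale <- guts_2023$woba_scale
lg_woba <- guts_2023$lg_woba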

cat(sprintf("2023 wOBA Scale: %.3f\n", woba_scale))
cat(sprintf("2023 League wOBA: %.3f\n", lg_woba))

# Calculate wRAA manually
batting_2023 <- batting_2023 %>%
  mutate(
    wRAA_manual = ((wOBA - lg_woba) / woba_scale) * PA
  )

# Top performers
top_wraa <- batting_2023 %>%
  select(Name, Team, PA, wOBA, wRAA, WAR) %>%
  arrange(desc(wRAA)) %>%
  head(15)

cat("\nTop 15 Hitters by wRAA (2023):\n")
print(top_wraa)

# Interpretation
cat("\nInterpreting wRAA:\n")
cat("wRAA = 0: League average hitter\n")
cat("wRAA = +20: 20 runs better than average over the season\n")
cat("wRAA = -10: 10 runs worse than average\n")

Key Insight: wRAA is cumulative (more PA = more runs), while wOBA is a rate stat. A hitter with .360 wOBA over 700 PA contributes more total runs than someone with .360 wOBA over 400 PA.
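
The rate-versus-cumulative point can be checked directly with the wRAA formula. This sketch uses the league wOBA (.321) and wOBA scale (1.216) quoted in this chapter as defaults; the two hitters are hypothetical.

```python
def wraa(woba, pa, lg_woba=0.321, woba_scale=1.216):
    """Weighted Runs Above Average: ((wOBA - lg wOBA) / scale) * PA."""
    return (woba - lg_woba) / woba_scale * pa


# Same .360 wOBA, different playing time
full_time = wraa(0.360, 700)
part_time = wraa(0.360, 400)

print(f"700 PA: {full_time:+.1f} runs above average")
print(f"400 PA: {part_time:+.1f} runs above average")
```

The rate is identical, but the 700-PA hitter banks about 22 runs above average compared with about 13 for the 400-PA hitter.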



5.5 Weighted Runs Created Plus (wRC+)

5.5.1 Why Use wRC+? Park/League/Era Adjusted, 100 = Average

While wRAA tells us runs above average, wRC+ provides a scaled, adjusted metric that's even more intuitive:

wRC+ Advantages:

  1. Park Adjusted: Neutralizes Coors Field inflation or Oracle Park suppression
  2. League Adjusted: Accounts for AL/NL differences and year-to-year changes
  3. Era Adjusted: Enables comparisons across different offensive environments
  4. Intuitive Scale: 100 = league average, 150 = 50% better than average

The Formula:

wRC+ = (((wRAA/PA + league_R/PA) + (league_R/PA - park_factor×league_R/PA)) /
        (AL or NL wRC/PA excluding pitchers)) × 100

In practice, it's simpler: wRC+ adjusts wRAA for park and league, then scales it so 100 = average.
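
The formula above can be written out term by term. In this sketch the league runs per PA (0.118) is an illustrative placeholder, not a published figure, and park_factor is assumed to be expressed so that 1.00 is a neutral park. The league wRC/PA is set equal to league R/PA for simplicity, which is also an assumption.

```python
def wrc_plus(wraa, pa, lg_r_pa, park_factor, lg_wrc_pa):
    """wRC+ following the formula in the text, one term at a time."""
    player_rate = wraa / pa + lg_r_pa                  # runs created per PA
    park_adjustment = lg_r_pa - park_factor * lg_r_pa  # zero in a neutral park
    return (player_rate + park_adjustment) / lg_wrc_pa * 100


LG_R_PA = 0.118  # illustrative league runs per plate appearance

# A league-average hitter in a neutral park lands exactly on 100
average_hitter = wrc_plus(wraa=0.0, pa=600, lg_r_pa=LG_R_PA,
                          park_factor=1.00, lg_wrc_pa=LG_R_PA)

# The same above-average production in a neutral vs. hitter-friendly park
neutral_park = wrc_plus(wraa=22.45, pa=700, lg_r_pa=LG_R_PA,
                        park_factor=1.00, lg_wrc_pa=LG_R_PA)
hitter_park = wrc_plus(wraa=22.45, pa=700, lg_r_pa=LG_R_PA,
                       park_factor=1.10, lg_wrc_pa=LG_R_PA)

print(f"Average hitter, neutral park: {average_hitter:.0f}")
print(f"+22 wRAA, neutral park:       {neutral_park:.0f}")
print(f"+22 wRAA, hitter's park:      {hitter_park:.0f}")
```

The park term subtracts credit when the home park inflates scoring, so identical raw production earns a lower wRC+ in a hitter-friendly park.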

5.5.2 Interpreting wRC+

wRC+ Performance Scale:

wRC+ Range | Rating | Interpretation | Historical Comparison
160+ | MVP-caliber | Elite; historically great | Barry Bonds (2001-04), Mike Trout (2012-19)
140-159 | Excellent | Perennial All-Star | Mookie Betts, Aaron Judge, Ronald Acuña
120-139 | Above Average | Quality regular; borderline All-Star | Most starting OF/1B on playoff teams
110-119 | Slightly Above Avg | Solid everyday player | Typical #5-6 hitter
90-109 | Average | League average hitter | Typical starting position player
80-89 | Below Average | Below average; needs defense | Defensive specialists
70-79 | Poor | Significant offensive liability | Backup catcher, elite defender
<70 | Awful | Should not bat regularly | Pitchers (NL, before the 2022 universal DH), extreme cases

2023 wRC+ Leaders:

  • Ronald Acuña Jr.: 173 wRC+ (73% better than average)
  • Mookie Betts: 158 wRC+
  • Freddie Freeman: 156 wRC+
  • Corey Seager: 151 wRC+

Cross-Era Comparison:

wRC+ enables us to compare players across eras:


  • Babe Ruth's 1923: 239 wRC+ (139% better than average)
  • Barry Bonds' 2002: 244 wRC+
  • Ronald Acuña Jr.'s 2023: 173 wRC+
  • Average MLB player: 100 wRC+

This tells us that despite vastly different raw statistics, we can objectively compare Ruth, Bonds, and Acuña.

5.5.3 Calculating wRC+ with Code

Python Implementation:

import pybaseball as pyb
import pandas as pd
from pybaseball import batting_stats

pyb.cache.enable()

# Get 2023 batting data
batting_2023 = batting_stats(2023, qual=502)

# wRC+ is already calculated by FanGraphs
# Let's examine it alongside traditional stats

# Top performers
top_wrcplus = batting_2023.nlargest(20, 'wRC+')[
    ['Name', 'Team', 'PA', 'AVG', 'OBP', 'SLG', 'wOBA', 'wRC+', 'WAR']
]
print("Top 20 Hitters by wRC+ (2023):")
print(top_wrcplus.to_string(index=False))

# Compare wRC+ to traditional stats
print("\nwRC+ vs. Traditional Stats:")
print(f"Correlation with AVG: {batting_2023['wRC+'].corr(batting_2023['AVG']):.3f}")
print(f"Correlation with OPS: {batting_2023['wRC+'].corr(batting_2023['OPS']):.3f}")
print(f"Correlation with wOBA: {batting_2023['wRC+'].corr(batting_2023['wOBA']):.3f}")

# Create performance tiers
def wrcplus_tier(wrc):
    if wrc >= 160: return "MVP-caliber"
    elif wrc >= 140: return "Excellent"
    elif wrc >= 120: return "Above Average"
    elif wrc >= 110: return "Slightly Above Avg"
    elif wrc >= 90: return "Average"
    elif wrc >= 80: return "Below Average"
    elif wrc >= 70: return "Poor"
    else: return "Awful"

batting_2023['Tier'] = batting_2023['wRC+'].apply(wrcplus_tier)

# Count players in each tier
tier_counts = batting_2023['Tier'].value_counts()
print("\nQualified Hitters by wRC+ Tier (2023):")
print(tier_counts)

# Show some interesting cases
print("\nInteresting Cases:")

# High AVG, low wRC+ (singles hitters, no walks)
low_power = batting_2023[(batting_2023['AVG'] > .280) &
                         (batting_2023['wRC+'] < 110)][
    ['Name', 'AVG', 'OBP', 'SLG', 'BB%', 'ISO', 'wRC+']
].head(5)
print("\nHigh AVG but Low wRC+ (empty batting average):")
print(low_power.to_string(index=False))

# Low AVG, high wRC+ (power + walks)
high_power = batting_2023[(batting_2023['AVG'] < .240) &
                          (batting_2023['wRC+'] > 120)][
    ['Name', 'AVG', 'OBP', 'SLG', 'BB%', 'HR', 'wRC+']
].head(5)
print("\nLow AVG but High wRC+ (power + patience):")
print(high_power.to_string(index=False))

R Implementation:

library(baseballr)
library(dplyr)

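# Get 2023 batting data
# Column names can differ across baseballr versions (for example, `wRC+` and
# `BB%` may appear as wRC_plus and BB_pct); check names(batting_2023) first
batting_2023 <- fg_batter_leaders(2023, 2023, qual = 502)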

# Top performers by wRC+
top_wrcplus <- batting_2023 %>%
  select(Name, Team, PA, AVG, OBP, SLG, wOBA, `wRC+`, WAR) %>%
  arrange(desc(`wRC+`)) %>%
  head(20)

cat("Top 20 Hitters by wRC+ (2023):\n")
print(top_wrcplus)

# Correlations
cat("\nwRC+ vs. Traditional Stats:\n")
cat(sprintf("Correlation with AVG: %.3f\n",
            cor(batting_2023$`wRC+`, batting_2023$AVG)))
cat(sprintf("Correlation with OPS: %.3f\n",
            cor(batting_2023$`wRC+`, batting_2023$OPS)))

# Create performance tiers
batting_2023 <- batting_2023 %>%
  mutate(
    Tier = case_when(
      `wRC+` >= 160 ~ "MVP-caliber",
      `wRC+` >= 140 ~ "Excellent",
      `wRC+` >= 120 ~ "Above Average",
      `wRC+` >= 110 ~ "Slightly Above Avg",
      `wRC+` >= 90 ~ "Average",
      `wRC+` >= 80 ~ "Below Average",
      `wRC+` >= 70 ~ "Poor",
      TRUE ~ "Awful"
    )
  )

# Count by tier
tier_counts <- batting_2023 %>%
  group_by(Tier) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count))

cat("\nQualified Hitters by wRC+ Tier (2023):\n")
print(tier_counts)

# Interesting cases
cat("\nHigh AVG but Low wRC+ (empty batting average):\n")
low_power <- batting_2023 %>%
  filter(AVG > .280, `wRC+` < 110) %>%
  select(Name, AVG, OBP, SLG, `BB%`, ISO, `wRC+`) %>%
  head(5)
print(low_power)

cat("\nLow AVG but High wRC+ (power + patience):\n")
high_power <- batting_2023 %>%
  filter(AVG < .240, `wRC+` > 120) %>%
  select(Name, AVG, OBP, SLG, `BB%`, HR, `wRC+`) %>%
  head(5)
print(high_power)

Python
import pybaseball as pyb
import pandas as pd
from pybaseball import batting_stats

pyb.cache.enable()

# Get 2023 batting data
batting_2023 = batting_stats(2023, qual=502)

# wRC+ is already calculated by FanGraphs
# Let's examine it and create visualizations

# Top performers
top_wrcplus = batting_2023.nlargest(20, 'wRC+')[
    ['Name', 'Team', 'PA', 'AVG', 'OBP', 'SLG', 'wOBA', 'wRC+', 'WAR']
]
print("Top 20 Hitters by wRC+ (2023):")
print(top_wrcplus.to_string(index=False))

# Compare wRC+ to traditional stats
print("\nwRC+ vs. Traditional Stats:")
print(f"Correlation with AVG: {batting_2023['wRC+'].corr(batting_2023['AVG']):.3f}")
print(f"Correlation with OPS: {batting_2023['wRC+'].corr(batting_2023['OPS']):.3f}")
print(f"Correlation with wOBA: {batting_2023['wRC+'].corr(batting_2023['wOBA']):.3f}")

# Create performance tiers
def wrcplus_tier(wrc):
    if wrc >= 160: return "MVP-caliber"
    elif wrc >= 140: return "Excellent"
    elif wrc >= 120: return "Above Average"
    elif wrc >= 110: return "Slightly Above Avg"
    elif wrc >= 90: return "Average"
    elif wrc >= 80: return "Below Average"
    elif wrc >= 70: return "Poor"
    else: return "Awful"

batting_2023['Tier'] = batting_2023['wRC+'].apply(wrcplus_tier)

# Count players in each tier
tier_counts = batting_2023['Tier'].value_counts()
print("\nQualified Hitters by wRC+ Tier (2023):")
print(tier_counts)

# Show some interesting cases
print("\nInteresting Cases:")

# High AVG, low wRC+ (singles hitters, no walks)
low_power = batting_2023[(batting_2023['AVG'] > .280) &
                         (batting_2023['wRC+'] < 110)][
    ['Name', 'AVG', 'OBP', 'SLG', 'BB%', 'ISO', 'wRC+']
].head(5)
print("\nHigh AVG but Low wRC+ (empty batting average):")
print(low_power.to_string(index=False))

# Low AVG, high wRC+ (power + walks)
high_power = batting_2023[(batting_2023['AVG'] < .240) &
                          (batting_2023['wRC+'] > 120)][
    ['Name', 'AVG', 'OBP', 'SLG', 'BB%', 'HR', 'wRC+']
].head(5)
print("\nLow AVG but High wRC+ (power + patience):")
print(high_power.to_string(index=False))

5.6 Pitching Statistics

5.6.1 The Problem with ERA: Defense/Sequencing/Park Dependent

ERA (Earned Run Average) has been baseball's primary pitching metric for over a century. Yet it suffers from significant flaws that make it unreliable for evaluating pitcher performance:

ERA's Problems:

  1. Defense Dependent: A pitcher with excellent defenders behind him benefits enormously. A ground ball to shortstop is an out with Dansby Swanson, possibly a hit with a worse defender.
  2. Sequencing Luck: Two pitchers can allow the same hits and walks but wildly different ERAs based on when those events occur. Hits scattered across innings = low ERA. Hits clustered in one inning = high ERA.
  3. BABIP Variance: Pitchers have limited control over batting average on balls in play (typically regresses to ~.300). A .270 BABIP pitcher likely got lucky; .340 BABIP likely got unlucky.
  4. Park Effects: Pitching in Coors Field inflates ERA; pitching in Oracle Park suppresses it.
  5. Sample Size: ERA fluctuates wildly in small samples. A three-run blowup in one start dramatically affects a pitcher's ERA for weeks.

Example:

Consider two pitchers over 100 batters faced:


  • Pitcher A: Allows 10 hits, 5 walks, 1 HR, scattered across innings: 3.60 ERA

  • Pitcher B: Allows 10 hits, 5 walks, 1 HR, clustered in one bad inning: 5.40 ERA

Same underlying performance, 1.80 ERA difference due to sequencing luck.
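
The sequencing effect can be illustrated with a toy simulation. The sketch below is not a real run-expectancy model: it assumes every single advances each runner exactly one base, walks only force runners, and nothing else (errors, steals, extra-base hits other than home runs) ever happens. The `runs_scored` helper and both event lists are hypothetical, chosen so that the two pitchers allow an identical set of events in a different order.

```python
def runs_scored(innings):
    """Count runs under a deliberately crude base-advancement model."""
    total = 0
    for events in innings:
        bases = [False, False, False]  # first, second, third
        outs = 0
        for event in events:
            if event == "O":
                outs += 1
                if outs == 3:
                    break
            elif event == "HR":
                total += 1 + sum(bases)  # batter plus everyone on base
                bases = [False, False, False]
            elif event == "1B":
                total += int(bases[2])  # runner on third scores
                bases = [True, bases[0], bases[1]]
            elif event == "BB":
                if all(bases):
                    total += 1  # bases loaded: the walk forces in a run
                elif bases[0] and bases[1]:
                    bases[2] = True
                elif bases[0]:
                    bases[1] = True
                bases[0] = True
    return total


# Same events (3 singles, 1 walk, 1 home run, 15 outs), different order
scattered = [
    ["1B", "O", "O", "O"],
    ["BB", "O", "O", "O"],
    ["1B", "O", "O", "O"],
    ["HR", "O", "O", "O"],
    ["1B", "O", "O", "O"],
]
clustered = [["1B", "BB", "1B", "HR", "1B", "O", "O", "O"]] + [
    ["O", "O", "O"] for _ in range(4)
]

for label, innings in (("Scattered", scattered), ("Clustered", clustered)):
    runs = runs_scored(innings)
    era = 9 * runs / len(innings)
    print(f"{label}: {runs} runs in {len(innings)} innings, ERA {era:.2f}")
```

Under these assumptions the scattered pitcher allows one run and the clustered pitcher allows four, even though their hit, walk, and home run totals are identical.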

This is why sabermetrics developed fielding-independent pitching metrics.

5.6.2 FIP (Fielding Independent Pitching)

FIP isolates the outcomes most directly under the pitcher's control: strikeouts, walks, hit-by-pitches, and home runs. Because balls in play are left out entirely, most of the influence of defense and sequencing luck drops out of the equation.

The FIP Formula:

FIP = ((13×HR + 3×BB + 3×HBP - 2×K) / IP) + FIP_constant

The FIP constant (around 3.10, recalculated each season) is set so that league-average FIP equals league-average ERA, which puts FIP on the familiar ERA scale.
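
As a quick check on the arithmetic, the formula can be written as a small function. This is a minimal sketch: the stat line is made up for illustration, the default constant of 3.10 is the approximate value quoted above rather than any specific season's figure, and `innings_from_notation` is a hypothetical helper that assumes innings are recorded in box-score notation (180.1 meaning 180 and one-third).

```python
def innings_from_notation(ip):
    """Convert box-score innings (180.1 = 180 1/3, 180.2 = 180 2/3) to true innings."""
    whole = int(ip)
    thirds = round((ip - whole) * 10)  # the digit after the decimal point counts outs
    return whole + thirds / 3


def fip(hr, bb, hbp, k, ip, fip_constant=3.10):
    """FIP from counting stats; ip must be true innings, not box-score notation."""
    return (13 * hr + 3 * bb + 3 * hbp - 2 * k) / ip + fip_constant


# Hypothetical season: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=innings_from_notation(200.0))
print(f"Example FIP: {example_fip:.3f}")
```

Here the numerator is 260 + 150 + 15 - 400 = 25, so the pitcher's raw component is 25 / 200 = 0.125 runs per inning before the constant is added.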

Why These Events?

  • Home Runs: Pitcher controls these completely; no defense involved
  • Walks/HBP: Pitcher controls these completely
  • Strikeouts: Pitcher controls these completely; defense irrelevant

Coefficients Explained:

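The coefficients (13, 3, 3, 2) come from linear weights analysis and reflect the relative run value of each event:


  • A home run costs roughly 13/3 ≈ 4.3× as many runs as a walk

  • Walks and HBP have similar value

  • A strikeout saves roughly two-thirds as many runs as a walk costs, which is why it enters the formula with a negative sign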

Python Implementation:

import pybaseball as pyb
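from pybaseball import pitching_stats
import pandas as pd

pyb.cache.enable()

# Get 2023 pitching data
pitching_2023 = pitching_stats(2023, qual=100)  # 100 IP minimum

# Assumption: pybaseball has no helper for the FanGraphs "Guts!" constants
# (fg_guts is a baseballr function), so we estimate the FIP constant from
# this sample. The published constant uses league-wide totals, so expect a
# small difference.
# By definition: cFIP = lgERA - (13*lgHR + 3*(lgBB + lgHBP) - 2*lgK) / lgIP
# Note: IP may be in box-score notation (180.1 = 180 1/3), which makes the
# totals very slightly approximate.
totals = pitching_2023[['ER', 'HR', 'BB', 'HBP', 'SO', 'IP']].sum()
lg_era = 9 * totals['ER'] / totals['IP']
fip_constant = lg_era - (
    13 * totals['HR'] + 3 * (totals['BB'] + totals['HBP']) - 2 * totals['SO']
) / totals['IP']
print(f"Estimated 2023 FIP Constant: {fip_constant:.2f}")

# Calculate FIP manually (should land close to the FanGraphs value)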
pitching_2023['FIP_manual'] = (
    ((13 * pitching_2023['HR'] +
      3 * (pitching_2023['BB'] + pitching_2023['HBP']) -
      2 * pitching_2023['SO']) / pitching_2023['IP']) + fip_constant
)

# Compare ERA to FIP
pitching_2023['ERA_FIP_diff'] = pitching_2023['ERA'] - pitching_2023['FIP']

# Show pitchers who outperformed FIP (likely lucky)
lucky_pitchers = pitching_2023.nsmallest(10, 'ERA_FIP_diff')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'ERA_FIP_diff', 'BABIP', 'LOB%']
]
print("\nTop 10 Pitchers: ERA Much Better Than FIP (Likely Lucky):")
print(lucky_pitchers.to_string(index=False))

# Show pitchers who underperformed FIP (likely unlucky)
unlucky_pitchers = pitching_2023.nlargest(10, 'ERA_FIP_diff')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'ERA_FIP_diff', 'BABIP', 'LOB%']
]
print("\nTop 10 Pitchers: ERA Much Worse Than FIP (Likely Unlucky):")
print(unlucky_pitchers.to_string(index=False))

# Best pitchers by FIP
top_fip = pitching_2023.nsmallest(15, 'FIP')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'K/9', 'BB/9', 'HR/9', 'WAR']
]
print("\nTop 15 Pitchers by FIP (2023):")
print(top_fip.to_string(index=False))

# FIP Scale interpretation
print("\nFIP Scale (similar to ERA):")
print("< 3.00: Excellent")
print("3.00-3.75: Above Average")
print("3.75-4.00: Average")
print("4.00-4.50: Below Average")
print("> 4.50: Poor")

R Implementation:

library(baseballr)
library(dplyr)

# Get 2023 pitching data
pitching_2023 <- fg_pitcher_leaders(2023, 2023, qual = 100)
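
# fg_guts() takes no season argument and returns every season,
# so keep only the 2023 row (assumes the table has a `season` column)
guts_2023 <- fg_guts() %>% filter(season == 2023)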

# Get FIP constant
fip_constant <- guts_2023$cFIP

cat(sprintf("2023 FIP Constant: %.2f\n", fip_constant))

# Calculate ERA-FIP difference
pitching_2023 <- pitching_2023 %>%
  mutate(ERA_FIP_diff = ERA - FIP)

# Lucky pitchers (ERA better than FIP)
lucky_pitchers <- pitching_2023 %>%
  arrange(ERA_FIP_diff) %>%
  select(Name, Team, IP, ERA, FIP, ERA_FIP_diff, BABIP, `LOB%`) %>%
  head(10)

cat("\nTop 10 Pitchers: ERA Much Better Than FIP (Likely Lucky):\n")
print(lucky_pitchers)

# Unlucky pitchers (ERA worse than FIP)
unlucky_pitchers <- pitching_2023 %>%
  arrange(desc(ERA_FIP_diff)) %>%
  select(Name, Team, IP, ERA, FIP, ERA_FIP_diff, BABIP, `LOB%`) %>%
  head(10)

cat("\nTop 10 Pitchers: ERA Much Worse Than FIP (Likely Unlucky):\n")
print(unlucky_pitchers)

# Best by FIP
top_fip <- pitching_2023 %>%
  arrange(FIP) %>%
  select(Name, Team, IP, ERA, FIP, `K/9`, `BB/9`, `HR/9`, WAR) %>%
  head(15)

cat("\nTop 15 Pitchers by FIP (2023):\n")
print(top_fip)

Interpretation:

When ERA significantly differs from FIP:


  • ERA < FIP: Pitcher benefited from good defense, sequencing luck, or low BABIP

  • ERA > FIP: Pitcher suffered from poor defense, bad luck, or high BABIP

  • ERA ≈ FIP: Pitcher's results match their skills
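
These three cases can be turned into a simple screening rule. The sketch below is illustrative only: the `era_fip_signal` name and the 0.50-run tolerance are assumptions, not an established cutoff, and a real analysis would also look at BABIP, LOB%, and team defense before calling a gap luck.

```python
def era_fip_signal(era, fip, tolerance=0.50):
    """Label the ERA-FIP gap using a hypothetical tolerance in runs per nine."""
    gap = era - fip
    if gap < -tolerance:
        return "ERA < FIP: likely helped by defense, sequencing, or low BABIP"
    if gap > tolerance:
        return "ERA > FIP: likely hurt by defense, bad luck, or high BABIP"
    return "ERA ≈ FIP: results roughly match skills"


# Made-up stat lines covering each case
for era, fip_value in [(3.00, 4.00), (4.50, 3.60), (3.80, 3.70)]:
    print(f"ERA {era:.2f}, FIP {fip_value:.2f} -> {era_fip_signal(era, fip_value)}")
```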

5.6.3 xFIP (Expected FIP)

FIP improves on ERA, but home runs have high variance. In any single season, a pitcher can allow noticeably more or fewer home runs than his fly ball rate would predict. xFIP addresses this by normalizing home run rate.

The xFIP Formula:

xFIP = ((13×FlyBalls×league_HR/FB% + 3×BB + 3×HBP - 2×K) / IP) + FIP_constant

Instead of actual home runs, xFIP uses: FlyBalls × league-average HR/FB%

Why Normalize HR/FB%?

Research shows that HR/FB% regresses heavily to league average (~13.5% in recent years). A pitcher with 18% HR/FB% likely got unlucky; one with 9% HR/FB% likely got lucky. xFIP predicts future ERA better than FIP by assuming HR/FB% normalizes.
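
A small worked example makes the normalization concrete. This is a sketch with made-up stat lines: the 13.5% league rate and the 3.10 constant are the approximate figures quoted in the text, and the `fip` and `xfip` helper names are chosen here for illustration. Two pitchers with identical strikeouts, walks, and fly balls differ only in how many of those fly balls left the park.

```python
def fip(hr, bb, hbp, k, ip, fip_constant=3.10):
    """FIP using actual home runs allowed."""
    return (13 * hr + 3 * bb + 3 * hbp - 2 * k) / ip + fip_constant


def xfip(fly_balls, bb, hbp, k, ip, lg_hr_per_fb=0.135, fip_constant=3.10):
    """xFIP: swap actual home runs for fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * bb + 3 * hbp - 2 * k) / ip + fip_constant


# Identical hypothetical pitchers except for home runs on 200 fly balls
shared = dict(bb=50, hbp=5, k=200, ip=200.0)
fip_low_hr = fip(hr=18, **shared)    # 9% HR/FB
fip_high_hr = fip(hr=36, **shared)   # 18% HR/FB
xfip_both = xfip(fly_balls=200, **shared)

print(f"FIP at 9% HR/FB:  {fip_low_hr:.2f}")
print(f"FIP at 18% HR/FB: {fip_high_hr:.2f}")
print(f"xFIP for both:    {xfip_both:.2f}")
```

The two FIP values sit more than a run apart, while xFIP gives both pitchers the same figure because it credits each with 200 × 0.135 = 27 expected home runs.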

Python Implementation:

import pybaseball as pyb
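from pybaseball import pitching_stats

pyb.cache.enable()

# Get 2023 data
pitching_2023 = pitching_stats(2023, qual=100)

# League-average HR/FB%. The FanGraphs "Guts!" table has no HR/FB column, so
# we approximate it with the unweighted mean of this sample (a rough stand-in
# for the true league rate; assumes 'HR/FB' is stored as a fraction)
lg_hrfb = pitching_2023['HR/FB'].mean()
print(f"2023 HR/FB% (100+ IP sample): {lg_hrfb:.1%}")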

# xFIP is already in the data, but let's compare FIP vs xFIP
pitching_2023['FIP_xFIP_diff'] = pitching_2023['FIP'] - pitching_2023['xFIP']

# Pitchers with FIP much better than xFIP (allowing fewer HRs than expected)
lucky_hr = pitching_2023.nsmallest(10, 'FIP_xFIP_diff')[
    ['Name', 'Team', 'IP', 'FIP', 'xFIP', 'HR/9', 'HR/FB', 'FB%']
]
print("\nPitchers with FIP Better Than xFIP (Low HR/FB%):")
print(lucky_hr.to_string(index=False))

# Pitchers with FIP much worse than xFIP (allowing more HRs than expected)
unlucky_hr = pitching_2023.nlargest(10, 'FIP_xFIP_diff')[
    ['Name', 'Team', 'IP', 'FIP', 'xFIP', 'HR/9', 'HR/FB', 'FB%']
]
print("\nPitchers with FIP Worse Than xFIP (High HR/FB%):")
print(unlucky_hr.to_string(index=False))

# Compare ERA, FIP, and xFIP correlations
print("\nCorrelation with Next Year's ERA:")
print("(Note: This requires multi-year data to calculate properly)")
print("Generally: xFIP > FIP > ERA for predictive accuracy")

# Best pitchers by xFIP
top_xfip = pitching_2023.nsmallest(15, 'xFIP')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'xFIP', 'K/9', 'BB/9', 'WAR']
]
print("\nTop 15 Pitchers by xFIP (2023):")
print(top_xfip.to_string(index=False))

R Implementation:

library(baseballr)
library(dplyr)

# Get 2023 data
pitching_2023 <- fg_pitcher_leaders(2023, 2023, qual = 100)
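
# The "Guts!" table has no HR/FB column, so approximate the league rate with
# the unweighted mean of this sample (assumes HR/FB is stored as a fraction)
lg_hrfb <- mean(pitching_2023$`HR/FB`, na.rm = TRUE)
cat(sprintf("2023 HR/FB%% (100+ IP sample): %.1f%%\n", lg_hrfb * 100))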

# Calculate differences
pitching_2023 <- pitching_2023 %>%
  mutate(FIP_xFIP_diff = FIP - xFIP)

# Low HR/FB% pitchers
lucky_hr <- pitching_2023 %>%
  arrange(FIP_xFIP_diff) %>%
  select(Name, Team, IP, FIP, xFIP, `HR/9`, `HR/FB`, `FB%`) %>%
  head(10)

cat("\nPitchers with FIP Better Than xFIP (Low HR/FB%):\n")
print(lucky_hr)

# High HR/FB% pitchers
unlucky_hr <- pitching_2023 %>%
  arrange(desc(FIP_xFIP_diff)) %>%
  select(Name, Team, IP, FIP, xFIP, `HR/9`, `HR/FB`, `FB%`) %>%
  head(10)

cat("\nPitchers with FIP Worse Than xFIP (High HR/FB%):\n")
print(unlucky_hr)

# Best by xFIP
top_xfip <- pitching_2023 %>%
  arrange(xFIP) %>%
  select(Name, Team, IP, ERA, FIP, xFIP, `K/9`, `BB/9`, WAR) %>%
  head(15)

cat("\nTop 15 Pitchers by xFIP (2023):\n")
print(top_xfip)

5.6.4 SIERA (Skill-Interactive ERA)

While FIP and xFIP provide valuable insights, they treat all strikeouts, walks, and balls in play equally. SIERA (Skill-Interactive ERA) recognizes that context matters:

SIERA Innovations:

  1. Interaction Effects: A pitcher with high strikeouts AND low walks is more valuable than the sum of parts
  2. Batted Ball Type: Ground balls and fly balls are valued differently
  3. Context Sensitivity: SIERA adjusts for the relationship between different skills

Why SIERA Matters:

  • A pitcher who strikes out 30% of batters faces fewer balls in play, reducing opportunities for bad luck
  • Ground ball pitchers benefit from lower HR rates even without high strikeout rates
  • The combination of skills matters more than individual components

SIERA Formula:

The formula is complex (involving multiple interaction terms), so we rely on FanGraphs' calculation:

from pybaseball import pitching_stats

# SIERA is provided by FanGraphs
pitching_2023 = pitching_stats(2023, qual=100)

# Compare SIERA to other metrics
comparison = pitching_2023[['Name', 'IP', 'ERA', 'FIP', 'xFIP', 'SIERA', 'K%', 'BB%', 'GB%']].head(20)
print(comparison)

SIERA Scale: Similar to ERA (3.00 = excellent, 4.50 = poor)

5.6.5 Pitching Stat Comparison Table

ERA vs. FIP vs. xFIP vs. SIERA:

| Metric | What It Measures | Strengths | Weaknesses | Best Use Case |
|--------|------------------|-----------|------------|---------------|
| ERA | Actual earned runs allowed | Describes what happened | Defense/luck/park dependent | Historical record |
| FIP | Expected runs based on K/BB/HR | Removes defense and sequencing | Treats all HR equally | Isolating pitcher skill |
| xFIP | Expected runs with normalized HR/FB | Better predictor than FIP | May overcorrect for true HR talent | Projecting future ERA |
| SIERA | Skill-interactive expected runs | Accounts for skill interactions | Complex calculation | Overall skill evaluation |

Predictive Accuracy Ranking (for next year's ERA):

  1. SIERA (best predictor)
  2. xFIP
  3. FIP
  4. ERA (worst predictor)

Practical Example:

Blake Snell, 2023 (NL Cy Young winner):


  • ERA: 2.25 (excellent results)

  • FIP: 3.44 (good, but not elite)

  • xFIP: 3.86 (above average, not great)

  • SIERA: 3.52 (solid, but elevated)

Interpretation: Snell had a fantastic ERA, but FIP/xFIP/SIERA suggest he benefited from good fortune (low BABIP, excellent sequencing). Expect some regression in 2024.


5.7 Wins Above Replacement (WAR)

5.7.1 The Concept: All-in-One Value Metric, Replacement Level

WAR (Wins Above Replacement) attempts to answer the ultimate question: "How many wins does this player add to his team compared to a readily available replacement?"

Core Concepts:

Replacement Level: A freely available minor league call-up or waiver claim. Approximately:


  • Position players: ~20 runs (about 2 wins) below average per 600 PA; a team made up entirely of replacement-level players would play at roughly a .300 winning percentage

  • Pitchers: ~20 runs (about 2 wins) below average per 200 IP

Components: WAR aggregates:


  • Batting value (wRAA for hitters)

  • Baserunning value

  • Fielding/positional value

  • Pitching value (for pitchers)

  • Adjustments for league, park, and position

The Scale:


  • 0 WAR = Replacement level (should be DFA'd or sent down)

  • 2 WAR = Average regular

  • 5 WAR = All-Star

  • 8+ WAR = MVP candidate
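
The scale can be expressed as a simple lookup. In this sketch the four reference points above are treated as lower bounds of each band, which is an interpretive assumption; the `war_label` function name and the wording of the in-between labels are illustrative.

```python
def war_label(war):
    """Map a single-season WAR to the reference points on the scale above."""
    if war >= 8:
        return "MVP candidate"
    if war >= 5:
        return "All-Star"
    if war >= 2:
        return "Average regular or better"
    if war >= 0:
        return "Between replacement level and average"
    return "Below replacement level"


for value in (8.3, 5.0, 2.4, 0.6, -0.5):
    print(f"{value:>4.1f} WAR -> {war_label(value)}")
```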

5.7.2 WAR Versions Comparison

Three major WAR implementations exist, each with different methodologies:

WAR Versions Table:

| Version | Organization | Batting | Baserunning | Fielding | Pitching | Adjustments |
|---------|--------------|---------|-------------|----------|----------|-------------|
| fWAR | FanGraphs | wRAA (built on wOBA) | BsR (baserunning runs) | UZR (Ultimate Zone Rating) | FIP-based | Park, league, position |
| bWAR | Baseball Reference | Runs created | BsR alternative | DRS (Defensive Runs Saved) | RA9-based (runs allowed) | Park, league, position |
| WARP | Baseball Prospectus | DRC+ (Deserved Runs Created) | BRR (Baserunning Runs) | FRAA (Fielding Runs Above Average) | DRA (Deserved Run Average) | Park, league, catcher framing |

Key Differences:

  • Fielding: UZR vs. DRS vs. FRAA all measure defense differently
  • Pitching: fWAR uses FIP (skills), bWAR uses actual runs allowed (results)
  • Scaling: Minor differences in replacement level and run-to-wins conversion

Which to Use?

  • fWAR: Best for isolating player skill (defense-independent pitching)
  • bWAR: Best for describing what actually happened (actual runs allowed)
  • WARP: Innovative approaches (catcher framing), but less commonly used

For most purposes, fWAR and bWAR are within 0.5-1.0 wins of each other. The choice matters less than understanding what each measures.

5.7.3 Components of Position Player WAR (fWAR)

Let's break down fWAR for position players step-by-step:

fWAR Formula:

fWAR = (Batting + Base Running + Fielding + Positional Adjustment + League Adjustment + Replacement Level) / Runs Per Win

Components Explained:

  1. Batting Runs (wRAA): Weighted runs above average from hitting (covered in section 5.4.3)
  2. Base Running Runs (BsR): Value from stolen bases, taking extra bases, avoiding double plays
  3. Fielding Runs (UZR or DRS): Defensive value above/below average at their position
  4. Positional Adjustment: Adjusts for positional difficulty
     • Catcher: +12.5 runs/162 games (hardest)
     • Shortstop: +7.5 runs
     • Center Field: +2.5 runs
     • First Base: -12.5 runs (easiest)
  5. League Adjustment: AL/NL differences (minimal in recent years)
  6. Replacement Level: ~20 runs below average per 600 PA
  7. Runs Per Win: Approximately 10 runs = 1 win
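
The components above can be combined in a few lines. This is a simplified sketch, not FanGraphs' exact calculation: the `position_player_war` function and both stat lines are hypothetical, the positional adjustment is prorated by games played, the replacement credit by plate appearances, and runs per win is fixed at 10.

```python
def position_player_war(batting, baserunning, fielding, pos_adj_per_162,
                        games, pa, league_adj=0.0,
                        repl_per_600=20.0, runs_per_win=10.0):
    """Sum run components and convert to wins (all inputs in runs except games/pa)."""
    positional = pos_adj_per_162 * games / 162   # prorate by playing time
    replacement = repl_per_600 * pa / 600        # credit for being above replacement
    total_runs = (batting + baserunning + fielding
                  + positional + league_adj + replacement)
    return total_runs / runs_per_win


# Two hypothetical players with identical bats, legs, and gloves
shortstop = position_player_war(batting=15, baserunning=2, fielding=5,
                                pos_adj_per_162=7.5, games=162, pa=600)
first_baseman = position_player_war(batting=15, baserunning=2, fielding=5,
                                    pos_adj_per_162=-12.5, games=162, pa=600)

print(f"Shortstop:     {shortstop:.2f} WAR")
print(f"First baseman: {first_baseman:.2f} WAR")
```

With everything else equal, the 20-run gap between the shortstop and first base adjustments is worth two full wins over a season.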

Python Implementation:

import pybaseball as pyb
from pybaseball import batting_stats
import pandas as pd

pyb.cache.enable()

# Get 2023 batting data
batting_2023 = batting_stats(2023, qual=502)

# FanGraphs provides the components
# Let's examine them for top players

war_breakdown = batting_2023.nlargest(15, 'WAR')[
    ['Name', 'Team', 'PA', 'wRAA', 'BsR', 'Def', 'WAR']
]

print("Top 15 Position Players by WAR - Component Breakdown:")
print(war_breakdown.to_string(index=False))

# Manual WAR calculation (simplified)
# Note: This is approximate; actual fWAR has more adjustments

# Positional adjustments (runs per 162 games, FanGraphs values)
pos_adj = {
    'C': 12.5, 'SS': 7.5, '2B': 2.5, 'CF': 2.5, '3B': 2.5,
    'LF': -7.5, 'RF': -7.5, '1B': -12.5, 'DH': -17.5
}

# Example: Estimate WAR manually for Ronald Acuña Jr.
# Match on a name fragment so the lookup works with or without the
# accent and the "Jr." suffix
acuna = batting_2023[batting_2023['Name'].str.contains('Acu', na=False)].iloc[0]

# FanGraphs' Def column already includes the positional adjustment,
# so it is not added a second time here
replacement_runs = 20 * acuna['PA'] / 600
runs_above_repl = acuna['wRAA'] + acuna['BsR'] + acuna['Def'] + replacement_runs

print(f"\nRonald Acuña Jr. 2023 WAR Breakdown:")
print(f"Batting (wRAA): {acuna['wRAA']:.1f} runs")
print(f"Baserunning (BsR): {acuna['BsR']:.1f} runs")
print(f"Fielding + Position (Def): {acuna['Def']:.1f} runs")
print(f"Replacement Level: ~+{replacement_runs:.1f} runs (vs. replacement)")
print(f"\nTotal Runs Above Replacement: ~{runs_above_repl:.0f} runs")
print(f"Runs Per Win: ~10")
print(f"Estimated WAR: ~{runs_above_repl / 10:.1f} wins")
print(f"Actual fWAR: {acuna['WAR']:.1f} wins")

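# WAR leaders by position
# Note: in FanGraphs exports 'Pos' can hold positional-adjustment runs rather
# than a position label, so this only runs when the column contains strings
print("\n2023 WAR Leaders by Position:")
if 'Pos' in batting_2023.columns and pd.api.types.is_string_dtype(batting_2023['Pos']):
    for pos in ['C', '1B', '2B', '3B', 'SS', 'OF']:
        pos_leaders = batting_2023[batting_2023['Pos'].str.contains(pos, na=False)].nlargest(3, 'WAR')
        print(f"\n{pos}:")
        for idx, player in pos_leaders.iterrows():
            print(f"  {player['Name']}: {player['WAR']:.1f} WAR")
else:
    print("  (no position labels available in this dataset)")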

R Implementation:

library(baseballr)
library(dplyr)

# Get 2023 batting data
batting_2023 <- fg_batter_leaders(2023, 2023, qual = 502)

# WAR component breakdown
war_breakdown <- batting_2023 %>%
  select(Name, Team, PA, wRAA, BsR, Def, WAR) %>%
  arrange(desc(WAR)) %>%
  head(15)

cat("Top 15 Position Players by WAR - Component Breakdown:\n")
print(war_breakdown)

# Examine Ronald Acuña Jr.
acuna <- batting_2023 %>% filter(grepl("Acuña", Name))

if(nrow(acuna) > 0) {
  cat("\nRonald Acuña Jr. 2023 WAR Breakdown:\n")
  cat(sprintf("Batting (wRAA): %.1f runs\n", acuna$wRAA))
  cat(sprintf("Baserunning (BsR): %.1f runs\n", acuna$BsR))
  cat(sprintf("Fielding (Def): %.1f runs\n", acuna$Def))
  cat(sprintf("Actual fWAR: %.1f wins\n", acuna$WAR))
}

# WAR leaders by position
cat("\n2023 WAR Leaders by Position:\n")
positions <- c("C", "1B", "2B", "3B", "SS", "OF")

for(pos in positions) {
  cat(sprintf("\n%s:\n", pos))
  pos_leaders <- batting_2023 %>%
    filter(grepl(pos, Pos)) %>%
    arrange(desc(WAR)) %>%
    head(3) %>%
    select(Name, WAR)
  print(pos_leaders)
}

5.7.4 Components of Pitcher WAR

Pitcher WAR is conceptually simpler than position player WAR:

Pitcher fWAR Formula:

fWAR = (FIP_Runs - Replacement_Level_Runs) / Runs_Per_Win

Where:


  • FIP_Runs: Runs prevented based on FIP compared to league average

  • Replacement Level: ~20 runs worse than average per 200 IP

  • Runs Per Win: ~10 runs = 1 win
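
A rough version of this calculation is sketched below. It is a simplification under stated assumptions: runs above average are taken as the gap between league FIP and the pitcher's FIP spread over his innings, replacement level is a flat 20 runs per 200 IP, and runs per win is fixed at 10. FanGraphs' actual method adds park adjustments and other refinements that are not modeled here, and the `pitcher_war` name and inputs are hypothetical.

```python
def pitcher_war(fip, lg_fip, ip, repl_per_200=20.0, runs_per_win=10.0):
    """Approximate FIP-based WAR from a pitcher's FIP, league FIP, and innings."""
    runs_above_avg = (lg_fip - fip) / 9 * ip   # FIP is on a per-nine-innings scale
    replacement = repl_per_200 * ip / 200      # average is ~20 runs above replacement
    return (runs_above_avg + replacement) / runs_per_win


# Hypothetical pitchers in a league with a 4.20 FIP
ace = pitcher_war(fip=3.00, lg_fip=4.20, ip=200)
league_average = pitcher_war(fip=4.20, lg_fip=4.20, ip=200)

print(f"Ace: {ace:.2f} WAR")
print(f"League-average starter: {league_average:.2f} WAR")
```

A league-average pitcher over 200 innings comes out at 2 WAR, which matches the "average regular" reference point on the WAR scale.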

Python Implementation:

import pybaseball as pyb
from pybaseball import pitching_stats

pyb.cache.enable()

# Get 2023 pitching data
pitching_2023 = pitching_stats(2023, qual=100)

# Top pitchers by WAR
war_leaders = pitching_2023.nlargest(20, 'WAR')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'K/9', 'BB/9', 'WAR']
]

print("Top 20 Pitchers by fWAR (2023):")
print(war_leaders.to_string(index=False))

# Compare fWAR (FIP-based) to bWAR (RA9-based) if available
# Note: bWAR is from Baseball Reference, not FanGraphs
# This comparison would require merging datasets

print("\nTop Pitchers Breakdown:")
for idx, pitcher in war_leaders.head(5).iterrows():
    print(f"\n{pitcher['Name']}:")
    print(f"  IP: {pitcher['IP']:.1f}")
    print(f"  ERA: {pitcher['ERA']:.2f}")
    print(f"  FIP: {pitcher['FIP']:.2f}")
    print(f"  K/9: {pitcher['K/9']:.2f}")
    print(f"  BB/9: {pitcher['BB/9']:.2f}")
    print(f"  fWAR: {pitcher['WAR']:.1f}")

R Implementation:

library(baseballr)
library(dplyr)

# Get 2023 pitching data
pitching_2023 <- fg_pitcher_leaders(2023, 2023, qual = 100)

# Top pitchers by WAR
war_leaders <- pitching_2023 %>%
  select(Name, Team, IP, ERA, FIP, `K/9`, `BB/9`, WAR) %>%
  arrange(desc(WAR)) %>%
  head(20)

cat("Top 20 Pitchers by fWAR (2023):\n")
print(war_leaders)

# Detailed breakdown
cat("\nTop Pitchers Breakdown:\n")
for(i in 1:5) {
  pitcher <- war_leaders[i, ]
  cat(sprintf("\n%s:\n", pitcher$Name))
  cat(sprintf("  IP: %.1f\n", pitcher$IP))
  cat(sprintf("  ERA: %.2f\n", pitcher$ERA))
  cat(sprintf("  FIP: %.2f\n", pitcher$FIP))
  cat(sprintf("  K/9: %.2f\n", pitcher$`K/9`))
  cat(sprintf("  BB/9: %.2f\n", pitcher$`BB/9`))
  cat(sprintf("  fWAR: %.1f\n", pitcher$WAR))
}

5.7.5 WAR Interpretation Table

WAR Value Scale:

| WAR Range | Player Quality | Description | Examples (Recent) |
|-----------|----------------|-------------|-------------------|
| 8+ | MVP | Historically great season | Ronald Acuña (8.3), Shohei Ohtani (9.0+) |
| 5-8 | Superstar | All-Star; franchise cornerstone | Mookie Betts (6.5), Aaron Judge (7.0) |
| 3-5 | All-Star | High-quality regular; All-Star caliber | Freddie Freeman (5.0), Marcus Semien (4.2) |
| 2-3 | Solid Starter | Above-average regular; valuable player | Most everyday starters |
| 1-2 | Role Player | Average to slightly below; replacement level + | Bench/platoon players |
| 0-1 | Bench | Replacement level; minimal value | Deep bench, frequent DFA |
| <0 | Below Replacement | Actively hurting the team | Should be in minors |

Dollar Value Approximation:

The free agent market values WAR at approximately $8-10 million per win (2023 values):


  • 1 WAR = ~$8M per year

  • 5 WAR = ~$40M per year (superstar contract)

  • 8 WAR = ~$64M per year (would be underpaid at any salary)
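
The same arithmetic can be used to compare production against pay. In this sketch the $8M-per-win rate is the low end of the range quoted above, and the salary figure is invented for illustration; `market_value` and `surplus_value` are hypothetical helper names.

```python
DOLLARS_PER_WAR = 8_000_000  # low end of the $8-10M range quoted above


def market_value(war, dollars_per_war=DOLLARS_PER_WAR):
    """Approximate free-agent market value of a season's production."""
    return war * dollars_per_war


def surplus_value(war, salary, dollars_per_war=DOLLARS_PER_WAR):
    """Positive means the team paid less than the market rate for the wins."""
    return market_value(war, dollars_per_war) - salary


# Hypothetical 5-WAR player earning $25M
value = market_value(5)
surplus = surplus_value(5, salary=25_000_000)
print(f"Market value: ${value / 1e6:.0f}M, surplus: ${surplus / 1e6:.0f}M")
```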

This helps evaluate contracts: Is a player worth their salary?

Cumulative WAR (Career Value):

  • 50+ WAR: Likely Hall of Famer
  • 30-50 WAR: Excellent career; borderline HOF
  • 20-30 WAR: Solid career
  • 10-20 WAR: Useful career

5.7.6 WAR Caveats and Limitations

While WAR is powerful, it has important limitations:

1. Uncertainty Margins:

WAR has an error bar of approximately ±1 win. A 5.0 WAR player and a 5.8 WAR player are essentially equivalent; the difference is within measurement error.

2. Defensive Metrics Vary:

UZR and DRS often disagree significantly on individual players. A player might have +10 UZR and +2 DRS in the same season, creating fWAR/bWAR discrepancies.

3. Context Missing:

WAR doesn't capture:


  • Clutch performance (winning vs. losing situations)

  • Leadership and clubhouse value

  • Playoff performance (separate metric: Championship WPA)

  • Injury risk and durability projections

4. Small Sample Noise:

Defense and baserunning metrics are particularly noisy in small samples. A 50-game stretch of WAR is heavily influenced by luck.

5. Pitching Philosophy Differences:

fWAR (FIP-based) vs. bWAR (actual runs) can differ by 2+ wins for some pitchers. Elite defenders behind a soft-contact pitcher will help bWAR but not fWAR.

Best Practices:

  • Use WAR for broad comparisons (5 WAR vs. 2 WAR is meaningful)
  • Don't over-index on small differences (5.2 vs. 5.8 WAR is noise)
  • Check multiple versions (fWAR, bWAR) and understand why they differ
  • Supplement with context (park factors, defensive support, health)
  • Use multi-year averages when possible (reduces noise)

R
fWAR = (Batting + Base Running + Fielding + Positional Adjustment + League Adjustment + Replacement Level) / Runs Per Win
R
library(baseballr)
library(dplyr)

# Get 2023 batting data
batting_2023 <- fg_batter_leaders(2023, 2023, qual = 502)

# WAR component breakdown
war_breakdown <- batting_2023 %>%
  select(Name, Team, PA, wRAA, BsR, Def, WAR) %>%
  arrange(desc(WAR)) %>%
  head(15)

cat("Top 15 Position Players by WAR - Component Breakdown:\n")
print(war_breakdown)

# Examine Ronald Acuña Jr.
acuna <- batting_2023 %>% filter(grepl("Acuña", Name))

if(nrow(acuna) > 0) {
  cat("\nRonald Acuña Jr. 2023 WAR Breakdown:\n")
  cat(sprintf("Batting (wRAA): %.1f runs\n", acuna$wRAA))
  cat(sprintf("Baserunning (BsR): %.1f runs\n", acuna$BsR))
  cat(sprintf("Fielding (Def): %.1f runs\n", acuna$Def))
  cat(sprintf("Actual fWAR: %.1f wins\n", acuna$WAR))
}

# WAR leaders by position
cat("\n2023 WAR Leaders by Position:\n")
positions <- c("C", "1B", "2B", "3B", "SS", "OF")

for(pos in positions) {
  cat(sprintf("\n%s:\n", pos))
  pos_leaders <- batting_2023 %>%
    filter(grepl(pos, Pos)) %>%
    arrange(desc(WAR)) %>%
    head(3) %>%
    select(Name, WAR)
  print(pos_leaders)
}
R
fWAR = (FIP_Runs - Replacement_Level_Runs) / Runs_Per_Win
R
library(baseballr)
library(dplyr)

# Get 2023 pitching data
pitching_2023 <- fg_pitcher_leaders(2023, 2023, qual = 100)

# Top pitchers by WAR
war_leaders <- pitching_2023 %>%
  select(Name, Team, IP, ERA, FIP, `K/9`, `BB/9`, WAR) %>%
  arrange(desc(WAR)) %>%
  head(20)

cat("Top 20 Pitchers by fWAR (2023):\n")
print(war_leaders)

# Detailed breakdown
cat("\nTop Pitchers Breakdown:\n")
for(i in 1:5) {
  pitcher <- war_leaders[i, ]
  cat(sprintf("\n%s:\n", pitcher$Name))
  cat(sprintf("  IP: %.1f\n", pitcher$IP))
  cat(sprintf("  ERA: %.2f\n", pitcher$ERA))
  cat(sprintf("  FIP: %.2f\n", pitcher$FIP))
  cat(sprintf("  K/9: %.2f\n", pitcher$`K/9`))
  cat(sprintf("  BB/9: %.2f\n", pitcher$`BB/9`))
  cat(sprintf("  fWAR: %.1f\n", pitcher$WAR))
}
Python
import pybaseball as pyb
from pybaseball import batting_stats
import pandas as pd

pyb.cache.enable()

# Get 2023 batting data
batting_2023 = batting_stats(2023, qual=502)

# FanGraphs provides the components
# Let's examine them for top players

war_breakdown = batting_2023.nlargest(15, 'WAR')[
    ['Name', 'Team', 'PA', 'wRAA', 'BsR', 'Def', 'WAR']
]

print("Top 15 Position Players by WAR - Component Breakdown:")
print(war_breakdown.to_string(index=False))

# Manual WAR calculation (simplified)
# Note: This is approximate; actual fWAR has more adjustments

# Positional adjustments (runs per 162 defensive games, FanGraphs scale)
pos_adj = {
    'C': 12.5, 'SS': 7.5, '2B': 2.5, 'CF': 2.5, '3B': 2.5,
    'LF': -7.5, 'RF': -7.5, '1B': -12.5, 'DH': -17.5
}

# Example: Calculate WAR manually for Ronald Acuña Jr.
# Match on a substring: the leaderboard name may include the accent and "Jr."
acuna = batting_2023[batting_2023['Name'].str.contains('Acu', na=False)].iloc[0]

print(f"\nRonald Acuña Jr. 2023 WAR Breakdown:")
print(f"Batting (wRAA): {acuna['wRAA']:.1f} runs")
print(f"Baserunning (BsR): {acuna['BsR']:.1f} runs")
print(f"Fielding (Def): {acuna['Def']:.1f} runs")
print(f"Position (RF): ~-7.5 runs (easier position)")
print(f"Replacement Level: ~+20 runs (vs. replacement)")
print(f"\nTotal Runs Above Replacement: ~80-85 runs")
print(f"Runs Per Win: ~10")
print(f"Estimated WAR: ~8.0-8.5 wins")
print(f"Actual fWAR: {acuna['WAR']:.1f} wins")

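# WAR leaders by position
# Note: not every leaderboard export includes a 'Pos' column, so check first
print("\n2023 WAR Leaders by Position:")
if 'Pos' in batting_2023.columns:
    for pos in ['C', '1B', '2B', '3B', 'SS', 'OF']:
        # Match whole position tokens so 'C' does not also match 'CF'
        mask = batting_2023['Pos'].str.contains(rf'\b{pos}\b', na=False)
        pos_leaders = batting_2023[mask].nlargest(3, 'WAR')
        print(f"\n{pos}:")
        for idx, player in pos_leaders.iterrows():
            print(f"  {player['Name']}: {player['WAR']:.1f} WAR")
else:
    print("  (no 'Pos' column in this dataset)")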
Python
import pybaseball as pyb
from pybaseball import pitching_stats

pyb.cache.enable()

# Get 2023 pitching data
pitching_2023 = pitching_stats(2023, qual=100)

# Top pitchers by WAR
war_leaders = pitching_2023.nlargest(20, 'WAR')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'K/9', 'BB/9', 'WAR']
]

print("Top 20 Pitchers by fWAR (2023):")
print(war_leaders.to_string(index=False))

# Compare fWAR (FIP-based) to bWAR (RA9-based) if available
# Note: bWAR is from Baseball Reference, not FanGraphs
# This comparison would require merging datasets

print("\nTop Pitchers Breakdown:")
for idx, pitcher in war_leaders.head(5).iterrows():
    print(f"\n{pitcher['Name']}:")
    print(f"  IP: {pitcher['IP']:.1f}")
    print(f"  ERA: {pitcher['ERA']:.2f}")
    print(f"  FIP: {pitcher['FIP']:.2f}")
    print(f"  K/9: {pitcher['K/9']:.2f}")
    print(f"  BB/9: {pitcher['BB/9']:.2f}")
    print(f"  fWAR: {pitcher['WAR']:.1f}")

5.8 Park Factors

5.8.1 Why Parks Matter: Dimensions, Altitude, Weather

Not all ballparks are created equal. Physical characteristics dramatically affect offense:

Coors Field (Colorado Rockies):


  • Altitude: About 5,200 feet above sea level, roughly a mile high

  • Effect: Thin air reduces drag on batted balls; curveballs break less

  • Impact: ~15% increase in run scoring

Oracle Park (San Francisco Giants):


  • Dimensions: Deep power alleys, high walls in right field

  • Weather: Cold, windy, marine layer suppresses fly balls

  • Impact: ~10% decrease in run scoring, especially left-handed power (the deep right-center alley and high right-field wall take away pull-side home runs)

Fenway Park (Boston Red Sox):


  • Feature: Green Monster (37-foot wall in left field)

  • Effect: Suppresses left field home runs, increases doubles

  • Impact: Helps right-handed hitters, hurts lefties

Other Factors:

  • Outfield dimensions: Short porches (Yankee Stadium RF) vs. vast outfields (Oracle Park)
  • Wall heights: High walls turn HRs into doubles
  • Climate: Humid air in Cincinnati; dry air in Arizona
  • Day/night: Wrigley Field's day game schedule affects visibility

5.8.2 Park Factor Methodology

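Park factors compare run scoring in a team's home games with run scoring in the same team's road games. Because the road schedule is spread across many parks, it approximates a neutral environment: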

Basic Park Factor Formula:

Park_Factor = [(Home_Runs_Scored + Home_Runs_Allowed) / Home_Games] / [(Road_Runs_Scored + Road_Runs_Allowed) / Road_Games]

Then multiply by 100 to put the ratio on an index scale:


  • 100 = Neutral park

  • 110 = 10% more runs than average

  • 90 = 10% fewer runs than average

Multi-Year Approach:

Single-season park factors are noisy. Best practice: Use 3-5 year averages to smooth variance.
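
As a rough sketch of the calculation, the function below computes runs per game at home and on the road, takes the ratio, and scales it to 100; a second helper averages several seasons. The run totals are invented, and the simple unweighted mean across seasons is an assumption (published park factors typically apply further regression and weighting).

```python
def park_factor(home_rs, home_ra, home_games, road_rs, road_ra, road_games):
    """Single-season run park factor on a 100 = neutral scale."""
    home_rate = (home_rs + home_ra) / home_games  # total runs per home game
    road_rate = (road_rs + road_ra) / road_games  # total runs per road game
    return 100.0 * home_rate / road_rate


def multi_year_park_factor(seasons):
    """Unweighted mean of single-season factors (a simplifying assumption)."""
    factors = [park_factor(**season) for season in seasons]
    return sum(factors) / len(factors)


# Hypothetical three-season sample for one team
seasons = [
    dict(home_rs=450, home_ra=430, home_games=81, road_rs=380, road_ra=420, road_games=81),
    dict(home_rs=470, home_ra=450, home_games=81, road_rs=390, road_ra=410, road_games=81),
    dict(home_rs=410, home_ra=430, home_games=81, road_rs=400, road_ra=400, road_games=81),
]

for year_index, season in enumerate(seasons, start=1):
    print(f"Season {year_index}: {park_factor(**season):.1f}")
print(f"Three-year average: {multi_year_park_factor(seasons):.1f}")
```

The single-season values swing between 105 and 115 even though the park itself is unchanged, which is the noise the multi-year average is meant to smooth.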

Handedness Splits:

Some parks favor left-handed or right-handed hitters differently. Advanced park factors account for this.

Python Implementation:

import pybaseball as pyb
from pybaseball import batting_stats
import pandas as pd

pyb.cache.enable()

# FanGraphs includes basic park factors in team data
# For detailed park factors, we can reference the FanGraphs park factors page
# or calculate manually from home/road splits

# Get team batting data with park factors
batting_2023 = batting_stats(2023, qual=1)

# Display park factor data (simplified example)
# Note: Actual park factors require multi-year data

# Example: Manual calculation for one team
# This would require home/road split data

print("2023 Park Factors (Approximate):")
print("Higher = hitter-friendly; Lower = pitcher-friendly")
print("\nNotable Parks:")
print("Coors Field (COL): 115 (very hitter-friendly)")
print("Great American Ball Park (CIN): 108")
print("Yankee Stadium (NYY): 103")
print("Average Park: 100")
print("Oracle Park (SF): 91 (pitcher-friendly)")
print("T-Mobile Park (SEA): 93")
print("loanDepot park (MIA): 94")

# For actual park factor data, refer to:
# FanGraphs Guts page: https://www.fangraphs.com/guts.aspx?type=pf
# Baseball Reference Park Factors: https://www.baseball-reference.com/leagues/majors/

R Implementation:

library(baseballr)

# Get park factors from FanGraphs
# Note: baseballr has limited park factor functionality
# Typically reference FanGraphs directly

cat("2023 MLB Park Factors (Approximate):\n")
cat("Scale: 100 = neutral, >100 = hitter-friendly, <100 = pitcher-friendly\n\n")

park_factors <- data.frame(
  Team = c("COL", "CIN", "TEX", "CHC", "NYY", "LAA",
           "MIA", "OAK", "SEA", "SF"),
  Park = c("Coors Field", "Great American", "Globe Life",
           "Wrigley", "Yankee Stadium", "Angel Stadium",
           "LoanDepot", "Oakland Coliseum", "T-Mobile", "Oracle"),
  Factor = c(115, 108, 104, 103, 103, 101, 94, 93, 93, 91),
  Type = c("Extreme Hitter", "Hitter", "Slight Hitter", "Neutral", "Neutral", "Neutral",
           "Pitcher", "Pitcher", "Pitcher", "Strong Pitcher")
)

print(park_factors)

cat("\nInterpretation:\n")
cat("- Playing 81 games at Coors adds ~15% to offensive stats\n")
cat("- Playing 81 games at Oracle subtracts ~9% from offensive stats\n")

5.8.3 Using Park Factors in Analysis

Adjusting Individual Stats:

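To approximately neutralize park effects on a player's statistics (a first-order adjustment, since a runs-based factor does not affect every statistic equally):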

Park_Adjusted_Stat = (Stat × 100) / Park_Factor

Example:

A Rockies hitter with .300 AVG, 30 HR, playing in a 115 park factor:


  • Adjusted AVG: .300 × (100/115) = .261 (significant adjustment)

  • Adjusted HR: 30 × (100/115) = 26 HR

Python Implementation:

import pandas as pd

# Example: Adjust Rockies hitters for park
coors_factor = 115

# Create sample data
rockies_hitters = pd.DataFrame({
    'Name': ['Ryan McMahon', 'Ezequiel Tovar', 'Charlie Blackmon'],
    'AVG': [.254, .268, .260],
    'HR': [20, 15, 17],
    'wOBA': [.330, .315, .325]
})

# Apply park adjustment
rockies_hitters['Adj_AVG'] = rockies_hitters['AVG'] * (100 / coors_factor)
rockies_hitters['Adj_HR'] = rockies_hitters['HR'] * (100 / coors_factor)
rockies_hitters['Adj_wOBA'] = rockies_hitters['wOBA'] * (100 / coors_factor)

print("Rockies Hitters - Park Adjusted (2023 Example):")
print(rockies_hitters.round(3))

print("\nNote: Advanced metrics like wRC+ already include park adjustments!")

Important: Metrics like wRC+, OPS+, ERA+, and WAR already include park adjustments. Don't double-adjust!

5.8.4 Current Park Effects Table (2023)

2023 MLB Park Factors (Runs):

| Rank | Team | Park | Factor | Effect |
|---|---|---|---|---|
| 1 | COL | Coors Field | 115 | Extreme hitter-friendly |
| 2 | CIN | Great American Ball Park | 108 | Hitter-friendly |
| 3 | TEX | Globe Life Field | 104 | Slight hitter-friendly |
| 4 | CHC | Wrigley Field | 103 | Neutral (slight hitter) |
| 5 | NYY | Yankee Stadium | 103 | Neutral (slight hitter) |
| 15 | LAD | Dodger Stadium | 100 | True neutral |
| 26 | MIA | LoanDepot Park | 94 | Pitcher-friendly |
| 27 | OAK | Oakland Coliseum | 93 | Pitcher-friendly |
| 28 | SEA | T-Mobile Park | 93 | Pitcher-friendly |
| 29 | SD | Petco Park | 92 | Pitcher-friendly |
| 30 | SF | Oracle Park | 91 | Very pitcher-friendly |

Specialized Effects:

Some parks have asymmetric effects:

  • Yankee Stadium: Very friendly to LHH (short porch in RF is their pull side), closer to neutral for RHH
  • Fenway Park: Green Monster helps RHH doubles, suppresses RHH HRs
  • Coors Field: Affects everything (hits, HR, strikeouts—curveballs don't break as much)


5.9 Run Expectancy and Win Probability

5.9.1 Run Expectancy Matrices

Run expectancy (RE) quantifies the value of each base-out state. How many runs does a team score on average from "runner on first, no outs" versus "bases loaded, two outs"?

2023 Run Expectancy Matrix:

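| Base State | 0 Outs | 1 Out | 2 Outs |
|---|---|---|---|
| Empty (---) | 0.481 | 0.254 | 0.098 |
| 1st (1--) | 0.859 | 0.513 | 0.213 |
| 2nd (-2-) | 1.100 | 0.664 | 0.315 |
| 3rd (--3) | 1.327 | 0.908 | 0.362 |
| 1st & 2nd (12-) | 1.437 | 0.897 | 0.430 |
| 1st & 3rd (1-3) | 1.784 | 1.171 | 0.489 |
| 2nd & 3rd (-23) | 1.964 | 1.352 | 0.592 |
| Loaded (123) | 2.254 | 1.541 | 0.736 |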

Interpretation:

  • Bases empty, 0 outs: Teams score 0.481 runs on average for the rest of the inning
  • Runner on 2nd, 0 outs: Teams score 1.100 runs on average
  • Bases loaded, 2 outs: Teams score 0.736 runs on average
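
Conceptually, each cell of the matrix is just an average: group every plate appearance by its base-out state, then average the runs scored from that point to the end of the inning. A minimal sketch with a handful of invented plate appearances follows (a real matrix would be built from a full season or more of play-by-play data):

```python
from collections import defaultdict

# Hypothetical plate appearances:
# (base_state, outs, runs scored from this PA through the end of the inning)
plays = [
    ("---", 0, 0), ("---", 0, 1), ("---", 0, 0), ("---", 0, 1),
    ("1--", 0, 2), ("1--", 0, 0), ("1--", 0, 1),
    ("123", 2, 0), ("123", 2, 3), ("123", 2, 0), ("123", 2, 0),
]


def build_re_matrix(plays):
    """Average rest-of-inning runs for each (base_state, outs) pair."""
    totals = defaultdict(lambda: [0, 0])  # state -> [total runs, number of PAs]
    for base_state, outs, runs_rest_of_inning in plays:
        cell = totals[(base_state, outs)]
        cell[0] += runs_rest_of_inning
        cell[1] += 1
    return {state: runs / count for state, (runs, count) in totals.items()}


re_estimates = build_re_matrix(plays)
for (base_state, outs), value in sorted(re_estimates.items()):
    print(f"{base_state}, {outs} outs: {value:.3f} expected runs")
```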

RE24 (Run Expectancy Based on 24 Base-Out States):

RE24 credits a batter/pitcher with the change in run expectancy from their plate appearance:

RE24 = RE_After - RE_Before + Runs_Scored

Example:

Runner on first, 0 outs. Batter hits a double, advancing runner to third. No runs score.

RE_Before = 0.859 (runner on 1st, 0 outs)
RE_After = 1.964 (runners on 2nd & 3rd, 0 outs)
Runs_Scored = 0
RE24 = 1.964 - 0.859 + 0 = +1.105 runs

The batter receives +1.105 runs of credit for improving the situation.

Python Implementation:

import pandas as pd
import numpy as np

# 2023 Run Expectancy Matrix
re_matrix = pd.DataFrame({
    '0 outs': [0.481, 0.859, 1.100, 1.327, 1.437, 1.784, 1.964, 2.254],
    '1 out':  [0.254, 0.513, 0.664, 0.908, 0.897, 1.171, 1.352, 1.541],
    '2 outs': [0.098, 0.213, 0.315, 0.362, 0.430, 0.489, 0.592, 0.736]
}, index=['---', '1--', '-2-', '--3', '12-', '1-3', '-23', '123'])

print("2023 Run Expectancy Matrix:")
print(re_matrix)

# Map an out count to its column label in re_matrix
OUT_COLUMNS = {0: '0 outs', 1: '1 out', 2: '2 outs'}

# Function to calculate RE24 for a play
def calculate_re24(state_before, outs_before, state_after, outs_after, runs_scored):
    """
    Calculate RE24 value for a plate appearance

    state_before/after: Base state like '1--', '12-', '---'
    outs_before/after: Number of outs (0, 1, 2; 3 once the inning has ended)
    runs_scored: Runs scored on the play
    """
    if outs_after >= 3:  # Inning ended, so no further runs are expected
        re_after = 0
    else:
        re_after = re_matrix.loc[state_after, OUT_COLUMNS[outs_after]]

    re_before = re_matrix.loc[state_before, OUT_COLUMNS[outs_before]]

    re24 = re_after - re_before + runs_scored
    return re24

# Example calculations
print("\n--- Example RE24 Calculations ---")

# Double with runner on 1st, 0 outs -> runners on 2nd & 3rd, 0 outs
re24_double = calculate_re24('1--', 0, '-23', 0, 0)
print(f"Double (runner 1st to 3rd): +{re24_double:.3f} runs")

# Home run with bases loaded, 1 out -> bases empty, 1 out, 4 runs
re24_grand_slam = calculate_re24('123', 1, '---', 1, 4)
print(f"Grand slam: +{re24_grand_slam:.3f} runs")

# Strikeout with runner on 2nd, 1 out -> runner on 2nd, 2 outs
re24_strikeout = calculate_re24('-2-', 1, '-2-', 2, 0)
print(f"Strikeout (runner on 2nd): {re24_strikeout:.3f} runs")

# Walk with bases loaded, 0 outs -> bases loaded, 0 outs, 1 run
re24_walk = calculate_re24('123', 0, '123', 0, 1)
print(f"Walk (bases loaded): +{re24_walk:.3f} runs")

# Ground ball double play, runner on 1st, 0 outs -> empty, 2 outs
re24_gidp = calculate_re24('1--', 0, '---', 2, 0)
print(f"Double play: {re24_gidp:.3f} runs")

R Implementation:

library(dplyr)

# 2023 Run Expectancy Matrix
# Column names start with a letter so data.frame() does not rename them
re_matrix <- data.frame(
  Base_State = c('---', '1--', '-2-', '--3', '12-', '1-3', '-23', '123'),
  outs_0 = c(0.481, 0.859, 1.100, 1.327, 1.437, 1.784, 1.964, 2.254),
  outs_1 = c(0.254, 0.513, 0.664, 0.908, 0.897, 1.171, 1.352, 1.541),
  outs_2 = c(0.098, 0.213, 0.315, 0.362, 0.430, 0.489, 0.592, 0.736)
)

cat("2023 Run Expectancy Matrix:\n")
print(re_matrix)

# Function to calculate RE24
calculate_re24 <- function(state_before, outs_before,
                          state_after, outs_after, runs_scored) {
  if(outs_after >= 3) {
    # Inning ended, so no further runs are expected
    re_after <- 0
  } else {
    re_after <- re_matrix[re_matrix$Base_State == state_after,
                         paste0("outs_", outs_after)]
  }

  re_before <- re_matrix[re_matrix$Base_State == state_before,
                        paste0("outs_", outs_before)]

  re24 <- re_after - re_before + runs_scored
  return(re24)
}

# Example calculations
cat("\n--- Example RE24 Calculations ---\n")
cat(sprintf("Double (runner 1st to 3rd): +%.3f runs\n",
            calculate_re24('1--', 0, '-23', 0, 0)))
cat(sprintf("Grand slam: +%.3f runs\n",
            calculate_re24('123', 1, '---', 1, 4)))
cat(sprintf("Strikeout (runner on 2nd): %.3f runs\n",
            calculate_re24('-2-', 1, '-2-', 2, 0)))

5.9.2 Leverage Index

Not all situations are created equal. A home run in a 10-0 game matters less than a home run in a tie game in the 9th inning. Leverage Index (LI) quantifies situation importance.

Leverage Index Scale:

  • 1.0 LI: Average leverage (typical early-inning situation)
  • 2.0 LI: High leverage (close game, late innings, runners on)
  • 3.0+ LI: Extreme leverage (9th inning, tie game, bases loaded)
  • 0.5 LI: Low leverage (blowout game)

How It's Calculated:

LI measures how much a plate appearance could change the game's outcome. Situations where the game could swing dramatically (tie game, 9th inning, bases loaded) have high leverage.
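
One common way to formalize this is to take the expected size of the win-probability swing in a given situation and divide it by the average swing across all situations. The sketch below follows that idea; the outcome probabilities, win-probability changes, and the `AVERAGE_SWING` value are all invented for illustration and are not taken from a real win-expectancy table.

```python
AVERAGE_SWING = 0.035  # assumed average absolute win-probability change per PA


def expected_swing(outcomes):
    """outcomes: list of (probability, win_probability_change) pairs."""
    return sum(prob * abs(delta) for prob, delta in outcomes)


def leverage_index(outcomes, average_swing=AVERAGE_SWING):
    """Expected swing in this situation relative to the average situation."""
    return expected_swing(outcomes) / average_swing


# Hypothetical outcome distributions: (probability, change in win probability)
tie_game_ninth = [(0.68, -0.08), (0.22, +0.10), (0.10, +0.30)]   # out, single/walk, extra-base hit
blowout_eighth = [(0.68, -0.004), (0.22, +0.005), (0.10, +0.02)]

li_tie = leverage_index(tie_game_ninth)
li_blowout = leverage_index(blowout_eighth)
print(f"Tie game, 9th inning: LI = {li_tie:.2f}")
print(f"Blowout, 8th inning:  LI = {li_blowout:.2f}")
```

The same batter with the same outcome probabilities produces very different LI values because the possible changes in win probability are so much larger in the close game.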

Average LI by Inning and Score Difference:

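| Inning | Tie | ±1 Run | ±2 Runs | ±3+ Runs |
|---|---|---|---|---|
| 1-6 | 0.9 | 0.7 | 0.4 | 0.2 |
| 7 | 1.2 | 1.0 | 0.6 | 0.3 |
| 8 | 1.6 | 1.3 | 0.8 | 0.4 |
| 9 | 2.0 | 1.8 | 1.0 | 0.5 |
| Extra | 2.5+ | 2.2 | 1.3 | 0.7 |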

WPA/LI (Context-Neutral Wins):

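Dividing each play's WPA (Win Probability Added, defined in the next section) by its LI and summing the results gives a context-neutral value: what would this performance be worth if every plate appearance came at average leverage?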
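
A minimal sketch of the arithmetic, using invented per-plate-appearance values. Note that the division happens play by play before summing; dividing a season WPA total by an average LI would give a different number.

```python
# Hypothetical plate appearances: (WPA, LI)
plate_appearances = [(+0.16, 2.0), (-0.02, 0.5), (+0.03, 1.0), (-0.06, 1.5)]


def context_neutral_wins(plate_appearances):
    """Sum of each play's WPA divided by that play's leverage index."""
    return sum(wpa / li for wpa, li in plate_appearances)


total_wpa = sum(wpa for wpa, _ in plate_appearances)
total_wpa_li = context_neutral_wins(plate_appearances)

print(f"Total WPA:    {total_wpa:+.3f}")
print(f"Total WPA/LI: {total_wpa_li:+.3f}")
```

Here most of the raw WPA comes from one high-leverage hit, so the context-neutral total is much smaller than the raw total.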

5.9.3 Win Probability Added (WPA)

WPA measures how much a player changed their team's probability of winning with each plate appearance.

Example:

  • Before: 7th inning, tie game, runner on 2nd, 1 out → 52% win probability
  • Event: Home run (2 runs score)
  • After: Leading by 2, 7th inning → 82% win probability
  • WPA: +30% (+0.30 WPA)

Season WPA Leaders:

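Top players accumulate roughly 4-6 WPA over a season, meaning the sum of their plate-appearance win-probability changes is worth about 4-6 wins. Because high-leverage plays move win probability the most, these totals depend heavily on when a player's hits happened, not only on how many there were.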

WPA vs. WAR:

  • WPA: Measures actual wins added in actual game situations (descriptive, context-dependent)
  • WAR: Measures wins above replacement in a neutral context (context-neutral, so it tracks underlying performance more closely)

A player can have high WPA (clutch hits in key moments) but average WAR (overall production not exceptional). Conversely, a player with high WAR might have low WPA if they performed poorly in high-leverage situations.

Python Example (Conceptual):

# WPA is calculated from play-by-play data
# The numbers below are hypothetical and only illustrate the per-PA arithmetic

import pandas as pd

# Hypothetical WPA calculation
plate_appearances = pd.DataFrame({
    'PA': [1, 2, 3, 4, 5],
    'Inning': [1, 3, 7, 8, 9],
    'WP_Before': [0.50, 0.48, 0.52, 0.45, 0.38],
    'WP_After': [0.52, 0.46, 0.68, 0.44, 0.41],
    'Event': ['Single', 'Strikeout', 'Home Run', 'Walk', 'Single']
})

plate_appearances['WPA'] = (plate_appearances['WP_After'] -
                            plate_appearances['WP_Before'])

print("Win Probability Added by Plate Appearance:")
print(plate_appearances)
print(f"\nTotal WPA: {plate_appearances['WPA'].sum():.3f}")

# The 7th inning home run added +16% win probability
# That single event contributed significantly to winning the game

Key Insight: WPA is descriptive and context-dependent (what happened, and when it happened); WAR strips out game context and so better reflects underlying performance. Both are valuable for different purposes.



5.10 Interactive Sabermetrics Dashboards

While static visualizations are useful for analysis, interactive dashboards enable deeper exploration of sabermetric data. Interactive plots let users hover over data points to see exact values, zoom into regions of interest, filter data dynamically, and explore relationships between metrics in real time. This section demonstrates how to create professional interactive visualizations using Plotly in both R and Python.

5.10.1 Interactive wOBA vs wRC+ Scatter Plot

The relationship between wOBA (weighted on-base average) and wRC+ (weighted runs created plus) is fundamental to understanding offensive production. wOBA measures raw offensive value while wRC+ adjusts for park and league factors. An interactive scatter plot allows us to explore individual player performance, identify outliers, and understand the linear relationship between these metrics.
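
To see why the scatter is close to a straight line, consider a simplified, park-neutral version of wRC+. The constants below (`LG_WOBA`, `WOBA_SCALE`, `LG_R_PER_PA`) are round-number assumptions, not official values for any season, and the real wRC+ formula adds park and league adjustments that this sketch leaves out.

```python
LG_WOBA = 0.310      # assumed league-average wOBA
WOBA_SCALE = 1.24    # assumed wOBA scale
LG_R_PER_PA = 0.117  # assumed league runs per plate appearance


def neutral_wrc_plus(woba):
    """Park-neutral approximation of wRC+ from wOBA alone."""
    wraa_per_pa = (woba - LG_WOBA) / WOBA_SCALE  # runs above average per PA
    return 100.0 * (wraa_per_pa + LG_R_PER_PA) / LG_R_PER_PA


for woba in (0.280, 0.310, 0.350, 0.400):
    print(f"wOBA {woba:.3f} -> approx. wRC+ {neutral_wrc_plus(woba):.0f}")

# Slope of the line: wRC+ points gained per 10 points (.010) of wOBA
points_per_ten = neutral_wrc_plus(0.320) - neutral_wrc_plus(0.310)
print(f"About {points_per_ten:.1f} wRC+ points per .010 of wOBA")
```

Under these assumptions the relationship is exactly linear, so players who sit well off the line in the real data are mostly those whose home parks trigger a large adjustment.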

Python Implementation with Plotly Express:

import pandas as pd
import plotly.express as px
from pybaseball import batting_stats
import pybaseball as pyb

pyb.cache.enable()

# Fetch 2024 batting data for hitters with at least 300 PA
batting_2024 = batting_stats(2024, qual=300)

# Create interactive scatter plot
fig = px.scatter(
    batting_2024,
    x='wOBA',
    y='wRC+',
    hover_name='Name',
    hover_data={
        'wOBA': ':.3f',
        'wRC+': ':.0f',
        'HR': True,
        'BB%': ':.1f',
        'K%': ':.1f',
        'AVG': ':.3f',
        'Team': True
    },
    color='wRC+',
    color_continuous_scale='RdYlGn',
    size='PA',
    size_max=15,
    title='2024 MLB: wOBA vs wRC+ (Min. 300 PA)',
    labels={'wOBA': 'Weighted On-Base Average (wOBA)',
            'wRC+': 'Weighted Runs Created Plus (wRC+)'},
    template='plotly_white'
)

# Add reference lines for league average
fig.add_hline(y=100, line_dash="dash", line_color="gray",
              annotation_text="League Average wRC+ (100)")
fig.add_vline(x=batting_2024['wOBA'].mean(), line_dash="dash",
              line_color="gray", annotation_text="Sample Mean wOBA")

# Enhance layout
fig.update_layout(
    width=1000,
    height=700,
    font=dict(size=12),
    coloraxis_colorbar=dict(title="wRC+")
)

fig.update_traces(marker=dict(line=dict(width=0.5, color='DarkSlateGray')))

# Show the plot (in Jupyter) or save to HTML
fig.show()
# fig.write_html('woba_wrc_interactive.html')

R Implementation with Plotly:

library(plotly)
library(dplyr)
library(baseballr)

# Fetch FanGraphs leaderboard data
batting_2024 <- fg_batter_leaders(
  startseason = 2024,
  endseason = 2024,
  qual = 300
)

# Create interactive scatter plot
fig <- plot_ly(
  data = batting_2024,
  x = ~wOBA,
  y = ~wRC_plus,
  type = 'scatter',
  mode = 'markers',
  marker = list(
    size = ~PA / 30,
    color = ~wRC_plus,
    colorscale = 'RdYlGn',
    showscale = TRUE,
    line = list(color = 'rgba(50, 50, 50, 0.5)', width = 0.5),
    colorbar = list(title = "wRC+")
  ),
  text = ~paste0(
    "<b>", Name, "</b><br>",
    "Team: ", Team, "<br>",
    "wOBA: ", round(wOBA, 3), "<br>",
    "wRC+: ", wRC_plus, "<br>",
    "HR: ", HR, "<br>",
    "BB%: ", round(BB_percent, 1), "%<br>",
    "K%: ", round(K_percent, 1), "%<br>",
    "AVG: ", round(AVG, 3)
  ),
  hoverinfo = 'text'
) %>%
  layout(
    title = list(
      text = "2024 MLB: wOBA vs wRC+ (Min. 300 PA)",
      font = list(size = 16)
    ),
    xaxis = list(
      title = "Weighted On-Base Average (wOBA)",
      gridcolor = 'rgba(200, 200, 200, 0.5)'
    ),
    yaxis = list(
      title = "Weighted Runs Created Plus (wRC+)",
      gridcolor = 'rgba(200, 200, 200, 0.5)'
    ),
    hovermode = 'closest',
    plot_bgcolor = 'white',
    paper_bgcolor = 'white',
    width = 1000,
    height = 700,
    shapes = list(
      # Horizontal line at wRC+ = 100
      list(
        type = "line",
        x0 = min(batting_2024$wOBA),
        x1 = max(batting_2024$wOBA),
        y0 = 100,
        y1 = 100,
        line = list(dash = "dash", color = "gray", width = 2)
      ),
      # Vertical line at average wOBA
      list(
        type = "line",
        x0 = mean(batting_2024$wOBA),
        x1 = mean(batting_2024$wOBA),
        y0 = min(batting_2024$wRC_plus),
        y1 = max(batting_2024$wRC_plus),
        line = list(dash = "dash", color = "gray", width = 2)
      )
    )
  )

fig
# htmlwidgets::saveWidget(fig, "woba_wrc_interactive.html")

Key Features:


  • Hover Information: Display player name, team, and all relevant statistics

  • Color Gradient: Visual encoding of wRC+ performance from red (poor) to green (excellent)

  • Size Encoding: Bubble size represents plate appearances (sample size)

  • Reference Lines: League average markers help contextualize performance

5.10.2 Interactive WAR Comparison Bar Chart

Wins Above Replacement (WAR) is the most comprehensive single-number metric for player value. An interactive bar chart enables easy comparison across multiple players and quick identification of value distribution across offense, defense, and baserunning.

Python Implementation:

import plotly.graph_objects as go
from plotly.subplots import make_subplots

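# Get top WAR leaders (reuses batting_2024 from the previous example)
top_war_players = batting_2024.nlargest(20, 'WAR')[
    ['Name', 'WAR', 'Off', 'Def', 'BsR', 'Team', 'PA']
].copy()

# FanGraphs reports Off, Def, and BsR in runs, and Off already includes BsR.
# Split batting from baserunning, then convert runs to wins (~10 runs per win)
# so every bar is on the WAR scale. Replacement-level and league adjustments
# are not plotted, so the stacked bars will not sum exactly to WAR.
RUNS_PER_WIN = 10
top_war_players['Off'] = (top_war_players['Off'] - top_war_players['BsR']) / RUNS_PER_WIN
top_war_players['Def'] = top_war_players['Def'] / RUNS_PER_WIN
top_war_players['BsR'] = top_war_players['BsR'] / RUNS_PER_WIN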

# Sort by total WAR
top_war_players = top_war_players.sort_values('WAR', ascending=True)

# Create stacked horizontal bar chart
fig = go.Figure()

# Offensive value
fig.add_trace(go.Bar(
    y=top_war_players['Name'],
    x=top_war_players['Off'],
    name='Offense',
    orientation='h',
    marker=dict(color='#1f77b4'),
    hovertemplate='<b>%{y}</b><br>Offensive: %{x:.1f} WAR<extra></extra>'
))

# Defensive value
fig.add_trace(go.Bar(
    y=top_war_players['Name'],
    x=top_war_players['Def'],
    name='Defense',
    orientation='h',
    marker=dict(color='#ff7f0e'),
    hovertemplate='<b>%{y}</b><br>Defensive: %{x:.1f} WAR<extra></extra>'
))

# Baserunning value
fig.add_trace(go.Bar(
    y=top_war_players['Name'],
    x=top_war_players['BsR'],
    name='Baserunning',
    orientation='h',
    marker=dict(color='#2ca02c'),
    hovertemplate='<b>%{y}</b><br>Baserunning: %{x:.1f} WAR<extra></extra>'
))

# Update layout for stacked bars
fig.update_layout(
    barmode='stack',
    title='2024 MLB WAR Leaders - Value Components',
    xaxis_title='Wins Above Replacement (WAR)',
    yaxis_title='Player',
    hovermode='y unified',
    legend=dict(
        orientation="h",
        yanchor="bottom",
        y=1.02,
        xanchor="right",
        x=1
    ),
    height=800,
    width=1000,
    template='plotly_white',
    font=dict(size=11)
)

fig.show()

R Implementation with ggplotly:

One of R's most powerful features is the ability to convert ggplot2 graphics into interactive Plotly visualizations using ggplotly(). This approach combines ggplot2's elegant grammar of graphics with Plotly's interactivity.

library(ggplot2)
library(plotly)
library(dplyr)
library(tidyr)

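# Get top 20 WAR leaders. Off, Def, and BsR are in runs and Off already includes
# BsR, so split them and convert to wins (~10 runs per win; bars won't sum to WAR).
top_war_players <- batting_2024 %>%
  select(Name, WAR, Off, Def, BsR, Team, PA) %>%
  top_n(20, WAR) %>%
  mutate(Off = (Off - BsR) / 10, Def = Def / 10, BsR = BsR / 10) %>%
  arrange(WAR)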

# Reshape data for stacked bar chart
war_components <- top_war_players %>%
  pivot_longer(
    cols = c(Off, Def, BsR),
    names_to = "Component",
    values_to = "Value"
  ) %>%
  mutate(
    Name = factor(Name, levels = top_war_players$Name),
    Component = factor(
      Component,
      levels = c("Off", "Def", "BsR"),
      labels = c("Offense", "Defense", "Baserunning")
    )
  )

# Create ggplot
p <- ggplot(war_components, aes(x = Name, y = Value, fill = Component)) +
  geom_bar(stat = "identity", position = "stack") +
  coord_flip() +
  scale_fill_manual(values = c(
    "Offense" = "#1f77b4",
    "Defense" = "#ff7f0e",
    "Baserunning" = "#2ca02c"
  )) +
  labs(
    title = "2024 MLB WAR Leaders - Value Components",
    x = "Player",
    y = "Wins Above Replacement (WAR)",
    fill = "Component"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    axis.text = element_text(size = 10),
    legend.position = "top"
  )

# Convert to interactive plotly
fig <- ggplotly(p, tooltip = c("x", "y", "fill")) %>%
  layout(
    hovermode = "y unified",
    height = 800,
    width = 1000
  )

fig

Key Insights from Interactive WAR Charts:


  • Identify players who derive value primarily from offense vs. defense

  • Understand that some elite players (e.g., Aaron Judge) derive nearly all of their value from offense

  • Recognize that defensive specialists can accumulate significant WAR

  • Observe baserunning's smaller but meaningful contribution to overall value
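
As a rough illustration of the first insight, the split between offense-driven and defense-driven players can be written as a simple rule. The sketch below uses made-up run values and an arbitrary 10-run margin; value_profile and the player labels are hypothetical and not part of any library.

```python
# Hypothetical (Off, Def) run values for illustration; not real player data
players = {
    "Slugger A": (55.0, -8.0),
    "Shortstop B": (5.0, 20.0),
    "Utility C": (12.0, 9.0),
}


def value_profile(off_runs, def_runs, margin=10.0):
    """Label where a player's runs above average mostly come from.

    The 10-run margin is an arbitrary threshold chosen for this sketch.
    """
    if off_runs - def_runs > margin:
        return "offense-driven"
    if def_runs - off_runs > margin:
        return "defense-driven"
    return "balanced"


profiles = {name: value_profile(off, dfn) for name, (off, dfn) in players.items()}
for name, label in profiles.items():
    print(f"{name}: {label}")
```

With real data, the same rule could be applied to the Off and Def columns of the leaderboard, with the margin tuned to taste.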

5.10.3 Advanced: Animated Metric Trends

For longitudinal analysis, animated visualizations can show how metrics evolve over time or across player careers.

Python Example - Animated League-Wide wOBA Trends:

import pandas as pd
import plotly.express as px
from pybaseball import batting_stats

# Fetch multiple years of data
years = range(2019, 2025)
all_years_data = []

for year in years:
    try:
        # The 60-game 2020 season will have fewer hitters above 200 PA
        yearly_data = batting_stats(year, qual=200)
        yearly_data['Season'] = year
        all_years_data.append(yearly_data)
    except Exception as exc:
        # Skip seasons that fail to download, but report which ones
        print(f"Skipping {year}: {exc}")

combined_data = pd.concat(all_years_data, ignore_index=True)

# Create animated scatter plot (one frame per season)
fig = px.scatter(
    combined_data,
    x='wOBA',
    y='wRC+',
    animation_frame='Season',
    animation_group='Name',
    hover_name='Name',
    size='PA',
    color='wRC+',
    color_continuous_scale='Viridis',
    range_x=[0.25, 0.45],
    range_y=[40, 200],
    title='MLB wOBA vs wRC+ Evolution (2019-2024)',
    labels={'wOBA': 'wOBA', 'wRC+': 'wRC+'}
)

fig.update_layout(width=1000, height=700)
fig.show()

Benefits of Interactive Sabermetrics Dashboards:

  1. Exploratory Analysis: Users can investigate outliers and patterns without creating new plots
  2. Presentation Quality: Interactive plots engage audiences more effectively than static images
  3. Data Density: Display more information without visual clutter through hover tooltips
  4. Accessibility: HTML outputs can be shared easily via web browsers
  5. Publication Ready: Both Plotly and ggplotly produce publication-quality graphics

Interactive visualizations transform sabermetric analysis from passive observation to active exploration, enabling deeper insights into player performance and metric relationships.



5.11 Exercises

Exercise 1: Calculate wOBA Manually

Task: Using 2023 data, manually calculate wOBA for a player of your choice and compare it to FanGraphs' published wOBA.

Steps:


  1. Retrieve 2023 FanGraphs Guts data for run values (wBB, wHBP, w1B, w2B, w3B, wHR)

  2. Get batting statistics for your chosen player

  3. Calculate singles (H - 2B - 3B - HR)

  4. Calculate wOBA: weighted events (using unintentional walks, BB - IBB) divided by AB + BB - IBB + SF + HBP

  5. Compare to FanGraphs' published value

Python Starter Code:

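import pybaseball as pyb
from pybaseball import batting_stats, fg_guts

pyb.cache.enable()

# Get 2023 constants and batting data.
# fg_guts is assumed to be available in your pybaseball install; if it is
# not, copy the 2023 weights from the FanGraphs Guts! page instead.
guts = fg_guts(2023)
batting = batting_stats(2023, qual=502)

# Extract weights (check that the first row is the 2023 season)
wBB = guts['wBB'].values[0]
wHBP = guts['wHBP'].values[0]
w1B = guts['w1B'].values[0]
w2B = guts['w2B'].values[0]
w3B = guts['w3B'].values[0]
wHR = guts['wHR'].values[0]

# Choose a player. FanGraphs names can include accents and suffixes
# (e.g., "Ronald Acuña Jr."), so match on a fragment instead of exact text.
name_fragment = "Acu"
player = batting[batting['Name'].str.contains(name_fragment)].iloc[0]

# Calculate singles
singles = player['H'] - player['2B'] - player['3B'] - player['HR']

# Calculate wOBA. Intentional walks are excluded from both the numerator
# and the denominator, and the denominator is not plain PA.
unintentional_bb = player['BB'] - player['IBB']
numerator = (wBB * unintentional_bb + wHBP * player['HBP'] +
            w1B * singles + w2B * player['2B'] +
            w3B * player['3B'] + wHR * player['HR'])
denominator = (player['AB'] + player['BB'] - player['IBB'] +
               player['SF'] + player['HBP'])

woba_manual = numerator / denominator

print(f"Player: {player['Name']}")
print(f"Manual wOBA: {woba_manual:.3f}")
print(f"FanGraphs wOBA: {player['wOBA']:.3f}")
print(f"Difference: {abs(woba_manual - player['wOBA']):.4f}")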
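
To check the arithmetic without downloading anything, the FanGraphs-style formula (unintentional walks in the numerator, AB + BB - IBB + SF + HBP in the denominator) can be sketched as a plain function. The weights and the stat line below are illustrative placeholders, not 2023 values, and woba with its argument names is an assumption made for this sketch.

```python
# Illustrative weights only; look up the real season values on FanGraphs Guts!
weights = {"wBB": 0.69, "wHBP": 0.72, "w1B": 0.88,
           "w2B": 1.25, "w3B": 1.58, "wHR": 2.02}

# Hypothetical stat line; not a real player
line = {"AB": 550, "H": 160, "2B": 30, "3B": 3, "HR": 27,
        "BB": 70, "IBB": 5, "HBP": 5, "SF": 5}


def woba(stats, w):
    """Compute wOBA from a dict of counting stats and a dict of weights."""
    singles = stats["H"] - stats["2B"] - stats["3B"] - stats["HR"]
    unintentional_bb = stats["BB"] - stats["IBB"]
    numerator = (w["wBB"] * unintentional_bb + w["wHBP"] * stats["HBP"]
                 + w["w1B"] * singles + w["w2B"] * stats["2B"]
                 + w["w3B"] * stats["3B"] + w["wHR"] * stats["HR"])
    denominator = (stats["AB"] + stats["BB"] - stats["IBB"]
                   + stats["SF"] + stats["HBP"])
    return numerator / denominator


result = woba(line, weights)
print(f"wOBA for the hypothetical line: {result:.3f}")
```

Swapping in the real weights and a real stat line should reproduce the published figure to within rounding.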

Exercise 2: ERA vs. FIP Analysis

Task: Identify pitchers with the largest ERA-FIP discrepancies in 2023 and investigate why.

Steps:


  1. Get pitchers with 100+ IP from 2023 (a lower bar than the 162 IP needed to qualify for the ERA title)

  2. Calculate ERA - FIP differential

  3. Find the 5 pitchers with ERA much better than FIP (lucky)

  4. Find the 5 pitchers with ERA much worse than FIP (unlucky)

  5. Examine BABIP, LOB%, and HR/FB% to explain the differences

Python Starter Code:

import pybaseball as pyb
from pybaseball import pitching_stats

pyb.cache.enable()

pitching = pitching_stats(2023, qual=100)
pitching['ERA_FIP_diff'] = pitching['ERA'] - pitching['FIP']

# Lucky pitchers (ERA << FIP)
lucky = pitching.nsmallest(5, 'ERA_FIP_diff')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'ERA_FIP_diff', 'BABIP', 'LOB%', 'HR/FB']
]

# Unlucky pitchers (ERA >> FIP)
unlucky = pitching.nlargest(5, 'ERA_FIP_diff')[
    ['Name', 'Team', 'IP', 'ERA', 'FIP', 'ERA_FIP_diff', 'BABIP', 'LOB%', 'HR/FB']
]

print("Lucky Pitchers (ERA much better than FIP):")
print(lucky)
print("\nUnlucky Pitchers (ERA much worse than FIP):")
print(unlucky)

# Analysis questions:
# - What's the average BABIP for lucky vs. unlucky pitchers?
# - Do lucky pitchers have higher LOB%?
# - What does this tell us about future performance?
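
One way to start on the analysis questions is to compare group averages. The sketch below uses made-up rows in place of the lucky and unlucky tables; the numbers and the group_mean helper are hypothetical.

```python
from statistics import mean

# Made-up (BABIP, LOB%) rows standing in for the two groups; not real data
lucky_rows = [(0.262, 80.1), (0.255, 78.4), (0.270, 82.0)]
unlucky_rows = [(0.325, 66.3), (0.331, 68.0), (0.318, 64.9)]


def group_mean(rows, index):
    """Average one field (by position) across a list of tuples."""
    return mean(row[index] for row in rows)


# Positive gaps mean the "unlucky" group allowed more hits on balls in play
# and the "lucky" group stranded more runners.
babip_gap = group_mean(unlucky_rows, 0) - group_mean(lucky_rows, 0)
lob_gap = group_mean(lucky_rows, 1) - group_mean(unlucky_rows, 1)

print(f"BABIP gap (unlucky - lucky): {babip_gap:.3f}")
print(f"LOB% gap (lucky - unlucky): {lob_gap:.1f}")
```

With the real tables, the same comparison is lucky['BABIP'].mean() against unlucky['BABIP'].mean(), and likewise for LOB%.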

Exercise 3: Build a WAR Calculator

Task: Create a simplified position player WAR calculator using wRAA, baserunning, and positional adjustments.

Steps:


  1. Calculate batting runs (wRAA) for qualified hitters

  2. Add baserunning runs (BsR)

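  3. Add fielding runs (Fld from FanGraphs; Def already bundles fielding with the positional adjustment, so using Def here would count position twice)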

  4. Add positional adjustment (approximate)

  5. Add replacement level (~20 runs per 600 PA)

  6. Divide by runs per win (~10)

  7. Compare to FanGraphs WAR

Python Starter Code:

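import pandas as pd
import pybaseball as pyb
from pybaseball import batting_stats

pyb.cache.enable()

batting = batting_stats(2023, qual=502)

# Positional adjustments (runs per 600 PA, approximate)
pos_adj = {
    'C': 12.5, 'SS': 7.5, '2B': 3, 'CF': 2.5, '3B': 2,
    'LF': -7.5, 'RF': -7.5, '1B': -12.5, 'DH': -17.5
}

# Function to calculate WAR
def calculate_war(row):
    # Batting runs (wRAA is not park-adjusted)
    batting_runs = row['wRAA']

    # Baserunning
    baserunning_runs = row['BsR']

    # Fielding: use Fld rather than Def. FanGraphs' Def already includes the
    # positional adjustment, so Def plus pos_runs would count it twice.
    # Assumes the leaderboard has a Fld column.
    fielding_runs = 0.0 if pd.isna(row['Fld']) else row['Fld']

    # Position adjustment. If 'Pos' holds a position label (e.g., "SS/2B"),
    # scale the approximate table by PA. If it is already numeric, treat it
    # as FanGraphs' positional runs and use it directly.
    pos = row['Pos']
    if isinstance(pos, str):
        pos_runs = pos_adj.get(pos.split('/')[0], 0) * (row['PA'] / 600)
    else:
        pos_runs = 0.0 if pd.isna(pos) else float(pos)

    # Replacement level
    repl_runs = 20 * (row['PA'] / 600)

    # Total runs above replacement
    total_runs = batting_runs + baserunning_runs + fielding_runs + pos_runs + repl_runs

    # Runs per win
    war_calculated = total_runs / 10

    return war_calculated

batting['WAR_calculated'] = batting.apply(calculate_war, axis=1)

# Compare top 10. Expect small gaps: FanGraphs also park-adjusts batting
# runs and adds a league adjustment that this simplified version omits.
comparison = batting.nlargest(10, 'WAR')[
    ['Name', 'Pos', 'PA', 'WAR', 'WAR_calculated']
]
print("WAR Comparison (Top 10 Players):")
print(comparison)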

Exercise 4: Park Factor Investigation

Task: Analyze how Coors Field affects Rockies hitters by comparing home and road statistics.

Steps:


  1. Get Rockies hitters' home/road splits for 2023

  2. Calculate home vs. road differences in AVG, OBP, SLG, HR

  3. Compare to league-wide home/road splits

  4. Quantify Coors' effect

  5. Identify which Rockies hitters benefited most

Conceptual Approach (requires splits data):

# This requires home/road split data
# Conceptual outline:

# 1. Get Rockies hitters
# 2. Compare home stats to road stats
# 3. Calculate ratio: (Home_OPS / Road_OPS) for Rockies
# 4. Compare to league average (Home_OPS / Road_OPS)
# 5. Difference shows Coors effect

# Rough expectations to check against the data (not measured results):
# - League average home/road OPS ratio: ~1.05 (5% home advantage)
# - Rockies home/road OPS ratio: ~1.20-1.25 (20-25% advantage)
# - Extra 15-20% is Coors effect

print("Rough expectations for the Coors effect (home minus road):")
print("- AVG: +.030 to +.040")
print("- OBP: +.025 to +.035")
print("- SLG: +.080 to +.100")
print("- HR: 40-50% increase")
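
Once split data is in hand, steps 3 to 5 of the outline reduce to a few lines. The sketch below uses placeholder OPS values rather than real 2023 splits, and home_road_ratio is a hypothetical helper.

```python
# Placeholder OPS values for illustration; not real 2023 splits
rockies = {"home_ops": 0.800, "road_ops": 0.660}
league = {"home_ops": 0.740, "road_ops": 0.705}


def home_road_ratio(splits):
    """Return home OPS divided by road OPS for one group of hitters."""
    return splits["home_ops"] / splits["road_ops"]


rockies_ratio = home_road_ratio(rockies)
league_ratio = home_road_ratio(league)

# The part of the Rockies' home boost beyond ordinary home-field advantage
coors_effect = rockies_ratio - league_ratio

print(f"Rockies home/road OPS ratio: {rockies_ratio:.3f}")
print(f"League home/road OPS ratio:  {league_ratio:.3f}")
print(f"Estimated Coors effect:      {coors_effect:+.3f}")
```

Subtracting the league ratio is the simplest way to net out ordinary home-field advantage; dividing the two ratios instead would give a relative rather than absolute effect.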

Summary

Traditional sabermetrics transformed baseball analysis by establishing objective, run-based metrics for player evaluation. This chapter covered:

Core Principles:


  • Measuring what matters: runs and wins

  • Contextual adjustments for park, league, and era

  • Distinguishing descriptive, predictive, and projective metrics

Batting Metrics:


  • wOBA: Properly weighted offensive value

  • wRC+: Park and league-adjusted runs created

  • Rate stats and their stabilization rates

Pitching Metrics:


  • FIP: Fielding-independent skill measurement

  • xFIP: Normalized home run rate

  • SIERA: Context-sensitive expected runs

Comprehensive Value:


  • WAR: All-in-one wins above replacement

  • Park factors: Adjusting for ballpark effects

  • Run expectancy and WPA: Situational value

These traditional sabermetrics remain the foundation of modern baseball analysis. In Chapter 6, we'll build on this foundation with cutting-edge Statcast metrics that measure batted ball quality, pitch movement, and defensive positioning in unprecedented detail.

Key Takeaways:

  1. Always consider context (park, league, era) when evaluating players
  2. Use rate stats (wOBA, K%, BB%) to compare players with different amounts of playing time; use counting stats (wRAA, WAR) for total seasonal value
  3. No single metric is perfect—triangulate with multiple measures
  4. Understand what each metric measures (skill vs. results, descriptive vs. predictive)
  5. WAR is powerful but imprecise—use ±1 win error bars

The next frontier awaits: Statcast data provides pitch-by-pitch tracking that revolutionizes our understanding of player skills.


Further Reading

Essential Resources:

  1. FanGraphs Library: https://library.fangraphs.com/
     • Comprehensive glossary and methodology for all metrics
     • "Getting Started" section for beginners
  2. The Book Blog: http://www.insidethebook.com/
     • Tom Tango's research on run values, wOBA, and WAR
     • Advanced statistical discussions
  3. Baseball Prospectus: https://www.baseballprospectus.com/
     • Alternative metrics (DRC+, DRA, WARP)
     • Sabermetric research articles
  4. "The Book: Playing the Percentages in Baseball" by Tom Tango, Mitchel Lichtman, and Andrew Dolphin
     • Foundational text on run expectancy and linear weights
  5. Baseball Reference: https://www.baseball-reference.com/
     • Historical data and bWAR calculations
     • Stathead (successor to the Play Index) for customized queries

Books:

  • Palmer, Pete and John Thorn. The Hidden Game of Baseball (1984)
  • James, Bill. The Bill James Historical Baseball Abstract (1985)
  • Tango, Tom et al. The Book (2007)

Chapter Summary

In this chapter, you learned about traditional sabermetrics. Key topics covered:

  • The Philosophy of Sabermetrics
  • Rate Statistics and Denominators
  • Linear Weights
  • Weighted On-Base Average (wOBA)
  • Weighted Runs Created Plus (wRC+)
  • Pitching Statistics