NFL Rookie Running Back Model - Using Advanced Model for RBs

Our goal is to identify the top rookie prospects based on data points that correlate most with future NFL production.

If you want a complete breakdown of how the model works, check out the Super Model Inputs & Methodology below the table.

Last Updated Aug 19th, 2024 8:18 EDT

With the NFL Draft quickly approaching, we are releasing our updated Fantasy Life Rookie Super Model!

Our goal is to identify the top rookie prospects based on data points that correlate most with future NFL production. I have been working on NFL rookie models for the last three years, and over that time, I have studied and measured hundreds of predraft variables against future NFL production.

The truth is that most variables don’t carry a strong signal, or they overlap too much with an existing variable to make it into a model. Even once you define a list of relatively strong inputs, it is hard to accurately predict which college athletes will be the best NFL players.

Football is a sport with countless dependencies played by notoriously unpredictable creatures known as human beings. When you add in plain old variance, you can see how this activity can become challenging. But that is what makes it so interesting, and it fuels me to test new ideas every offseason.

So, without further ado, let’s dive into the inputs used for the 2024 RB Super Model.

RB Super Model Inputs & Methodology

The inputs below are in order of their correlation to fantasy production in an RB’s first two years in the NFL.

  • Projected draft capital (NFL Mock Draft Database)
  • Collegiate program quality
  • Adjusted career yards per team attempt (rushing and receiving)
  • Career composite PFF grades (rushing and receiving)
  • NFL.com prospect grades
  • Speed Score
  • Career TDs per game
  • Age

Because the model includes advanced data that wasn’t widely available before the 2017 class, our sample focuses on prospects with at least two years of play since then. So, our correlations to future performance currently derive from RB data from 2017 to 2022.

Super Model Note: The only RBs included from the 2017 class are those who left for the NFL after three years, because we don’t have data for the 2013 season to cover four-year starters from that class.

For all production stats, the data comes from the game-log level rather than the season level.

Draft Capital Value

The model uses Chase Stuart’s Draft Value Chart for draft capital, which is essentially a better version of what many know as the Jimmy Johnson trade chart. The value of a draft pick isn’t linear, and this methodology helps us capture that. The dropoff in value is steeper in the first round and becomes much flatter around the end of the second round. Draft capital value is the most heavily weighted input in the Super Model.
 

Program Quality Index

Program quality is a pedigree metric that uses draft capital value to determine the total value each collegiate program has contributed to the NFL Draft since 2015. The model uses a composite score derived from two inputs.

  • Program draft capital at the RB position
  • Program draft capital at the RB, WR and TE positions

Those scores are then indexed to form the Program Quality Index.

Prospects who come from more robust programs score better. Program quality has been a factor in the model before, but this is a better way of quantifying it. Additionally, this metric helps offset lower production numbers from prospects with more competition.

The weighting for pedigree is intentionally lower than the correlation to future production suggests because program quality creates a double-counting effect for draft capital. While we want prospects from schools that churn out high draft picks, that particular player’s draft capital is included in program quality when we look back at the model. This is also why we use a program quality score focused on RB, WR and TE in the RB model.

I want to give a shout-out to Billy Elder, who came up with this idea.

Adjusted Career Yards Per Team Attempt Index

Adjusted career YPTA is a production metric that allows us to normalize yards based on the team environment, which is essential because team volume varies from one situation to the next. On a per-team-attempt basis, a prospect averaging 75 yards per game in a low-volume offense might be better than another averaging 100 yards on a high-volume squad.

Receiving yards are worth twice as much as rushing yards in this equation. This gives us a better approximation of value in half-PPR and full-PPR formats.

Equation: (rushing yards + receiving yards*2) / (team rushing attempts + passing attempts)
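As a quick illustration of the equation above, here is a minimal Python sketch. The function name and the career totals are hypothetical and used only for illustration; they are not actual prospect data, and the sketch assumes the totals have already been summed from game logs.

```python
def adjusted_ypta(rush_yards, rec_yards, team_rush_att, team_pass_att):
    """Adjusted yards per team attempt; receiving yards count double."""
    team_attempts = team_rush_att + team_pass_att
    if team_attempts <= 0:
        raise ValueError("team attempts must be positive")
    return (rush_yards + 2 * rec_yards) / team_attempts


# Hypothetical career totals, for illustration only.
low_volume = adjusted_ypta(rush_yards=900, rec_yards=300,
                           team_rush_att=400, team_pass_att=350)
high_volume = adjusted_ypta(rush_yards=1200, rec_yards=300,
                            team_rush_att=550, team_pass_att=500)

# The back with fewer raw yards grades out better once team volume is
# taken into account.
print(round(low_volume, 2), round(high_volume, 2))
```

In this made-up example, the back on the low-volume offense has fewer total yards but a higher adjusted YPTA, which is the situation the metric is designed to surface.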

Career Composite PFF Grade Index

This qualitative metric is based on a player’s career PFF Rushing Grade and PFF Receiving Grade. If you’re wondering how PFF Grades work, I recommend reading Steve Palazzolo’s breakdown. Below are two excerpts that can get you by if you just want the basics.

“Credit is given for each move the running back makes to add value to the play, whether forcing a missed tackle, using speed to gain the edge or creating yards through contact.” 

“Our goal is to isolate the running back’s contribution to that production, and the runners with the highest grades are those who produce above expectation and outside what the run blocking or scheme allows.”

PFF Grades account for context we otherwise can’t capture at such a massive scale. Because of this, it isn’t surprising that grades correlate more strongly to future production than individual statistics, such as missed tackles forced, yards after contact, and explosive plays. Plus, it allows us to concisely present that information in one data point.

Super Model Note: We are calculating the career grades based on season grades weighted by rushing attempts and passing targets.
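The volume weighting described in the note above can be sketched as a weighted average. This sketch assumes the rushing grade is weighted by rushing attempts and the receiving grade by targets; the function name and the season values are hypothetical, and how the two career grades are blended into the composite is not shown here.

```python
def weighted_career_grade(season_grades, season_volumes):
    """Average season grades, weighting each season by its volume."""
    if len(season_grades) != len(season_volumes):
        raise ValueError("need one volume per season grade")
    total_volume = sum(season_volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    weighted_sum = sum(g * v for g, v in zip(season_grades, season_volumes))
    return weighted_sum / total_volume


# Hypothetical seasons, for illustration only.
# Rushing grades weighted by rushing attempts (assumption).
career_rush_grade = weighted_career_grade([70.0, 80.0, 90.0], [50, 150, 200])
# Receiving grades weighted by targets (assumption).
career_rec_grade = weighted_career_grade([60.0, 75.0], [10, 30])

print(career_rush_grade, career_rec_grade)
```

Weighting by volume keeps a low-usage season from counting as much as a season with a full workload.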

NFL.com Prospect Grade Index

This is another qualitative metric based on Lance Zierlein’s prospect grades on NFL.com. His prospect scores have a 0.59 correlation to Year 1 and Year 2 RB fantasy production since 2017, which was strong enough to add a film element to the Super Model.

The grades are indexed on a scale of 0 to 1.
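One common way to put values on a 0-to-1 scale is min-max scaling. The sketch below assumes that approach, which may differ from the exact indexing method the model uses; the function name and the grade values are hypothetical.

```python
def min_max_index(values):
    """Rescale values so the lowest maps to 0 and the highest maps to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot index values that are all identical")
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical prospect grades, for illustration only.
grades = [5.8, 6.2, 6.4, 7.0]
indexed = min_max_index(grades)
print([round(x, 2) for x in indexed])
```

The same kind of rescaling would put inputs measured in very different units on a comparable footing before they are weighted.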

Speed Score

I tested all NFL Combine and pro day data, including Relative Athletic Score (RAS), for all positions. While most athletic tests show some signal, they aren’t strong enough to make it into the model. However, for RBs, Speed Score garnered a 0.31 correlation to future production and offered relatively low overlap with the other data points in the model.

Speed Score combines a player’s weight with their 40-yard dash time: (weight*200)/(40-time^4). It offers a significantly stronger signal than 40 times alone. Bill Barnwell of ESPN created Speed Score.
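For readers who want to compute Speed Score themselves, here is a minimal Python sketch of the formula. The function name and the sample measurements are hypothetical and do not describe any specific prospect.

```python
def speed_score(weight_lbs, forty_time):
    """Speed Score: (weight * 200) / (40-yard dash time ** 4)."""
    if forty_time <= 0:
        raise ValueError("40-yard dash time must be positive")
    return (weight_lbs * 200) / forty_time ** 4


# Hypothetical measurements, for illustration only.
lighter_back = speed_score(weight_lbs=200, forty_time=4.50)
heavier_back = speed_score(weight_lbs=220, forty_time=4.50)  # roughly 107.3

# With the same 40 time, the heavier back earns the higher score.
print(round(lighter_back, 1), round(heavier_back, 1))
```

Because the 40 time is raised to the fourth power, small differences in time move the score far more than small differences in weight.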

Career Total TDs Per Game

This is another production metric. The data showed that using a normalized metric like YPTA was superior for yardage, but that wasn’t true for TDs. Instead, per-game data demonstrated a stronger correlation than share, per-team-attempt, and other options.

There is a correlation between yards and TDs, so once again, there is some overlap in signal between our metrics. However, not all RBs who are strong in YPTA score a lot of TDs.

Age Index

A player’s age is based on how old they will be at the beginning of the upcoming NFL season. Historically, younger players and early declares carry a stronger signal than older prospects.