Yet Another Rookie Receiver Model
Or: What going under the hood taught me about fantasy analytics
By now, everybody and their uncle has developed some sort of predictive model for fantasy football. That I made my own is hardly unique; I’m not even sure my model is meaningfully better than existing ones.
Therein lies the problem, however: it’s difficult to check your work against others’ when people are both modeling different outcomes and using different benchmarks for success. Thus, I decided to build my own Wide Receiver model to predict fantasy production over the first three years of a rookie receiver’s career. The goal was to produce a model that I myself found empirically sound, while also learning something about fantasy analysis along the way.
The Model
I built my model using an NGBoost framework, with decision trees as my base learners. Even for a relatively technical article, though, this is something I don’t want to wade too far into the weeds on, lest we distract from the main takeaways.1
The bottom line is, we want a nonlinear model2 that can capture complex relationships between our inputs (e.g., Dominator Rating) and our dependent variable (PPR points scored). We also want an algorithm that can give us a probability distribution, instead of just a single predicted value. This lets us predict ceiling and floor outcomes for players, too.
NGBoost met both these requirements, while also delivering better results than comparable algorithms. It therefore seems to be more than fine for the task at hand.
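For those who want to see the shape of this in code, here's a minimal sketch of the setup using the ngboost package, with toy data standing in for the real prospect features. The hyperparameters are illustrative, and the percentile extraction shown here is just one way to pull floor/ceiling numbers out of the fitted distribution (the Laplace choice is discussed in the footnotes):

```python
import numpy as np
from ngboost import NGBRegressor
from ngboost.distns import Laplace
from scipy.stats import laplace
from sklearn.tree import DecisionTreeRegressor

# Toy stand-ins for the real prospect features and 3-year PPR totals
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(300, 6)), rng.uniform(0, 700, size=300)
X_test = rng.normal(size=(30, 6))

# Decision trees as base learners; Laplace as the output distribution
model = NGBRegressor(
    Dist=Laplace,
    Base=DecisionTreeRegressor(criterion="friedman_mse", max_depth=3),
    n_estimators=500,
    learning_rate=0.01,
)
model.fit(X_train, y_train)

# Each prediction is a full distribution rather than a single point estimate
dist = model.pred_dist(X_test)
loc, scale = dist.params["loc"], dist.params["scale"]
floor   = np.clip(laplace.ppf(0.20, loc=loc, scale=scale), 0, None)  # 20th percentile, clipped at zero
median  = laplace.ppf(0.50, loc=loc, scale=scale)                    # median 3-year PPR projection
ceiling = laplace.ppf(0.80, loc=loc, scale=scale)                    # 80th percentile
```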
Model performance
Above is my model’s performance for draft classes considered “out-of-sample” (i.e., data it hadn’t seen yet). It was trained on the 2008 through 2020 draft classes3. Its goal, once again, is to predict PPR scoring over the first three years of a rookie receiver’s career. Devonta Smith, for example, scored 668 total PPR points in that window.
Our model shows an R² score of .5 for the 2021 class and .51 for the 2022 class, which I’d consider relatively decent. By comparison, Playmaker Score—a metric formerly hosted on Football Outsiders—showed an R² of .25. (Granted, the linked article is from a decade ago, and they were also predicting a different dependent variable.)4
The model fares slightly worse in predicting the 2023 and 2024 classes, though that data is obviously incomplete. We are, after all, predicting the first three years of a player’s career, and those players haven’t played for that long yet. Thus, I’m comparing my predictions to a three-year rate based on the players’ production so far.
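A rough sketch of one way to compute that kind of rate, with made-up numbers and hypothetical column names (my exact proration is a judgment call):

```python
import pandas as pd
from sklearn.metrics import r2_score

# Hypothetical mini-frame: PPR scored so far, games played, and the model's 3-year prediction
recent = pd.DataFrame({
    "ppr_to_date": [430.0, 155.0, 610.0],
    "games_played": [30, 24, 33],
    "pred_three_year_ppr": [640.0, 310.0, 700.0],
})

# Scale per-game production so far out to a full three seasons (51 games)
GAMES_PER_SEASON = 17
recent["three_year_rate"] = recent["ppr_to_date"] / recent["games_played"] * (3 * GAMES_PER_SEASON)

# Score the model's predictions against the prorated totals
print(r2_score(recent["three_year_rate"], recent["pred_three_year_ppr"]))
```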
This has obvious issues, of course: those players could still suffer massive injuries or have late breakouts, either of which could bring the final numbers more in line with our predictions. Incomplete data isn't the whole story, though: the 2023 class in particular features many notable outliers. Indeed, it was full of players who bucked expectations, like Dontayvion Wicks, Pop Douglas, and, of course, Puka Nacua.
Still, while outcome matters, process is equally important. It's heartening to see, for example, that my model was pretty much in line with Playmaker Score's 2023 predictions. Despite its relatively low 2023 R² score, my model also had great called shots on Tank Dell and Josh Downs, ranking both as top-five prospects despite them being third-rounders. It was also notably bearish on Quentin Johnston, which provides me little solace as a Chargers fan.
Feature Importance
Above are the most important features for our model, in terms of impurity reduction. At the top is a player’s draft spot; it’s by far the most important feature in the model. This jibes with the work done by Zilla Fantasy, and is also borne out by our features’ correlations with PPR, where the relationship between when a player was picked and their three-year outlook is twice as strong as any other feature.
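If you're curious how a chart like this gets generated, NGBoost exposes scikit-learn-style impurity importances for each distribution parameter's trees. A rough sketch, continuing from the earlier snippet, with placeholder feature names (the attribute layout can vary a bit by version):

```python
import pandas as pd

# Continuing from the earlier sketch; these feature names are placeholders
feature_names = ["draft_pick", "dominator", "best_ppg_age",
                 "tgt_qb_rating", "adot_adj_catch_pct", "breakout_age"]

# Impurity-based importances; index 0 holds the trees fit to the distribution's location parameter
importances = pd.Series(model.feature_importances_[0], index=feature_names)
print(importances.sort_values(ascending=False))

# A cruder sanity check: raw correlation of each feature with 3-year PPR
train_df = pd.DataFrame(X_train, columns=feature_names)
print(train_df.corrwith(pd.Series(y_train)).sort_values(ascending=False))
```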
Our other variables come from the ever-useful Pahowdy spreadsheet, which itself sources additional metrics from PFF.5 Outside of knowing a player’s draft spot, his average Dominator rating is the second-most important feature. It tells us what percentage of a team’s production a player commanded, in terms of receiving yards and TD’s.
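For reference, the standard Dominator calculation just averages a player's share of his team's receiving yards and receiving touchdowns; a quick sketch (the Pahowdy sheet's exact implementation may differ slightly):

```python
def dominator_rating(rec_yds, team_rec_yds, rec_tds, team_rec_tds):
    """Average of a player's share of team receiving yards and team receiving TDs."""
    return 0.5 * (rec_yds / team_rec_yds + rec_tds / team_rec_tds)

# e.g., 1,200 of a team's 3,400 receiving yards and 10 of its 28 receiving TDs
print(dominator_rating(1200, 3400, 10, 28))  # ~0.36
```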
Our third-most important metric is Best PPG/Age, a Pahowdy metric that takes a player’s best points-per-game in a season, then adjusts for how old they were. Players who had great success at a young age, like Malik Nabers, are rewarded. Conversely, players like Roman Wilson, whose best year came late (and was also underwhelming), are penalized.
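As I read it, the metric amounts to dividing a player's best points-per-game season by his age in that season; a rough sketch (the actual adjustment in the spreadsheet may be more involved):

```python
def best_ppg_per_age(season_ppg, season_ages):
    """Best single-season points per game, divided by the player's age in that season."""
    return max(ppg / age for ppg, age in zip(season_ppg, season_ages))

# A 15-PPG season at age 19 grades out better than a 16-PPG season at age 22
print(best_ppg_per_age([8.0, 15.0], [18, 19]))   # ~0.79
print(best_ppg_per_age([10.0, 16.0], [21, 22]))  # ~0.73
```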
Rounding out our list are some features that I found highly predictive in my previous post on the 2025 WR class. This includes Targeted QB Rating and aDOT-adjusted Caught%, both of which I discuss at length in that piece. The former essentially tells us how often a guy makes his QB right, while the latter judges his ability to catch, adjusted for target depth (i.e., deep shots matter more than layups).
2025 Rookie WR’s
Now that we have the technical stuff out of the way, we can finally see what the model thinks of the 2025 receiver class. Our player comps above (scroll to the right to see on mobile) are based on the highest- and lowest-producing players, respectively, among the five most similar players. Our ceiling projection is a player's 80th percentile outcome, per our model, while the floor is their 20th percentile outcome.
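Mechanically, generating comps like these boils down to finding each rookie's five nearest neighbors among historical prospects and surfacing the best and worst producers in that group. Here's a sketch under those assumptions, continuing from the earlier snippet; the distance metric, standardization, and placeholder names are my own choices:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Placeholder names for the historical prospects in the training set
names_train = np.array([f"player_{i}" for i in range(len(y_train))])

# Standardize features so no single input dominates the distance calculation
scaler = StandardScaler().fit(X_train)
nn = NearestNeighbors(n_neighbors=5).fit(scaler.transform(X_train))

# For each 2025 rookie, grab the five most similar historical prospects
_, neighbor_idx = nn.kneighbors(scaler.transform(X_test))
for row in neighbor_idx:
    outcomes = y_train[row]  # those players' actual 3-year PPR totals
    print("ceiling comp:", names_train[row[np.argmax(outcomes)]],
          "| floor comp:", names_train[row[np.argmin(outcomes)]])
```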
Travis Hunter clearly stands out: not only is he the clear best prospect in this class, he’s basically the best prospect since the 2021 class. That class contained Devonta Smith, who had a slightly higher ceiling projection than Hunter. It is worth noting, however, that the “3-Year PPR” column is what our model thinks the median—or 50th percentile—outcome will be for the first three years of a player’s career. For somebody like Hunter, this of course gets to be an inexact science, and the variance contained in his two-way endeavors isn’t baked into my model.
The rest of the prospects, however, jibe pretty well with common assumptions. Tetairoa McMillan, for example, is a relatively “safe” prospect with questions about his ceiling; he has the highest floor of anybody beside Hunter, with a downside comp of Rashod Bateman. Matthew Golden is at the opposite end of the spectrum, boasting star-level upside (and a comp of Garrett Wilson), yet also having a scary low-end comp in notorious bust Corey Coleman.
What’s most interesting, however, are the cases where the model deviates most strongly from consensus. As you can see above, the model really likes Pat Bryant. Compare him to Jalen Royals, however—who was taken about 60 spots later—and you may be confused. Royals boasts a slightly better Dominator rating, as well as a stronger best season in terms of age-adjusted Points per Game.
This would seem to suggest that we may be leaning too hard on draft position. That ignores, however, the fact that Bryant boasts a far better projection than Jaylin Noel, who was taken merely five spots later than him. This is likely because Bryant's Dominator rating and best age-adjusted PPG season are both better than Noel's. Bryant also has the fifth-highest ceiling of any prospect in this class, even better than many first-rounders, which suggests our model relies on more than draft pedigree alone.
The rest of the field offers few surprises, unless you’re really into ranking late-rounders. One interesting quirk is the model liking Kyle Williams much more than the beleaguered Royals and Noel, players who would be considered in the same tier by ADP and most rankings.
The biggest shocker, though, is its bullishness on Isaac TeSlaa, almost certainly driven by his status as an early draft pick. In some sense, I think this is sound, given he's more likely to see the field than many of his peers. That said, I also think his poor Dominator rating and small-sample concerns are cause to pump the brakes.
Conclusion
Hopefully all this has given sufficient insight into my process, and why I built the model the way I did. It’s worth noting that, at the end of the day, there are still some of my own biases baked into this model, given I’m the one who chose the criteria for its success. Some of the models I tested performed slightly better, for example, on the 2023 and 2024 classes.
Yet I didn't want to over-index on those draft classes alone, for a couple of reasons. First is the obvious one: those classes simply haven't played a full three years yet, meaning I'd be basing my model selection on a noisy process (with a lot of projection, too). Second, the subjective quality of these nominally “better” models I created was suspect. Their median projections were often quite low, and they were far too bullish on late-round picks in a way I couldn't empirically justify.
This ultimately leaves us with a model that, based on my experience as a data scientist, lands in a reasonable middle ground. Most of its predictions are in line with both common sense and other models: I reckon it’s better to be very bullish on Travis Hunter and other first-rounders, for example, than it is to arbitrarily short them.
It also has enough interesting called shots to be considered usefully unique. I don't think I've seen too many other models this high on Pat Bryant, for example, nor have I seen many this bearish on both Jalen Royals and Jaylin Noel. What matters most, however, is that I can explain why my model ranks those players the way it does, which is arguably more important than any performance boost I could engineer.
Thanks again for reading; you can follow me on Bluesky (@capn-collins.bsky.social) and Twitter (@capn_cc), and feel free to reach out to me on either site, or in the comments below, with any lingering questions.
For those interested, I chose a Laplace distribution for my configuration, mainly due to it performing the best. The only issue I encountered was having to clip lower-bound predictions to zero, which is part of the reason I’m using 20th and 80th percentiles for floor and ceiling respectively. (Too many 10th percentile predictions were zero, which is unrealistic for high-round picks who will almost certainly see a lot of playing time.)
As opposed to something like your classic linear regression model. The benefit of a nonlinear model, in simple terms, is that you can capture more complicated relationships. An example would be receiver size, where it’s generally better to be bigger. A linear model might miss, however, that there’s a point of diminishing returns: at some point, many big receivers are really just misfit tight ends (a la Devin Funchess or Johnny Wilson).
Specifically, I used 2008 through 2016 as my training set, then held out 2017 through 2020 for validation. This is semi-arbitrary, of course, but led to the best performance. In an ideal scenario, I’d like to train more heavily on recent data, given some PFF data only goes back to around 2014.
Note that a Redditor did conduct a more extensive R² score analysis for various prominent models (Playmaker Score included). This comes with an asterisk, since the user excluded some unknown amount of UFA’s, who artificially inflate R² scores. When I dropped UFA’s from my test set (2021 through 2022), I achieved extremely similar results to Pahowdy’s model (R² of .29 and a correlation of .55). This is, of course, unsurprising, given my model is trained on Pahowdy’s data.
Other statistical info, such as our PPR scoring data, comes from Stathead.