And does it influence how the Cardinals construct their roster?
When thinking up content to develop for my weekly article (which is easier said than done, BTW), I often find myself drifting towards challenging a popular narrative to see if it can be supported with data (or not). Are the Cardinals really good at draft-and-develop? Yes, the data supports the designation of “elite”. Is draft-and-develop the best way to go? Sort of. It’s not the highest-odds pathway to team success, though. Is there a year where it makes sense to go “all in”? Maybe overpay that one free agent because the window is open? No. High payroll is only very loosely correlated with winning, especially if you are not the Yankees. And so on.
Once I select a topic, then the challenge is to figure out a way to obtain and parse data to see what it tells. That is not easy. I took statistics in college a long, long time ago. And didn’t do all that great. Professor Tierney did me a solid by passing me. It was the class that stood between me and graduation from Maryville. Topics like regression, sampling, bootstrapping, confidence intervals were hard then and even harder now, 40 years later. Of course, I wish I’d seen the relevance of all that dense math and I’d paid more attention to my studies. Now they tell me….! By the way, what the heck is Chi Squared? It is the only thing I remember from that class.
Today (actually for the last several months), I’ve been building a mental model to challenge a narrative that suggests that the Cardinals have an inherent advantage playing in the NL Central. That the NL Central isn’t as tough as the other divisions and playing a lopsided number of games within the division provides a boost. Or however you’d like to phrase the narrative. I think I’ve described it enough that you know what I’m talking about.
I pondered a bit on exactly how I would support (or refute) such a narrative. It got a bit easier with my research project that compared talent acquisition styles (drafting, IFAs, trading and free agent spending). Some of the data I could parse and use for this project started to surface during that piece of work. Don’t ask me how. Sometimes it’s as simple as running the wrong query and getting an interesting result that doesn’t help what I’m researching, so I save the code and file it away for another day. And then I spend hours searching for where I saved it.
The method
I ultimately landed on a pretty simple (I think it’s simple) model for how I’d do that. First, I reused my dataset for team performance for the seasons 2000-2024. A nice odd 25 seasons. I made it a nice even 24 seasons by removing 2020. I captured the fWAR for each “place” (1st thru 5th) a team finished in their division, under the hypothesis that a team’s fWAR for a completed season is a pretty good proxy for the talent of that team, for that year. I also captured the actual wins for each “place” a team finished. That was easy. It’s called the standings. A couple of things began to emerge.
I started comparing the difference between fWAR and actual wins for each place teams could finish. Remember that I view fWAR as a measure of talent, so actual wins become the measure of actual performance. The difference I was calculating would surface variables that could artificially “boost” a team’s realized performance beyond its true talent. For example, an 88-win true talent team (measured in fWAR) might win 94 games playing in the AL Central in 2024, since there was an unbelievably poor team available to inflate the better teams’ records. See where I’m going here? I called this difference between fWAR and actual wins the “win variance over WAR”. Catchy, huh?
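For anyone who wants to see the arithmetic, here is a minimal sketch of the “win variance over WAR” idea. The article doesn’t spell out how fWAR gets turned into a win total, so this assumes the usual FanGraphs-style replacement level of a .294 winning percentage (about 47.6 wins over 162 games) and adds team fWAR on top. The function names and the example numbers are made up for illustration.

```python
# Assumption: replacement level is a .294 winning percentage (about 47.6 wins
# in a 162-game season). The article does not say which baseline was used.
REPLACEMENT_WIN_PCT = 0.294
GAMES = 162


def war_implied_wins(team_fwar, games=GAMES):
    """Hypothetical helper: turn team fWAR into an expected win total."""
    return REPLACEMENT_WIN_PCT * games + team_fwar


def win_variance_over_war(actual_wins, team_fwar, games=GAMES):
    """Actual wins minus WAR-implied wins; positive means over-performance."""
    return actual_wins - war_implied_wins(team_fwar, games)


# Illustrative numbers only, not pulled from the dataset: a team with
# 40.4 fWAR projects to roughly 88 wins, so 94 actual wins is about +6.
example = win_variance_over_war(actual_wins=94, team_fwar=40.4)
print(round(example, 1))
```

A positive number means the team won more than its talent suggests, and a negative number means it won less.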
Observation – All top teams over-perform their WAR. NL Central teams do this more
As I compiled all this, I started seeing some interesting trends. Here is a graph that plots the average variance by division rank, within division. For example, in the NL Central, the top ranked team (on average) out-performs their WAR by ~6 actual wins. The bottom ranked team under-performs by almost 2 wins. I probably used a bad example, because the NL Central had six teams for a portion of the analysis period. They don’t anymore.
Here we can observe that 1st and 2nd place teams almost always have actual wins that over-perform their fWAR (true talent). Shocking, I know. This over-performance is much more pronounced in some divisions than others (such as the NL Central). But all division winners get a boost where they overplay their fWAR by 2-6 wins (on average).
Next, I see that last place teams have the inverse phenomenon. In all divisions, the last place team under-performs their fWAR by 1-3 wins.
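The averaging behind these observations can be sketched in a few lines. This is only an illustration under assumptions: the row layout is hypothetical, the numbers are invented (not the real 2000-2024 data), and each row is assumed to already carry a WAR-implied win total.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (season, division, rank, actual_wins, war_implied_wins).
# Invented numbers for illustration only.
rows = [
    (2001, "NL Central", 1, 93, 88.0),
    (2002, "NL Central", 1, 97, 90.0),
    (2001, "NL Central", 5, 62, 64.0),
    (2002, "NL Central", 5, 67, 68.0),
    (2001, "AL East", 1, 95, 92.0),
    (2002, "AL East", 1, 103, 100.0),
]


def average_variance_by_place(rows, skip_seasons=(2020,)):
    """Average (actual wins - WAR-implied wins) per (division, rank)."""
    groups = defaultdict(list)
    for season, division, rank, wins, implied in rows:
        if season in skip_seasons:  # the shortened 2020 season is dropped
            continue
        groups[(division, rank)].append(wins - implied)
    return {key: mean(values) for key, values in groups.items()}


averages = average_variance_by_place(rows)
for (division, rank), avg in sorted(averages.items()):
    print(f"{division} place {rank}: {avg:+.1f}")
```

Each (division, rank) average corresponds to one point on a chart like the one described above.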
Observation – The NL Central teams tend to lag everyone else in accumulated talent (i.e., WAR)
Next, I went back and plotted the average WAR for each place in each division.
For whatever reason, the NL Central lags the other NL divisions in typical fWAR by 3-6 WAR. The AL has this same variance between divisions, where the AL East outpaces the others by about 5 WAR. The overall baseline skews higher in the AL. I guess fWAR is biased towards the AL talent! Really, the most obvious explanation is that the relative differences in revenue and payroll create a clustering of talent, with a bit less in the middle divisions and more in the coastal divisions. If I had a trustworthy data source on revenue, I’d map that into this data.
No matter which division, the winning teams overperform their fWAR by at least 2 wins. Losing teams under-perform their fWAR. Middle of the road teams play at their fWAR. This is true across at least 5 if not all 6 of the divisions. Every first-place team enjoys this “over performance”, which I suspect can be attributed to fWAR under-estimating the win impact of some baseball outcomes (luck?). It takes talent (measured in fWAR), plus some mysterious ingredients not measured in fWAR that allow a team to over-perform and win a division. Likewise, I notice that the bad teams always under-perform their WAR. This makes me suspect that all the bits-and-pieces of WAR overvalue certain actions that don’t contribute to winning as much as expected.
Observation – The gap between first and second place in the NL Central is noticeably closer than other divisions
Another observation about the NL Central: the #2 team in the division is (on average) right there with the #1 team on talent (WAR). All of the other divisions have more of a spread in talent between #1 and #2. Peculiarly, the #2 ranked teams do not see quite the bump in over-performance that the #1 teams do. I wonder why?
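The first-versus-second comparison boils down to subtracting two averages per division. A rough sketch, with a hypothetical data layout and invented fWAR values (not the real figures):

```python
from statistics import mean

# Hypothetical team fWAR totals keyed by (division, rank), one entry per
# season. Invented numbers for illustration only.
war_by_place = {
    ("NL Central", 1): [42.0, 44.0],
    ("NL Central", 2): [41.0, 42.0],
    ("NL East", 1): [48.0, 50.0],
    ("NL East", 2): [42.0, 44.0],
}


def first_second_gap(war_by_place, division):
    """Average fWAR of 1st-place teams minus average fWAR of 2nd-place teams."""
    first = mean(war_by_place[(division, 1)])
    second = mean(war_by_place[(division, 2)])
    return first - second


for division in ("NL Central", "NL East"):
    gap = first_second_gap(war_by_place, division)
    print(f"{division}: {gap:.1f} WAR between 1st and 2nd")
```

A smaller gap means the top two finishers were closer in talent, which is the pattern described for the NL Central.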
Observation – The NL Central is not the exclusive home to the worst teams in the league
The outlier bad teams tend to float around the divisions, but the data suggests they cluster a bit in the AL West, AL Central and NL Central. The data say the AL East has the most talent at the bottom, yet the Orioles, in that powerful AL East, are a recent example of a really bad team. Think of the Marlins in the NL East. Not too distant in the past are the now-powerful Astros, who were the poor stepchildren of the AL West for a while. And let’s not forget the contemporary version: the Chicago White Sox of the AL Central.
An Observation about the Cardinals – they almost always over-perform
The Cardinals regularly over-perform their WAR. They did so even in the 2023 season (by a wee bit), and they almost always do it by 4 wins or more. Only twice in 25 years have they under-performed their WAR: in 2008, by 1.6 wins, and in 2012, by 4.4 wins. Sidenote: I’ve done a number of data deep dives for different projects and one constant is that fWAR tends to under-estimate the Cardinals’ wins, even in the backcast. I have no idea why (except for a previous theory I posted on how fWAR under-values the Cardinals because it values things that Busch III suppresses harshly, such as HRs and Ks, even though it suppresses them for both teams). Remember this when projections come out.
So, is there an advantage?
Overall, it appears that being in the NL Central allows an additional 4 wins over expected, on average (if you are already a good team). In the NL Central, a good team can expect to out-perform their WAR to a greater degree than other teams in other divisions, implying they don’t have to be so aggressive in making that one last addition in talent (or dollars).
Interestingly, the laggards of the NL Central tend to have a little higher WAR and under-perform less than their contemporaries in the other divisions. This implies that it’s not so much an advantage gained by beating up on lesser teams due to the uneven schedule, but more around the notion that the NL Central is historically lower in talent than other divisions overall. This is really the graphic support behind the observation that the worst teams move around and don’t just live in the NL Central.
But is there really an advantage? I say yes and no. Yes, when they are good, they get an extra ~4 wins by virtue of being good in the NL Central. No, in that this advantage only applies when they are good and applies almost equally to any other team that is good. So having talent aboard is a prerequisite to realizing this advantage, and it doesn’t translate into a cakewalk for winning the division.
But…
Context is everything. You COULD say the Cardinals have a built-in advantage playing in the NL Central, in the sense that if you plucked the Cardinals out of the Central and moved them to the East, they’d have a tougher time competing. I think that is a fair observation to make.
That said, notice the relative payroll positions for teams in each division. The NL Central and AL Central both have noticeably lower payroll points, which I assume to be a reasonable proxy for lower revenue. To that end, one could just as easily suggest that the Yankees, Red Sox and others have an advantage in their division because their revenue (or payroll) is so much larger than the rest of the group, whereas the gap in the NL Central (and even more so in the AL Central) is much smaller, and one could conclude those divisions should be more competitive, if a bit less talented. Advantage? Disadvantage? Hard to really say.
A last thought … the question of strategy
I often try to track the thinking of Cardinals management. When I find myself thinking (to myself) “why don’t they do <this>, it’s so obvious…”, I work to avoid thinking I know more than they do and I focus my concentration more on trying to figure out what they see, that I don’t.
One of those kinds of moments comes up virtually every year, when they end up seeming to construct a roster destined to win 85-86 games. Not quite enough. And many of us ask, why don’t they just make that investment in one more player? Can’t they see what we see? Well, now I’m wondering. What if they see the same things we do, but also factor in this “over performance”? Then they would come to see their team as one that can beat 85-86 projected wins, land closer to 90 actual wins and make it into the playoffs, just with what they have. Really then, as a businessperson, why would they make that last (low correlation) investment, when the performance boost is more likely to help than hinder them? Just a thought… Not saying they would be right, but offering how they can look at the same thing and see different (potential) outcomes.
The closing
One thing that pops out of this data is about the Cardinals’ talent acquisition strategy. I think it has been obvious for a while now that the Cardinals calibrate their talent acquisition around how they view their local competition. Not criticizing; this seems like a reasonable thing to do. In recent years, they haven’t judged that competition very well, particularly the Brewers. Of course, I’ve spent some of my moments wondering how the Brewers keep plugging along, too. Maybe another research project… Before I do that, I’ll leave this tease of a graph behind (if comparing to the Cardinals, beware the Y-scale difference).