Sunday, February 4, 2024

The (Second to the) Last of Three (no, Four) parts on the Team Interception Feature in Strat-O-Matic Pro Football.


 


 The real-life 1970 Lions scored 4 TDs on 28 interceptions.  Is this better or worse than expected?



“Sometimes, the season itself is the outlier”

 –Scott Everngam

In the previous installments, we explored the design intent of the feature and the challenges in evaluating the raw interception return data a season might produce. We’ve shown the feature is pretty good at distributing the most return yards to the most threatening teams; in as few as six replays, real trends start to appear that mirror real-life team results. We’ve shown it is an improvement over the stock interception chart. But we haven’t tackled the final objection: why does a season like 1975, which had 533 interceptions but only 25 returned for TDs, or 1977, which tallied 562 interceptions but only 28 returned for TDs, come out consistently high when using this feature?

To look at this, I graphed every season’s real data from 1956 to 1981, effectively bracketing these years, plotting each season’s overall interception return average against the pct of its interceptions returned for TDs. And while the fit is not perfect, it is not exactly a random walk, either. At the season level, the best fit for TD pct vs return average over the range of data we see is a second order power law, implying a fairly strong push upward in “house pct” as the average yardage per return increases. And here it is:
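For readers who want to try a fit like this on their own season table, here is a minimal sketch of a power-law fit done by least squares in log-log space. It is not the method used to draw the chart; the function name `fit_power_law` and the demo numbers are hypothetical, and the demo points are generated from an exact square law rather than taken from NFL data. I am reading “second order” as an exponent of about 2, which is an assumption.

```python
import math

def fit_power_law(avg_yards, td_pct):
    """Fit td_pct = a * avg_yards**b by least squares on the logs.

    Both inputs must be positive, so seasons with zero TD returns
    would have to be handled separately.
    """
    xs = [math.log(x) for x in avg_yards]
    ys = [math.log(y) for y in td_pct]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # exponent of the power law
    a = math.exp(mean_y - b * mean_x)   # multiplier
    return a, b

# Placeholder points from an exact square law, NOT real season data.
demo_avg = [11.0, 12.5, 14.0, 15.5, 17.0]
demo_pct = [0.03 * x ** 2 for x in demo_avg]
a, b = fit_power_law(demo_avg, demo_pct)
print(f"a = {a:.4f}, exponent b = {b:.2f}")
```

Feeding in the real season averages and house pcts in place of the placeholder lists would show how close the fitted exponent comes to 2.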

Figure 1: Season Level Real Life Data for Average vs TD PCT, 1956-1981



This is all real NFL seasonal data compiled from each team’s data underneath it, except for the 1975 replay data point and the original chart data points, shown in red. The original chart was developed in 1968 for the game’s initial release and is very much a product of its time; it reflects the five seasons in the mid-Sixties that surround it. In real life, 1975 and 1977 fell below the curve of expected TDs their yardage would normally have produced, with 1975 in particular about 1.8 pct lower than predicted for its 14.5 yards per return. Over 533 interceptions this would produce a deviation of 8-10 return TDs versus a typical such season over NFL history. The effect in 1977 is less pronounced but probably still about 7 return TDs low. The Fifties seasons and quite a few mid-Seventies seasons are low outliers, while the mid to late Sixties trend high. Sometimes the real data from the season itself is the outlier, as Scott observed once while creating CMs.
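The arithmetic behind those figures can be checked in a few lines. This is only a worked calculation using the totals quoted in this post (25 of 533 for 1975, 28 of 562 for 1977, and the 1.8 pct gap); the helper names are made up for illustration.

```python
def house_pct(td_returns, interceptions):
    """Percent of interceptions returned for a touchdown."""
    return 100.0 * td_returns / interceptions

def td_shortfall(pct_below_curve, interceptions):
    """Return TDs 'missing' when a season sits pct_below_curve under the fit."""
    return pct_below_curve / 100.0 * interceptions

pct_1975 = house_pct(25, 533)        # about 4.7 pct
pct_1977 = house_pct(28, 562)        # about 5.0 pct
short_1975 = td_shortfall(1.8, 533)  # about 9.6 return TDs

print(f"1975 house pct: {pct_1975:.2f}")
print(f"1977 house pct: {pct_1977:.2f}")
print(f"1975 shortfall at 1.8 pct: {short_1975:.1f} TDs")
```

The 9.6 lands inside the 8-10 range given above.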

It’s a pat answer, but I haven’t explained why this is true. It’s a heck of a lot of data; if this were a normal distribution, 26 seasons and 13,095 interceptions should produce a pretty nice chart, not the messy correlation we see above. But the overall data set is neither continuous nor normal; the y-axis response is low-frequency binomial (yes/no) data, and each season is a collection of subsets of team data that lie beneath its sums. If you look at the team data over the same period, you see it has some very interesting properties:

Figure 2: Team Level Data, Aggregated by Range


There are no seasons that sum out as low as the lowest team ranges or as high as the highest team ranges. Seasons are populations of teams. At the team level, both the TD percentage and the percentage of teams with 0 TD returns in each range are strongly correlated with the average yards per return. The latter relationship is inverse: teams with interception return averages less than 12.0 fail to get even one TD more than half the time. It’s important to note that even real-life teams with very high return averages might fail to tally even one TD in one out of eight (or so) cases.
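One way to see why shutouts are so common is to treat each interception as an independent yes/no trial. The sketch below is a simplification, not a model of the game’s charts: the 20 interceptions per team is an illustrative assumption, and the function names are hypothetical. It asks what per-interception TD rate would produce a shutout half the time, and what rate would produce one an eighth of the time.

```python
def prob_no_td(interceptions, td_rate):
    """Chance a team gets zero return TDs, treating returns as independent."""
    return (1.0 - td_rate) ** interceptions

def td_rate_for_shutout_prob(interceptions, shutout_prob):
    """Per-interception TD rate that yields the given zero-TD probability."""
    return 1.0 - shutout_prob ** (1.0 / interceptions)

n = 20  # assumed interceptions for one team-season (illustrative only)
rate_half = td_rate_for_shutout_prob(n, 0.5)      # about 0.034
rate_eighth = td_rate_for_shutout_prob(n, 0.125)  # about 0.099

print(f"Shutout half the time:   TD rate of {100 * rate_half:.1f} pct")
print(f"Shutout 1 time in eight: TD rate of {100 * rate_eighth:.1f} pct")
```

Under these assumptions, a team needs a house pct near 10 just to get its shutout risk down to one in eight, so a zero from a strong return team is not strange.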

Another way to look at it: 18 pct of teams had 17 yds or greater per return, and these teams accounted for 25 percent of the total yards and 30 percent of the total touchdowns during this interval. Six out of seven of them will have at least one TD, as opposed to two thirds of all teams.

I keep harping on these high average teams because they help to explain why certain higher average return seasons have such high variability. Seasons such as 1975, which averaged in the mid 14 yards per return range, do not have every team as a high performer. Instead, a higher seasonal average is more an indicator that there are some high potential teams mixed in with the rest of a typical population, rather than a guarantee of high return touchdown percentages. If we return to the original chart, we see three seasons had similar overall return averages but different TD percentages (again real-life data):

Figure 3: Team Level Data for three NFL seasons in the mid 14s for average yard per return

  

That’s quite a range of TDs and House Pcts from a narrower range of inputs, but all three are actually possible based on the underlying data, which is decidedly neither continuous nor normally distributed. There is no guarantee that a team with a high return average will meet its potential, and no guarantee that a team won't uncork a TD when the rest of the real-life teams with similar averages mostly didn't.

Looking at 1975, we see that, other than Baltimore, the key team population at or over 17 yards a return is not carrying its weight in TDs, with four teams out of the top eight not getting any at all. That number should be one out of eight when considering NFL history, so this type of distribution will happen about one time in 80. Unlikely, but not impossible.
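A rough check on that order of magnitude: if each of the eight teams is treated as an independent trial with a 1-in-8 chance of a shutout (both are simplifying assumptions, and the function name is hypothetical), the binomial distribution gives the chance of four or more shutouts.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial upper tail: chance of k or more 'hits' in n independent trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_shutout = 1 / 8  # historical shutout rate for high-average teams, per the text
tail = prob_at_least(4, 8, p_shutout)  # about 0.011

print(f"P(4 or more of 8 shut out) = {tail:.4f}, about 1 in {1 / tail:.0f}")
```

This simple version comes out near 1 in 89, the same neighborhood as the figure quoted above. The result is quite sensitive to the shutout rate assumed, so the exact odds should not be read too finely.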

Figure 4: Team Level Data for 1975


Now we look at the middle season, 1971. Led by Houston, the top teams got 12 return TDs in 115 returns. Washington, which is in the next range subpopulation down, also chipped in a nice year:

Figure 5: Team Level Data for 1971 


And, finally, the high field, 1966.  Here the top teams are robust and are also supported by good seasons from Miami, Buffalo, and LA.  This outcome is also merely unlikely, but not impossible:

Figure 6: Team Level Data for 1966  


In summation, some corrections on team charts can be made to try to tame high observed interception return TDs. But there are limits on how far you can go, and on how much these adjustments will translate in seasons with higher return averages. Throughout NFL history, it can be shown that there is a strong correlation between team interception return average and the percent of these interceptions returned for touchdowns. And the goal of the feature is to closely match relative return yardages within a season, if fed the same number of interceptions per team.

It is also possible that the season itself is an outlier when its combined team population's results are compared to all teams in NFL history. One must consider what the predicted outcome is likely to be, not just a deterministic idea of what a statistic should be based only on a single value taken out of its context.

Note: the original questions on the Forum concerned 1977, which is why I mentioned it. But, unfortunately, I don’t own 1977 to test in detail, so I chose 1975 to illustrate the point. Seasons like 1975, 1977, 1979 and 1981 are all somewhat analogous, while the mid to late Sixties are also analogous on the high side.

Fred Bobberts

Original Publication Date: 2/4/2024
