Monday, May 20, 2019

Normalization Of Statis-Pro Baseball Seasons (Featuring Carded Teams from the 1972 World Series)


On Normalization of Statis-Pro Baseball Seasons

--Featuring the 1972 World Series teams, the Oakland A’s and the Cincinnati Reds

(This is either brilliant or the stupidest idea I’ve ever had with this game.  It took me only forty years to figure this out.  People will either demand more of this or never play it, and I have no idea which outcome will win.)

Typically Carded SPBB Versions, for Comparison:


And here are the normalized versions:


72 Cincinnati, Normalized

72 Oakland, Normalized



I first received a Statis-Pro Baseball season in early 1979, the 1978 carded season.  It was the Bronx-Zoo era Yankees and Red Sox, Jim Rice and Ron Guidry and Rod Carew and Jason Thompson – I was a Tiger fan.  The fringe players had the last Mark Fidrych card, a 2-8 wonder off limited data, and it was great fun pitching The Bird in front of Steve Kemp and Lou Whitaker and Alan Trammell.  I had a friend who loved the 1978 Astros and 1978 Dodgers – we practically wore those cards out.

The next baseball year featured the fantastic World Series between the 1979 Orioles, one of the most underrated teams of that era, and the “We Are Family” Stargell-led Pirates.  There was another great National League team, the Montreal Expos, and the Tigers had two great new pitchers, Jack Morris and Dan Petry. I looked forward to those 1979 cards.

The 1979 season on the surface was wonderful.  The teams were even more vibrant than the 1978 cards.  That was actually the problem.  The 1978 AL ERA was 3.76, but in 1979 the AL ERA was 4.22.  The National League had a smaller differential (3.56 to 3.73) but it was still clear that the season was made up of teams that were on the whole more robust offensively than the season before them.  Jim Rice in 1979 had a similar season IRL to his wondrous 1978 MVP year; yet he was only the fourth best hitter in the 1979 AL.  Fred Lynn, his teammate, had a better Statis-Pro card than 1978 Rice in every way. 

The game was designed around static offensive cards, with the pitchers designed to “average” allowing batting card results half the time.  Within the season pitchers were still rated against each other.  There was no normalization to surrounding seasons or even the other league.  The latter made some sense, since the offensive contexts were different between the AL and NL due to the DH.  But it was pretty clear to me that a 3.60 ERA in the 1978 AL was rated as a 2-6 pitcher, while the same ERA in 1979 was now a 2-7 pitcher.  This was a big deal, because the game had another flaw: pitchers with 2-6 and 2-5 cards were not properly rated for hits allowed, which required some manipulation in game results (the Shutout Good Stuff PB table) to partially correct.  Simply put, as carded, 1978 was not competitive with 1979.

These factors particularly vexed me because the only thing better than getting cards from Avalon Hill was making your own teams.  I bought the Sporting News Guide for 1968 (still have it) and rated my own teams using long division and index cards.  The 1968 Tigers were one of my favorite teams but I also carded several others.  Once again, the problem – 1968 pitchers, no matter how low their ERAs, would ‘average’ a PB of 2-6.5, 18/36 chances or 50%.  While their hits per inning were lower, a 2.87 ERA pitcher in 1968 would be a 2-6 but that same ERA would push a 2-8 rating in the 1979 AL.  Meanwhile the Tigers still had the batting cards of a team that hit .235.

This normalization error was, for the most part, not present in Strat-O-Matic, where the batting cards for a low run context would reflect the value of good hitters in that context, while pitchers would also perform as desired.  Within a season, in particular a World Series, Statis-Pro Baseball could give a gamer a great feel, but if you wanted to play across eras, only Strat-O-Matic could really meet the need.

The Legend of David C. LeSueur

There were enough problems with Statis-Pro Baseball that I left playing it for a while.  In the summer of 1981, the 1980 cardset came out, and while batters’ cards were still static – a .400 card in 1980 would look the same as one in 1968 – pitchers’ cards, in particular those of 2-6 and 2-5 pitchers, received a makeover.  A gamer by the name of David C. LeSueur published how to correct the game’s main flaw, the extra hits that lower tier pitchers had in early carded seasons but should not have received.  He did this by calculating the hits, walks, etc. of the batters they faced, adjusting for the different chances to arrive on those hitter cards based on the pitcher’s PB results.  It was flat-out brilliant writing, one of those articles that you have to read twice because the implications were huge.

What’s more, he published this in All-Star Replay, which meant if you received that issue you received the means to calculate for yourself the pitching fixes you needed to at least compute a season with internal consistency.  It was a true game changer: now a pitcher could be what I call a ‘good 2-6’, with low (h+w)/ip but a higher ERA, and he would not be rendered worthless by the game’s carding system.  We all owed a debt to him.   Later, when I found myself on a test team with him, I reached out to thank him for the changes he made that brought me back to looking at Statis-Pro Baseball.

(All Star Replay was a gem – in one issue came LeSueur’s work, 1978 Japanese Baseball cards, Secretariat and Man O’ War race cards for Win, Place and Show, a review of the year’s welterweights including Tommy Hearns, Pipino Cuevas, Roberto Duran, and Sugar Ray Leonard for Title Bout, the 1979-80 NBA semifinal teams for Basketball Strategy, and yet another solitaire system for Paydirt.  This was classic written 1970’s gaming content.)  

But one thing bothered me.  Mr. LeSueur’s system, as elegant and consistent as it was, still did not solve the problem of the static batters’ cards.  If there was one thing that Strat-O-Matic taught me, it was that a .350 batter’s card in the 1930 AL would look different than one in 1965.  I chewed on that for four decades.  Sometime in the late Eighties Avalon Hill published a Great Teams Set.  This had teams like the 1986 Mets and 1984 Tigers and 1962 Giants.  I looked at this set with keen interest, because the Game Co had re-carded some teams it had already done.  I wanted to see how this might work. Well, what they did was pick teams from very similar ERA contexts.  Then they normalized all the pitchers within the set against each other.  The batter’s cards were still very similar in technical construction, functionally the same as they had been before.  Instantly I knew that this was the wrong track.  If you merely made all pitchers below, say, a 2.30 ERA in 1968 into 2-9s so they could play steroid era teams loaded with 2-5 and 2-6 pitchers, you might get some results which would match up correctly, but any internal consistency and balance within a season would be lost.

Those 1968 batters’ cards still need to face those 1968 pitchers, and they would now fare even more poorly, while 1997 hitters would crush their lower PB same season pitching cousins.  The historic problem with static batter’s cards would just move to the pitcher’s cards.   Pitching seasons would be stronger with this model, instead of strong offensive seasons having the edge.  I knew the answer was in remaking the batter’s cards but I wasn’t sure how.  Somehow hits from the pitchers’ cards needed to move to the batters in 1968, but from the batters to the pitchers in, say, 2001.  I thought briefly about using Log5 to establish a ‘baseline’ offensive season and then remaking the cards for each season to conform to that model. But that still left the pitching problem unsolved.

Then one morning I woke up, after a night of thinking about the 1972 World Series, with most of the answer.  You would not arbitrarily just make a certain ERA a PB across all seasons.  Instead what you would look at is standardizing a set of ratios of (season ERA × 100 / individual pitcher ERA).  Typically a ratio of 180 or 190 would be a 2-9 in most seasons, 135-140 a 2-8, 107 or so a 2-7, and 80 a 2-6. A pitcher with an ERA of 1.96 in 1968 (ratio 152) is a 2-8 for this reason, while a 2.20 or 2.30 ERA in a modern high ERA season might be a 2-9.  What you want to find out is this: if a pitcher has an ERA of 1.96 in the 1968 AL, how should his ratio compare versus the entire AL from, say, 1960-2018 (ERA 4.08)?
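As a rough sketch of that ratio in code: the function names are mine, and the cutoff table simply treats the "typical" figures quoted above as floors, which is an assumption (real seasons need their own cutoffs, and the 2-5 fallback is a guess at what sits below the 2-6 line).

```python
def era_ratio(season_era: float, pitcher_era: float) -> float:
    """Season ERA * 100 / pitcher ERA; a higher ratio means a better pitcher."""
    return season_era * 100.0 / pitcher_era


# ASSUMED floors for each PB rating, borrowed from the "typical" ratios in the
# text. They are illustrative only, not an official carding table.
ASSUMED_CUTOFFS = [(180.0, "2-9"), (135.0, "2-8"), (107.0, "2-7"), (80.0, "2-6")]


def pb_rating(ratio: float) -> str:
    """Return the first rating whose assumed floor the ratio clears."""
    for floor, rating in ASSUMED_CUTOFFS:
        if ratio >= floor:
            return rating
    return "2-5"  # assumption: everything under the 2-6 floor


# The 1.96 ERA pitcher in the 1968 AL (league ERA 2.98) from the text.
ratio_1968 = era_ratio(2.98, 1.96)
print(round(ratio_1968), pb_rating(ratio_1968))
```

With the 1968 figures this gives a ratio of about 152 and a 2-8, matching the example in the paragraph above.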

I’m going to adapt Log5:

$$p_{AB} = \frac{p_A - p_A \times p_B}{p_A + p_B - 2 \times p_A \times p_B}$$

Here pA = 2.98, the ERA for the 1968 AL, and pB = 4.08, the average ERA for all seasons.  The numerator is -9.1784 and the denominator -17.2568, so pAB is 0.531872.  The average season should yield fifty percent results on batter’s and pitcher’s cards, so dividing pAB by 0.5 gives a dimensionless correction factor, 1.0637.  A 1968 pitcher should be 1.0637 times better than the norm, 1.0.   But what does this mean?  You don’t just divide his ERA by 1.0637.  You must look at those seasonERA/pitcherERA cutoff ratios.
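For anyone who wants to check the arithmetic, here is a minimal sketch. It plugs the two ERAs straight into the Log5-style expression, as I did above (Log5 is normally fed probabilities, so this is an adaptation), and the variable names are just for illustration.

```python
def log5(p_a: float, p_b: float) -> float:
    """The Log5-style expression from the text, with raw ERAs as inputs."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2.0 * p_a * p_b)


SEASON_ERA = 2.98    # 1968 AL, from the text
BASELINE_ERA = 4.08  # AL average for 1960-2018, from the text

p_ab = log5(SEASON_ERA, BASELINE_ERA)

# 0.5 is the share of results an average season puts on each type of card.
correction = p_ab / 0.5

print(f"{correction:.4f}")  # prints 1.0637
```

Swap in another season's ERA for `SEASON_ERA` to get that season's factor against the same baseline.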

So now what you can do is find the cutoff ratios for 1968 (the last pitcher’s ratio in each bucket) and divide each by 1.0637; the result is the new ratio for that rating.    For instance, Stan Bahnsen’s ratio of 145.4, the lowest among the 1968 2-8s, is corrected (divided by 1.0637) to a new ratio of 136.7, which upon comparison admits four new pitchers to the ranks of the 2-8s. And so it goes, until you have calculated the full season. You may need to adjust the ratios a bit for the teams to work within the season, but in the end, the 1968 AL will feature two more 2-6 pitchers, nine more 2-7 pitchers and six more 2-8 pitchers out of 112 total pitchers.  The 2-5 range now starts at an ERA of 3.86 rather than 3.69.  50 percent of all pitchers are now 2-7s or better in this model. We are almost home.
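The cutoff step can be sketched the same way. The helper names and the list of pitcher ratios below are hypothetical, made up purely to show the mechanics; only the 1.0637 factor and Bahnsen's 145.4 come from the text.

```python
CORRECTION = 1.0637  # 1968 AL correction factor from the text


def corrected_cutoff(raw_cutoff: float, correction: float = CORRECTION) -> float:
    """Lower a rating's cutoff ratio by the season's correction factor."""
    return raw_cutoff / correction


def newly_admitted(ratios, raw_cutoff, correction=CORRECTION):
    """Ratios that missed the old cutoff but clear the corrected one."""
    new_cutoff = corrected_cutoff(raw_cutoff, correction)
    return [r for r in ratios if new_cutoff <= r < raw_cutoff]


OLD_2_8_CUTOFF = 145.4  # Bahnsen's ratio, the last 2-8 before correction

# HYPOTHETICAL ratios for pitchers sitting just under the old 2-8 line.
sample_ratios = [144.0, 141.2, 139.5, 137.0, 134.8, 130.1]

print(round(corrected_cutoff(OLD_2_8_CUTOFF), 1))
print(newly_admitted(sample_ratios, OLD_2_8_CUTOFF))
```

Run it once per rating bucket with that bucket's old cutoff, then hand-tune as needed so the teams still work within the season.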


The average 1968 AL pitcher is now much closer to a 2-7 than if his card were exactly midway between 2-6 and 2-7.  He will have results on his card about 55.4% of the time, which means all the 1968 batter hits, especially the extra base hits, now need to be bumped up on the batter’s cards by the factor 0.5/(1-0.554), or 1.121.  Batter’s cards only see results 44.6% of the time, so the hitters must have more singles, doubles, and homeruns than before.  The pitchers now contribute their 11 strikeouts 55.4 percent of the time, so the amount on the batter’s cards needs to reflect this.  Same with walks:

Result    1968 AL STD    1968 AL NEW
1B        8.79           9.86
2B        4.10           4.60
3B        0.74           0.83
HR        2.42           2.72
K         10.10          10.0
W         2.52           1.98
HBP       0.93           1.05
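Here is a small sketch of the hit bump behind that table. It only scales the four hit types; strikeouts and walks are handled differently, as described above, and are not modeled here. The function and variable names are illustrative.

```python
PITCHER_SHARE = 0.554  # share of results read off the pitcher's card (from the text)


def batter_bump(pitcher_share: float) -> float:
    """Factor that restores batter-card hits when batters see fewer results."""
    return 0.5 / (1.0 - pitcher_share)


# Standard 1968 AL average hit numbers from the table above.
std_hits = {"1B": 8.79, "2B": 4.10, "3B": 0.74, "HR": 2.42}

factor = batter_bump(PITCHER_SHARE)
new_hits = {kind: value * factor for kind, value in std_hits.items()}

print(round(factor, 3))
print(round(sum(std_hits.values()), 2), round(sum(new_hits.values()), 2))
```

The totals come out to roughly 16.05 hit numbers before and 17.99 after. The individual cells in the table were rounded by hand, so they can differ from the straight multiplication by a hundredth or so.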

This doesn’t look like much of a change, but this bump adds an extra 13 points of BA, 12 points of OBP, and 30 points of SLG.   1968 is the worst case scenario among teams from the last sixty years.  Most adjustments are much more subtle, more like rounding up or down here or there.  The 1972 AL happens to be the second worst season for offense in the last sixty years.  The 1972 NL is still a strong pitching season, but not as strong comparatively.

Those are the adjustments for an average 1968 AL hitter.  But that average includes pitchers.  The better the hitter you are, the more this adjustment can help you, which matters if you are, say, Willie Horton.  After correction his homerun numbers rightfully rise to nine, with a carded .307 batting average – Horton was the fourth best hitter in the league among qualifiers at .285.  No longer does the card just reflect hits/TOT.  It now reflects how he hit in his offensive context.  Hitting 36 HR in 1968, he would hit 41 in a normalized season in 143 games, and 43 against (as an example) 2006 AL pitching in 140 games.

You can test this, too. Let’s take a cross section of 1968 Tigers.  If we look at their OPS+ from 1968, and then compare what their carded OPS is in terms of OPS+ for seasons that are more normal, like 1977, 1982 and 2011, we should see a similar projected OPS+.  And we do.  Horton would probably project to a 155 or so in a modern season, Northrup 130, Cash maybe a tad low at 127, and Kaline a similar outcome.   None of that troubles me too much, because these results have to be ‘rounded’ into 1/64 values on the batter’s cards.  Some players will round higher, as Matchick does.  The point is that Jim Northrup’s projection from a .264 hitter in a year where the AL batted .230 to a .295 hitter in a more modern AL is perfectly reasonable.  Teams in 1977 (.266), 1982 (.264), and 2011 (.258) hit from 28 to 36 points higher than 1968, and Jim Northrup was 16th among qualifiers in batting average in 1968.  The equivalent hitters in 1977, 1982, and 2011 were Larry Hisle (.302), Fred Lynn (.299), and Derek Jeter (.297) respectively. (These are larger seasons with DHs but you get the rough idea.)


 
(It always bugged me that the 2006 Detroit team, carded with Marcus Thames and his nine home run numbers in the static system, was a better slugging team on their cards than the 1968 Tigers.  That result is ludicrous.  1968 Detroit hit 185 homeruns in a season where the team average was 110; they out-homered the second best team by fifty-two homeruns. That was, at the time, a top twenty team homerun hitting result in baseball history: 185 home runs in the Year of the Pitcher.  The 2006 team had 203 home runs in a season where the average team hit 182. Their correction, 0.9471, reflects the fact that the batters now get their results 52.8 pct of the time. Once you factor in both corrections the 1968 team pulls ahead on pure power, as it should.)

The last piece is that those extra hits on the hitters’ cards (17.99, up from 16.05) now have to come from the pitcher’s cards, which they will through the standard LeSueur handling.  In this case, this handling will remove about two hits from the average pitcher.  Not all pitchers are affected the same way – Denny McLain’s card is exactly the same in both models, while lesser pitchers, such as Joe Sparma, will lose a few hits to account for the fact that the average batting card slugs higher. This is as it should be: 1968 pitchers are not ranked against the rest of AL history as an absolute; their PB, or ability to muzzle extra base hits, is ranked on a sliding scale. The 1968 AL pitchers are on balance the best and most extreme example, but a 1987 AL pitcher with a 3.70 ERA will still be a 2-7 based on his relative ratio within his season.  The 1968 pitcher with a 3.70 ERA is a 2-6, and will consequently face more batter results, either when facing his own season or any other within the rating pool.  His card must be adjusted to account for this.

This is probably tough for some readers to fully grasp – I’ll have to do more explaining. But good cards are worth a dozen paragraphs, and so I’ll give you some examples: the normalized cardings for the 1972 World Series.  Both teams, of course, are pre-DH teams, Oakland from a year with a 3.06 ERA, Cincinnati from a year with a 3.45 ERA.  Carded in static fashion Cincinnati would have equivalent pitching and much better hitting, which means they would be a much better team, which was certainly not the case.  Adjusted for their contexts in this new system Cincinnati is a good pitching team with a slightly corrected lineup (factor 1.063), while Oakland has a slightly better staff with a lineup that benefits from even stronger correction (factor 1.108).  Now the Series is a dead heat, with the top of Cincinnati’s mighty lineup squaring off against Oakland’s depth.  The natural use case for normalized teams is exactly this type of matchup, a series across leagues or eras.

Even within a season, the new model teams should play each other well.  While it’s true 1968 hitters will be stronger within their season, so are the pitchers they face.  The model introduces an interesting type of pitcher to the usual mix, the brilliant 2-6: hurlers like Mickey Lolich who have good enough stuff to prevent batters from getting on base better than most 2-7 pitchers, but have trouble on occasion handling extra base hits.  Lolich allowed 23 homeruns in 230 IP.  A similar NL pitcher would be a guy like the Cubs’ Bill Hands.

One last note – while I think this normalization approach makes Statis-Pro Baseball a better game than it was forty years ago in 1979, and it’s a good extension of David C. LeSueur’s model, I’m not going to claim it can match Strat-O-Matic yet.  No way.  For one thing, I still think most people like the game for within league and season replays, and my guess is the extra work needed to make these cards for that type of replay is too much effort for most gamers.  The real value of such a normalization method only becomes evident once you have many seasons created using this method that can be interplayed.  That takes a lot of effort and time to make work. It's either brilliant or it's crap, and I'll let you be the judge.

Secondly, there are some things about Statis-Pro’s pitching feel and ease of play that appeal to me – but Hal, Steve, et al. research righty/lefty matchups, ballpark stats, and complex fielding, items probably beyond the scope of a simpler game like SPBB.  The fielding model is better in SOMBB.  The guys at Strat-O-Matic can flat out make some fantastic baseball seasons, and that’s a great game.

PS- 1972 is the next full Statis-Pro Baseball season.