Tuesday, November 29, 2011

What FIP Critics Do and Do Not Say

On Fangraphs today, Noah Issacs refuted an argument against FIP.  The argument that Issacs attempts to refute can be formalized as follows:

P1.  If there are players who consistently outperform their FIP, then FIP is a failed stat.
P2.  There are players who consistently outperform their FIP.
C.  Therefore, FIP is a failed stat.

This argument is certainly valid (modus ponens: if A, then B; A; thus B).  So, Issacs argues against its soundness by attacking P1.  He correctly notes that one should expect a certain number of players to consistently outperform their FIP, as there is a 50% chance that any given pitcher will outperform his FIP in any given year.  Hence, treating each year as independent, there is a 25% chance that any given pitcher will outperform his FIP for two consecutive years, a 12.5% chance he will do so for three consecutive years, and so on.

In other words, the presence of players who consistently outperform their FIP is to be expected even over a 5+ year sample size.  That is, FIP's standing as a perfect stat is consistent with the presence of statistical outliers.
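To put rough numbers on that claim, here is a minimal sketch in Python.  It assumes each season is an independent 50/50 event, and the pool of 100 pitchers is a hypothetical figure chosen purely for illustration; neither the function names nor the pool size come from Issacs' article.

```python
def streak_probability(years: int, p_beat: float = 0.5) -> float:
    """Chance that one pitcher beats his FIP in `years` consecutive seasons,
    assuming each season is an independent event with probability p_beat."""
    return p_beat ** years


def prob_at_least_one_streak(n_pitchers: int, years: int, p_beat: float = 0.5) -> float:
    """Chance that at least one of n_pitchers strings together such a streak.
    n_pitchers is a hypothetical pool size, not a figure from the article."""
    no_streak = 1.0 - streak_probability(years, p_beat)
    return 1.0 - no_streak ** n_pitchers


if __name__ == "__main__":
    for years in (2, 3, 5):
        single = streak_probability(years)
        pooled = prob_at_least_one_streak(100, years)
        print(f"{years} years: one pitcher {single:.3%}, at least one of 100 {pooled:.1%}")
```

Under these assumptions, a five-year streak is unlikely for any one pitcher but very likely to show up somewhere in a pool of that size.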

Consider the following example.  If there were 100 people flipping a coin, we would expect about 50 of those to flip heads.  Of those 50, we would expect 25 to flip heads the next time, 12-13 to flip heads three times in a row, about 6 to flip heads four times in a row, and about 3 to flip heads five times in a row.  This is all consistent with the fact that there is a 50% chance of flipping heads on any given flip.  With a large enough sample size, you can expect statistical extremes.
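The arithmetic in the coin example can be checked with a few lines of Python.  This is only a sketch of the expected counts, assuming a fair coin and independent flips; the function name is my own.

```python
def expected_streak_counts(n_flippers: int, max_flips: int, p_heads: float = 0.5) -> list[float]:
    """Expected number of flippers still on an unbroken run of heads
    after 1, 2, ..., max_flips flips, assuming independent flips."""
    return [n_flippers * p_heads ** k for k in range(1, max_flips + 1)]


if __name__ == "__main__":
    counts = expected_streak_counts(100, 5)
    for flips, count in enumerate(counts, start=1):
        print(f"{flips} heads in a row: about {count:g} of 100 flippers")
```

The expected counts halve at each step, which is where the "12-13", "about 6", and "about 3" figures come from.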

Issacs' argument works off the same basic reasoning pattern.  We should EXPECT outliers like Matt Cain who consistently outperform FIP, as our sample size is large enough to reasonably expect extremes.  So, P1 is false.  We cannot conclude that FIP is a failed stat just because there are players who consistently outperform their FIP.

The problem with Issacs' refutation, however, is that few critics of FIP make the claim found in P1.  That is, few critics of FIP believe that the mere existence of statistical outliers refutes FIP as an important stat.

Rather, critics of FIP try to offer an explanation for why certain pitchers outperform FIP beyond the expectation of statistical extremes in our sample.  For example, many FIP critics claim that pitchers like Tom Glavine are able to control their BABIP and/or HR/FB ratio through skill, something that FIP assumes is not possible (as FIP assumes BABIP is entirely a function of luck).  A FIP critic looks at Glavine, who, after his breakout year in 1991, outperformed his FIP in 16 out of 18 years, and offers an explanation for why this is so: Glavine could control his BABIP (career .280) and/or his HR/FB ratio.

A sample argument of this sort may look as follows:

P1.  If pitchers can control their BABIP in a meaningful way, then FIP is a failed stat.
P2.  Pitchers can control their BABIP in a meaningful way.
C.  Therefore, FIP is a failed stat.

This is the sort of argument FIP critics make.  So, this is the sort of argument that FIP defenders need to address.

Issacs' article, while certainly correct in its refutation, is merely attacking a straw man argument.
