Tuesday, August 21, 2012

Suspected Blogger on Blogger Plagiarism

This is a pretty simple and straightforward post, wherein I would like to get people's thoughts on whether they suspect this is plagiarism or merely coincidence.  I will give my conclusion at the end. (Seems like a good place for a conclusion, right?)

On August 6th, Russell Carleton wrote a piece for Baseball Prospectus detailing how a manager's job is much more than a mere mathematical undertaking.  Rather, managers are required to constantly deal with the ever-beloved human element of baseball.  One point Carleton makes regards how difficult late-inning relief decisions can be, especially when dealing with the psyches of various players.  Here is his passage in full:

"It's the ninth inning, and you're up by one. Your top two relievers are Smith and Jones, and both are fresh and available, which is great, because you're in the thick of a tight pennant race and need this game. Smith is generally better than Jones and usually gets the call here. But there's a complication today. Smith has a daughter who has a chronic medical issue. He's a private man and doesn't discuss this with the press, because he wants to keep his family out of the limelight. (Can you blame him?) He got some bad news about his daughter earlier and has been walking around with his head down all day. You've seen him like this before. He'll say he's okay, but he can't concentrate, and his performance suffers to the point where Jones would actually be the better pitcher tonight to nail down that lead."

On August 21st, Josh Worn wrote a piece for the Detroit Free Press (8/22 update: apparently the Free Press has deleted the column) detailing how Jim Leyland's job is much more difficult than the standard fan assumes it to be.  In that piece, Worn gives an example of how difficult a late-inning relief decision (this one between Joaquin Benoit and Jose Valverde) can be.  Here is his passage in full:

"It's the ninth inning, and you're up by one run. Your top two right-handed relievers are Benoit and Valverde, and both are fresh and available. Benoit is the eighth inning guy, and Valverde is the ninth inning guy. They’ve been in their roles all year. Benoit has been plagued by a case of “homeritis” since the All-Star break. You really don’t want to use him, you really don’t, but Valverde has a brother who has a chronic medical issue. (Please note that I have no idea if Valverde has a brother with a chronic medical condition. So, it wouldn’t be a good idea to spread a rumor like that.) He's very private and doesn't bring this up with the press because he wants to keep his family out of the news. He got some bad news about his brother the night before and has been walking around with his head down all day. You've seen him like this before. He'll say he's OK, but he can't concentrate, and he’ll go out and walk four batters and give up a couple doubles in the gap. Basically Benoit would actually be more likely to nail down that lead."

Let's look first at the exact word-for-word similarities.  Both articles feature the following exact phrases:
  • "fresh and available"
  • daughter/brother "who has a chronic medical" issue/condition
  • "He got some bad news about his" daughter/brother
  • "and has been walking around with his head down all day"
  • "You've seen him like this before.  He'll say he's [okay/OK], but he can't concentrate"
  • "to nail down that lead"
Next, let's look at nearly word-for-word similarities.  Both articles feature the following similar phrases:
  • a mention of it being the 9th inning with one team down one
  • a reference to the team's two top relievers, and the manager having to choose between them
  • a claim that the pitcher with the sick relative is a private person who would prefer to keep the issue out of the media, specifically for the sake of his family
It also must be noted that these similarities appear in the same order in both articles.  That is, there is no need to jump around to find the similarities.  They track in a corresponding, one-to-one relationship throughout both articles.

I brought this up to Worn, who made assurances that he had not read the article from Carleton.  He acknowledged that the two articles were certainly similar, but claimed that he did not plagiarize it in any way.

In philosophy, we use what is known as Inference to the Best Explanation in order to form beliefs in situations such as this one.  Basically, we look at the evidence in front of us and ask, "What's the best explanation of this evidence?"  In this way, we treat the belief formation process like a detective treats a crime scene investigation.

So, here's the evidence we have in front of us: two extremely similar passages, containing extremely similar words and phrases, used in the exact same order.

Given this evidence, I believe that the reasonable conclusion to draw is that plagiarism has occurred.  The "coincidence explanation," which posits that the similarities between the two posts are merely a series of coincidences, does not appear to be the best explanation.  To determine this, merely ask yourself, "Which is more likely?  (1) That these two pieces were written independently and coincidentally contained nine phrases/sentences, placed in the same order, attempting to make the same point?  Or (2) that plagiarism occurred?"

Now, perhaps some people know Worn personally and have extra evidence to add to their reasoning.  For example, if someone has known Worn for a long time, has great evidence in support of his integrity, and has a justified belief that he would never plagiarize, then these pieces of evidence would factor into their reasoning process.  Perhaps they would be strong enough to allow the "coincidence explanation" to overtake the "plagiarism explanation" in terms of the likelihood of each being true.

As it stands now, lacking that evidence, I believe that the best explanation is plagiarism.

Though I certainly hope I am wrong.

(8/23 update: Though the Detroit Free Press story has been deleted, a Google search of the URL shows that it certainly once existed: http://tinyurl.com/9cypshd)

Tuesday, August 14, 2012

The Easiest Decision the Tigers have in 2013

This is my first post in a while. I wish my comeback post were a revolutionary opus that would change the way humanfolk view the world.  Instead, I am arguing for something that should seem self-evident to a reasonable thinker (though hosts and callers on 97.1 The Ticket disagree). Simple language time: Jhonny Peralta's option should be picked up for 2013.

JHondo is owed $6MM with a 500k buyout, so the Tigers should consider the option a $5.5MM option.  Before we even consider what $5.5MM should buy you, let's consider what you can expect out of Peralta. JHondo is currently sitting at a 100 wRC+ (which is perfectly league average). His fWAR is tied for 7th among ALL MLB SS.  JHondo's career stats match what one would hope...100 wRC+ and 101 OPS+.  In an ideal scenario, JHondo is among the top 5-8 SS in the MLB.  At worst, he is an above average SS.

So, worst case scenario, would you pay $5.5MM for an above average SS? Of course you would!

Now, on the open market, what can you expect out of $5.5MM? Well, even if you buy only one win with those millions, you would still be considered a winner.  Peralta meanwhile is averaging almost 3 wins per year since becoming a Tiger.  At that rate, the $5.5MM option is even more of a no-brainer than previously expected.
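The arithmetic behind this is simple enough to sketch in a few lines of Python. The dollar figures come from this post; the 3-win figure rounds up "almost 3 wins per year," and the function name is just an illustrative choice.

```python
def net_option_cost(salary_mm, buyout_mm):
    """Marginal cost of exercising an option: the buyout is owed either way."""
    return salary_mm - buyout_mm


# Peralta's 2013 option: $6MM salary with a $0.5MM buyout.
cost = net_option_cost(6.0, 0.5)

# Assumption: roughly 3 wins per year, rounding up "almost 3".
assumed_wins = 3.0
cost_per_win = cost / assumed_wins

print(f"Net cost of the option: ${cost:.1f}MM")
print(f"Cost per win at {assumed_wins:.0f} wins: ${cost_per_win:.2f}MM")
```

Even if Peralta delivered only one win, the cost per win would simply equal the $5.5MM net cost of the option.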

There is no grand revelation at the end of this post. I wish there were. But, simply put, if you did not think Peralta's 2013 option should be picked up, I hope you do now.  If you still think Peralta SUCKS, you may have contracted stat STDs from an unreasonable person, causing you to think RBI, Runs, Wins, and RISP stats are of the utmost importance. If so, consult a doctor. Preferably a sabre-friendly one.

Tuesday, January 3, 2012

Why the Rangers Can’t Lowball Yu Darvish

Last night Kevin Goldstein briefly responded to several questions that basically asked, “Why don’t the Rangers just lowball Yu Darvish since they are the only MLB team with which he can negotiate?”  Goldstein’s succinct (and accurate) reply was that doing so would be “dumb.” 

There are many reasons doing so would be dumb.  For example, the Rangers would risk alienating a premier player and hurting their relationship with future Japanese imports (as well as American free agents).

One important reason that is often ignored is that Darvish was making significant money in Japan.  We in America tend to operate on this assumption that Japanese baseball players do not make much money, but this claim is certainly false as it pertains to Darvish.  He has earned the following salaries over the past five years (converted from Yen to USD using today’s exchange rates):

  • 2011: $6.4MM
  • 2010: $4.2MM
  • 2009: $3.5MM
  • 2008: $2.6MM
  • 2007: $0.92MM

Now, if Darvish and the Rangers cannot agree to a deal, he will become a free agent after the 2013 season.  Given his historical increases in salary, Darvish is likely to make about $8-9MM in 2012 and $10-12MM in 2013.  Using the low end of the estimates, he can expect to make $18MM over the next two years.  After that, he would become a true free agent.  There would be no posting fee given to the Nippon Ham Fighters, and he could negotiate with all 30 teams.  That’s quite an enviable spot for a 27-year-old to be in.

So, for those who believe that the Rangers should lowball Darvish, how low do you mean?  Darvish is two years away from a $100MM contract anyway.  In the meantime, he will make at least $18MM.  How low could a lowball offer go?  If the Rangers offered Darvish $80MM over five years, he would easily pass.  He would likely prefer to wait for a 5/100 deal covering 2014 and beyond (a fairly low estimate for Darvish on the open market).  Combined with his expected salary over the next two years, that would give him $78MM over the next five years, with two more guaranteed years worth another $40MM.
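For anyone who wants to check the math, here is a minimal Python sketch of the comparison. All inputs are the estimates used in this post (the $18MM low-end Japan salary, the hypothetical 5/100 free-agent deal, and the hypothetical $80MM lowball); the variable names are my own.

```python
# All figures in millions of dollars; every input is an estimate from this post.
japan_two_years = 18.0           # low-end salary estimate for 2012-2013
fa_years, fa_total = 5, 100.0    # assumed 5/100 free-agent deal starting in 2014
fa_per_year = fa_total / fa_years

# If he waits: two years in Japan, then the first three years of the new deal.
wait_first_five = japan_two_years + 3 * fa_per_year

# Guaranteed money still owed to him after those five years.
wait_remaining = (fa_years - 3) * fa_per_year

lowball_offer = 80.0             # hypothetical five-year lowball from the Rangers

print(f"Waiting, first five years: ${wait_first_five:.0f}MM")
print(f"Waiting, still guaranteed afterward: ${wait_remaining:.0f}MM")
print(f"Lowball, five years: ${lowball_offer:.0f}MM")
```

Under these assumptions the lowball pays slightly more over the first five years, but waiting leaves another two guaranteed years on the table, which is why the lowball looks easy to pass up.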

Simply put, Darvish has much more leverage than most assume, because he makes significant money in Japan.  Plus, he is two years away from true free agency where his agent can play large bids off of each other.  If the Rangers do not offer $100MM (and perhaps significantly more), Darvish has good reason to pass up the offer.

So, yes, Goldstein is right.  Attempting to lowball Darvish is definitely “dumb.”

Thursday, December 8, 2011

Will Jim Leyland Use Octavio Dotel Properly?

It looks like the Tigers are about to consummate (with full carnal knowledge) a deal with Octavio Dotel.  According to the fine work done by Jon Paul Morosi, Enrique Rojas, and Danny Knobler, it appears to be a one-year deal with a vesting option (most likely based on 2012 appearances) for a second year.

Since I am not yet sure of the price (or the years) involved in this deal, I am not ready to declare it a good/bad/neutral signing.  However, I am sure of this much:

Octavio Dotel should be a right-on-right guy exclusively.

In 2011, Dotel yielded a .198 OBP and a .211 SLG to righties.  Against lefties, he surrendered a .345 OBP and .500 SLG.  In 2010, Dotel handled righties to the tune of .245 OBP and .331 SLG, while lefties threw up a .331 OBP and .517 SLG.  In 2008, the splits were .297 OBP and .355 SLG vs righties, and .422 OBP and .577 SLG vs lefties.

So, in short, Doc Oc Dotel is a premier relief option vs righties and a downright liability vs lefties.  He profiles as a right-on-right guy who never has to face lefties.  Because of this, he will pair up nicely with Daniel Schlereth (the much-maligned Tigers reliever who is quite adept at shutting down lefties [seriously, check his splits]).

Most bullpens could not afford to have a guy whose only job was to get out righties in the 6th or 7th inning, but the Tigers are blessed to have a clear 8th inning and 9th inning guy (and by "blessed" I mean that they brutally overpaid for each of those spots).  Because of this, the Tigers can afford to dedicate a roster spot to Dotel.  Furthermore, the presence of Phil Coke and Al Alburquerque (aka 'Padre K'), who can each get out both lefties and righties, also allows for a Dotel-type.

Here’s hoping Jimmy Leyland understands Dotel’s strengths (and clear limitations), and uses him where he can excel – against right-handed hitters (remember him dominating righties throughout the 2011 playoffs? To quote my buddy: “Dotel made a mockery of Braun in the playoffs”), and almost never against lefties.

After seeing Jimmy misuse Schlereth and Pauley in 2011, however, I fear Jimmy will overuse Dotel, negating much of his value.  So, here’s an open request to Jim Leyland…


Wednesday, December 7, 2011

How are the Marlins Going to Pay for All These Players?

The Miami Marlins have now signed their third premier free agent in Mark Buehrle.  Along with inking Jose Reyes and Heath Bell, the Marlins have committed 191 million dollars to new free agents.   
This raises an important question:

How in the world do the Marlins intend to pay for all these guys?

Consider the Marlins’ payroll the last 12 years:
  • 2011: $57,695,000
  • 2010: $47,429,719
  • 2009: $36,834,000
  • 2008: $21,811,500
  • 2007: $30,507,000
  • 2006: $14,998,500
  • 2005: $60,408,834
  • 2004: $42,143,042
  • 2003: $45,050,000
  • 2002: $41,979,917
  • 2001: $35,762,500
  • 2000: $19,900,000
For 2012, the Marlins' payroll is already projected to be 80-85 million.  But even that is a bit misleading, as Jose Reyes’ deal is heavily backloaded.  His deal breaks down by year as follows:
  • 2012: 10 mill
  • 2013: 10 mill
  • 2014: 16 mill
  • 2015: 22 mill
  • 2016: 22 mill
  • 2017: 22 mill
  • 2018: 22 mill option or 4 mill buyout
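To put a number on how backloaded the deal is, here is a quick Python sketch using only the yearly figures listed above. It assumes the 2018 option is declined, so only the buyout is counted as guaranteed; the variable names are my own.

```python
# Reyes' salaries by year, in millions, as listed above.
reyes_salaries_mm = {2012: 10, 2013: 10, 2014: 16, 2015: 22, 2016: 22, 2017: 22}
buyout_2018_mm = 4  # assumption: the 2018 option is declined and the buyout paid

guaranteed_mm = sum(reyes_salaries_mm.values()) + buyout_2018_mm
first_two_mm = reyes_salaries_mm[2012] + reyes_salaries_mm[2013]
share_first_two = first_two_mm / guaranteed_mm

print(f"Guaranteed money: ${guaranteed_mm}MM")
print(f"Paid in 2012-2013: ${first_two_mm}MM ({share_first_two:.1%} of the total)")
```

In other words, less than a fifth of the guaranteed money comes due in the first two seasons.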
Given the many signings for a team that historically has rarely given out large deals, and given the backloaded nature of Reyes’ deal, it seems clear that the Marlins are expecting a massive revenue increase in the upcoming years.  This belief probably derives largely from the new stadium that the Marlins are opening in 2012.

But, the history of sports in Florida has shown that Florida residents (for a whole host of reasons) do not tend to offer much financial support for their professional sports teams, even when they are winning.

The Rays' struggles with attendance and fan support are widely recognized, and the Rays have been a great team for three years.

Furthermore, even when the Marlins won it all in 2003, they averaged fewer than 20,000 fans per game.  The following year?  More of the same.

So, the Marlins are expecting massive revenue growth, but it seems unlikely to occur.  Sounds like a repeat of 1996-1997, when, even after winning the World Series, the Marlins could not keep their team together.

If anyone believes that there is good reason to expect massive revenue growth, I would love to hear it.  Because as it stands now, I am not sure where that growth will come from, other than the vague hope that Miami fans (particularly Latin American ones) will start coming out in droves.   

Yet, there is little in Marlins history (or even Florida sports history in general) to support this hope.

Tuesday, December 6, 2011

Introducing a New Pitching Stat: ppERA (Part 1 of 5) - Motivation

(This is Part 1 of a five-part series.  Part 2 will be an explanation of ppERA.  Part 3 will feature examples.  Part 4 will discuss the benefits of ppERA.  And Part 5 will consider objections and offer replies.)

The flaws with traditional ways of measuring a pitcher’s performance (such as Wins, ERA, Saves, etc.) have been exposed through decades of sabermetric analysis.  In the place of these stats, sabermetricians offer stats such as the following:

  • FIP (Fielding Independent Pitching):  Focuses on that which a pitcher has most control of – home runs, strikeouts, and walks.  This removes that which a pitcher largely lacks control of – whether batted balls fall for hits or not
  • xFIP:  FIP with an adjustment to stabilize HR/FB rate, which studies show is not something that seems to be within a pitcher’s control
  • tERA (True ERA):  Basically, tERA calculates the runs a pitcher is expected to give up over the number of outs he is expected to get.  This is done by figuring out the run and out expectancy for each PA ending event (K, BB, HBP, HR, Line Drive, Outfield Fly Ball, Groundball, Infield Fly Ball).
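For readers who have not seen FIP spelled out, here is a minimal Python sketch of the commonly published form of the formula. The league constant varies by season, so the 3.10 default is only a placeholder assumption, and the stat line in the example is hypothetical rather than any real pitcher's.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Commonly published FIP formula.

    Only home runs, walks, hit batters, and strikeouts enter the numerator;
    balls in play are ignored. The constant is an assumed placeholder that
    puts FIP on an ERA-like scale and changes from season to season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, for illustration only.
example = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)
print(f"FIP for the hypothetical line: {example:.3f}")
```

Note that every input is a plate-appearance-ending event, which is the shared characteristic of these stats.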

There are, of course, many more advanced stats that attempt to evaluate pitcher performance.  Each stat, however, shares an important common characteristic – each one calculates only actions that end plate appearances.  Anything that happens before the final pitch of a plate appearance is ignored.

This is a strange result.

Justin Verlander threw 3,941 pitches last year.  2,485 of those pitches did NOT end a plate appearance.  That’s 63.1% of the pitches he threw.  Even for a stat like tERA, which accounts for all PA-ending events, 63.1% of the pitches Verlander threw were irrelevant for the evaluation of his performance.  And for a stat like FIP, even fewer of his pitches (8.48%) mattered.
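The percentages above can be reproduced with a few lines of Python. The pitch counts are the ones quoted in this post; the count of FIP-relevant pitches is not stated directly, so the sketch backs it out from the 8.48% figure, which makes it an implied number rather than a sourced one.

```python
total_pitches = 3941   # Verlander's pitches, as quoted above
non_pa_ending = 2485   # pitches that did not end a plate appearance

pa_ending = total_pitches - non_pa_ending
ignored_share = non_pa_ending / total_pitches

# Back out how many pitches FIP "sees" from the 8.48% figure quoted above.
fip_share = 0.0848
implied_fip_pitches = round(fip_share * total_pitches)

print(f"PA-ending pitches: {pa_ending}")
print(f"Share ignored by PA-level stats: {ignored_share:.1%}")
print(f"Pitches implied by the FIP share: {implied_fip_pitches}")
```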

Intuitively, it seems that, on average, a groundball in an 0-2 count will be more weakly hit than a groundball in a 2-0 count (and thus results in a hit less often).  If we ignore everything that happens before the groundball, we have no way of accounting for this.

Furthermore, sabermetricians correctly attempt to remove irrelevant context.  Just as a stat like ERA incorrectly rewards/punishes pitchers for Strand Rate and team defense, every advanced stat rewards/punishes pitchers for a strike or a ball happening to occur when there were already two strikes or three balls, respectively. 

In other words, sabermetricians view each pitcher-hitter confrontation as an isolated event, correctly ignoring irrelevant context such as whether there are men on base.  It is my contention that this ignoring should be extended even further.  Each PITCH should be treated as an isolated confrontation between pitcher and hitter, and count should be disregarded.  (I understand that this is a controversial claim.  I will consider objections and offer replies in Part 5 of this series.)

This is my motivation for the introduction of Pitch-by-Pitch ERA or ppERA, which is really just a variation of tERA.  ppERA, however, will count every pitch that leaves a pitcher’s hand. 

It will not be my contention that ppERA should be the only stat used for evaluating pitching performance.  Rather, I believe it should accompany the other (many) stats one considers when evaluating pitchers.

Tuesday, November 29, 2011

What FIP Critics Do and Do Not Say

On Fangraphs today, Noah Issacs refuted an argument against FIP.  The argument that Issacs attempts to refute can be formalized as follows:

P1.  If there are players who consistently outperform their FIP, then FIP is a failed stat.
P2.  There are players who consistently outperform their FIP.
C.  Therefore, FIP is a failed stat.

This argument is certainly valid (modus ponens...If A, then B; A; thus B).  So, Issacs argues against the soundness by attacking P1.  He correctly notes that one should expect a certain number of players to consistently outperform their FIP, as there is a 50% chance that any given pitcher will outperform his FIP in any given year.  Hence, there is a 25% chance that any given pitcher will outperform his FIP for two consecutive years, a 12.5% chance he will do so for three consecutive years, and so on.

In other words, the presence of players who consistently outperform their FIP is to be expected even over a 5+ year sample size.  That is, FIP's standing as a perfect stat is consistent with the presence of statistical outliers.

Consider the following example.  If there were 100 people flipping a coin, we would expect about 50 of those to flip heads.  Of those 50, we would expect 25 to flip heads the next time, 12-13 to flip heads three times in a row, about 6 to flip heads four times in a row, and about 3 to flip heads five times in a row.  This is all consistent with the fact that there is a 50% chance of flipping heads on any given flip.  With a large enough sample size, you can expect statistical extremes.
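The coin-flip numbers above follow from multiplying by one half at each step, which can be sketched in Python as follows. The function name is my own, and the 50% default is the same assumption used in the example.

```python
def expected_streakers(people, flips, p=0.5):
    """Expected number of people who flip heads on every one of `flips` tries."""
    return people * p ** flips


for flips in range(1, 6):
    print(f"{flips} heads in a row: about {expected_streakers(100, flips):g} of 100 people")
```

The same function applies to pitchers: swap in the number of pitchers for `people` and the number of seasons for `flips`.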

Issacs' argument works off the same basic reasoning pattern.  We should EXPECT outliers like Matt Cain who consistently outperform FIP, as our sample size is large enough to reasonably expect extremes.  So, P1 is false.  We can not conclude that FIP is a failed stat just because there are players who consistently outperform their FIP.

The problem with Issacs' refutation, however, is that few critics of FIP make the claim found in P1.  That is, few critics of FIP believe that the mere existence of statistical outliers refutes FIP as an important stat.

Rather, critics of FIP try to offer an explanation for why certain pitchers outperform FIP beyond the expectation of statistical extremes in our sample.  For example, many FIP critics claim that pitchers like Tom Glavine are able to control their BABIP and/or HR/FB ratio through skill, something that FIP assumes is not possible (as FIP assumes BABIP is fully the function of luck).  A FIP critic looks at Glavine, who, after his breakout year in 1991, outperformed his FIP 16 out of 18 years, and offers an explanation for why this is so: Glavine could control his BABIP (career .280) and/or his HR/FB ratio.
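To see why a critic might think Glavine goes beyond ordinary statistical extremes, here is a rough Python sketch. It borrows the coin-flip assumption from Issacs' argument (each season is an independent 50/50 shot at beating FIP), which is an assumption rather than an established fact, and the function name is my own.

```python
from math import comb


def prob_at_least(successes, trials, p=0.5):
    """Binomial probability of at least `successes` wins in `trials` independent tries."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Glavine: outperformed his FIP in 16 of 18 seasons.
p_glavine = prob_at_least(16, 18)
print(f"Chance of 16+ out of 18 on coin flips: {p_glavine:.5f}")
```

Under the coin-flip assumption this comes out to well under one in a thousand for any single pitcher. That does not settle the matter, since enough pitchers over enough years will produce some rare careers, but it shows the kind of evidence the FIP critic is pointing to.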

A sample argument of this sort may look as follows:

P1.  If pitchers can control their BABIP in a meaningful way, then FIP is a failed stat.
P2.  Pitchers can control their BABIP in a meaningful way.
C.  Therefore, FIP is a failed stat.

This is the sort of argument FIP critics make.  So, this is the sort of argument that FIP defenders need to address.

Issacs' article, while certainly correct in its refutation, is merely attacking a straw man argument.