Let's say that we all agree to watch the same 3 games one weekend. Our job will be to keep track of all BIP, and, by IF/OF, say "was that a routine play or not".

Routine would mean that even putting Manny Ramirez ( or Derek Jeter!!.. just kidding you thin-skinned Jeter lovers... ) at SS or CF would make that play.


Afterward, we compile our numbers. Let's say that 50% of all balls in play are routine. This will form our basis.

Then, you go to MGL and David Pinto, and say: "hey, I tracked all these balls, and half of them were routine... does your system consider them easy outs?".

MGL or David might respond that: "No... in the case of IF, I can't get the out conversion rate higher than 92% on those specific balls". This to me is a large margin of error. What happens here is that those 50% of the balls are the noise. They tell you absolutely nothing about the fielder, and should be thrown out. This is one reason the year-to-year r is not higher for fielding metrics.

On the other hand, if MGL or David responds: "Yes... because there are other things that I know about those plays, the hand of the batter, the tendency of the pitcher, the hardness of the ball hit, I was able to classify most of those easy outs as easy outs... my average out conversion rate on those balls is 96%", then I'm ecstatic! This tells me that, at least on the "noise balls", we can feel comforted that all this noise is not getting in the way of other signals.

This doesn't tell us about our ability to properly evaluate the signals, but at least we're one step closer.
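
To make the check concrete, here is a minimal sketch of the comparison: take the balls a human tagged as routine and average the out probability the fielding metric assigned to them. The field names (`routine`, `metric_out_prob`) and the four sample plays are made up for illustration; no real system's output is shown.

```python
def mean_expected_out_rate(plays):
    """Average the metric's out probability over balls tagged as routine."""
    routine = [p["metric_out_prob"] for p in plays if p["routine"]]
    if not routine:
        raise ValueError("no routine plays were tagged")
    return sum(routine) / len(routine)


# Hypothetical sample: three hand-tagged routine balls and one tough play.
plays = [
    {"routine": True, "metric_out_prob": 0.97},
    {"routine": True, "metric_out_prob": 0.95},
    {"routine": False, "metric_out_prob": 0.40},
    {"routine": True, "metric_out_prob": 0.96},
]

rate = mean_expected_out_rate(plays)
print(f"{rate:.2f}")  # 0.96 for this made-up sample
```

A result near .96 would be the "ecstatic" case; something closer to .92 would be the large-margin-of-error case.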

So, who's up for a little work opening weekend?
I can't guarantee my availability, but I'm excited to see what this project turns up. You're absolutely right, Tango, that weeding out routine plays is essential to a meaningful fielding metric. It would also, I think, adjust for the tendencies of a player's pitching staff and playing surface better than simply factoring in (for example) ground ball percentage.

Great idea!
_tangotiger - Thursday, February 03 2005 @ 10:34 AM EST (#1790) #
Two points to add:

1 - Once your sample size is large enough, say 2000 balls in play (3 years), this doesn't really become an issue. Much anyway. At those levels, pretty much every fielder will have "around" the same number of routine plays, and so the noise evens out, percentage-wise.

While say with 200 BIP, the % of routine plays might be 30 to 70% for any player (though the fielding metrics will assume that they are between 45 and 55%), at 2000 BIP, the % might come in at 48 to 52%, while the fielding metrics think they are all 50%.

(All numbers for illustration only).
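
As a rough sketch of why the spread narrows, here is the standard deviation of the observed routine share under a plain binomial model. That model is my assumption: it treats every ball as an independent 50/50 draw, which ignores pitching staff and park, so the real spread may well be wider than this.

```python
import math


def routine_share_sd(n_bip, p_routine=0.5):
    """Binomial standard deviation of the observed share of routine plays."""
    return math.sqrt(p_routine * (1 - p_routine) / n_bip)


for n in (200, 2000):
    print(n, f"{routine_share_sd(n):.3f}")
# 200 0.035
# 2000 0.011
```

Whatever the true spread is at 200 BIP, ten times the sample shrinks it by a factor of about 3.2 (the square root of 10).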

So, this "problem" is not necessarily a problem when you have a lot of years at hand. But, that means you have to wait a long time to get confirmation of what the metric is trying to tell you.

2 - Any fielding metric (ZR, UZR, Pinto, whatever), pretty much has a range of +/- .05 outs per BIP between the best and worst fielders at any position. So, if a metric is classifying all routine plays as "92% converted into outs", when it should really be 99%, well, you see the problem, right? You're giving the guys with the routine plays an extra +.07 out credit that they don't deserve.

If all fielders are impacted equally, then this is not an issue. However, over a few months or even a year, the percentage of routine plays would not even out, and so, you've got a margin of error here.
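
A quick sketch of the arithmetic, using the 92% and 99% figures above. The 60% and 40% routine shares for the two fielders are hypothetical, picked only to show the size of the effect.

```python
def unearned_outs_per_bip(routine_share, assumed_rate=0.92, actual_rate=0.99):
    """Extra out credit per ball in play from under-rating routine plays."""
    return routine_share * (actual_rate - assumed_rate)


# Two hypothetical fielders over a short stretch of games.
lucky = unearned_outs_per_bip(0.60)    # saw more routine balls than average
unlucky = unearned_outs_per_bip(0.40)  # saw fewer

gap = lucky - unlucky
print(f"{gap:.3f}")  # 0.014 outs per BIP
```

A gap of .014 outs per BIP is not small next to a total spread of +/- .05 between the best and worst fielders.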
_tangotiger - Thursday, February 03 2005 @ 10:55 AM EST (#1791) #
Might as well add another point:

When you look at year-to-year correlations, for UZR, the r is around .50, when the BIP is about 400.

This means that if you have Cameron with 400 BIP in CF (about 100 games), and his UZR is +20 runs, we regress that 50% to get his true talent level at +10 runs (with an uncertainty level of ... I dunno... something like 1 sd = 3 runs).

However, say you have a system that throws out all the routine plays (UZR already does this with infield pop ups, by the way). Now, Cameron has 200 nonroutine plays in 100 games. I would suspect that the year-to-year r might shoot up to .6 at this point. His UZR will still come in at +20 runs (meaning that UZR, in this example, properly evaluated all the routine plays, such that Cameron was not affected).

But, look what happens here... the r is .6 instead of .5. This means we regress Cameron's performance by 40% instead of 50%. Now, his true talent estimate is +12 runs. Furthermore, the uncertainty of that estimate will go down to ... I dunno... 1 sd = 2 runs.

So, this is the real power here.

And, imagine other players who had poorly evaluated routine plays. If Cameron wasn't given enough credit on the routine plays, his UZR might come in at +25 runs instead. Regressing 40%, and now his true talent UZR is +15 runs.
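
The three Cameron cases can be reproduced with a one-line regression toward the mean. This is a sketch of the shortcut used in the comment (keep a fraction r of the observed value, with a league mean of zero runs), not a full true-talent model; the function name is mine.

```python
def regress_to_mean(observed_runs, r, league_mean=0.0):
    """Keep a fraction r of the observed deviation from the league mean."""
    return league_mean + r * (observed_runs - league_mean)


print(f"{regress_to_mean(20, 0.5):.1f}")  # 10.0  all 400 BIP, r = .5
print(f"{regress_to_mean(20, 0.6):.1f}")  # 12.0  routine plays removed, r = .6
print(f"{regress_to_mean(25, 0.6):.1f}")  # 15.0  routine plays also re-credited
```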


However, it is hard to see how poorly classified balls will change a player's UZR by more than 5 runs.... but, I'd love to be surprised here. In any case, this process will improve the correlation, and reduce the uncertainty level.
Mike Green - Thursday, February 03 2005 @ 11:30 AM EST (#1792) #
That's a fine idea, Tango. It would be great, at the same time, if someone took the objective data, for comparison purposes. The objective data being:

1. hang time and zone, as described in Robert's study in THT, for pop-ups and fly balls,
2. zone and some kind of time measurement (from time of contact to crossing the baseline perhaps?) for ground balls.

For myself, I don't have a concern that difficulties with UZR, at least for shortstop ratings, result from noise. The "noise" groundballs are the ones right at the shortstop, which are turned into outs 90-95% of the time. I am quite sure that UZR, and Pinto, get these right.

The hardest case is the groundball in the 5-6 hole. Measuring opportunity and conversion rate is not obvious at all, as positioning of the third baseman and park factors play a significant role. UZR, from my understanding, looks at all balls in the zone and how many balls in the zone the shortstop converts to outs. In the Larkin study, we considered only balls in the zone that the third baseman did not get as opportunities for the shortstop.

There are, in my view, problems with either approach. UZR in evaluating shortstops likely does not account for third basemen who get a disproportionate (higher or lower) number of ground balls in the 5-6 zone (due to their superior or inferior range or due to their non-standard positioning). UZR tends to inflate the shortstop's number if the third baseman fields a relatively low number of balls in the 5-6 hole, and to deflate it if the third baseman fields a relatively high number of balls in the 5-6 hole.

By contrast, the pbp method we used tends to inflate the shortstop's evaluation when the third baseman fields a relatively high number of balls in the 5-6 hole, and lower it when the third baseman fields a relatively low number of balls in the 5-6 hole.
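
Here is a toy illustration of how the two denominators pull in opposite directions. The counts are hypothetical, and both formulas are simplified readings of the descriptions above (all balls in the zone versus only balls the third baseman did not get), not the actual UZR or Larkin-study calculations. I assume the same shortstop loses a few balls to a rangier third baseman.

```python
def zone_rate(ss_outs, balls_in_zone):
    """Simplified all-balls-in-zone approach: every ball is an opportunity."""
    return ss_outs / balls_in_zone


def pbp_rate(ss_outs, balls_in_zone, third_base_outs):
    """Simplified pbp approach: balls the 3B fielded are not SS opportunities."""
    return ss_outs / (balls_in_zone - third_base_outs)


# Hypothetical: 100 balls in the 5-6 hole, same shortstop, two third basemen.
cases = {
    "3B with little range": {"third_base_outs": 20, "ss_outs": 40},
    "3B with lots of range": {"third_base_outs": 40, "ss_outs": 35},
}

for label, c in cases.items():
    z = zone_rate(c["ss_outs"], 100)
    p = pbp_rate(c["ss_outs"], 100, c["third_base_outs"])
    print(f"{label}: zone {z:.3f}, pbp {p:.3f}")
# 3B with little range: zone 0.400, pbp 0.500
# 3B with lots of range: zone 0.350, pbp 0.583
```

With these made-up counts, the shortstop looks better by the zone approach next to the weaker third baseman and better by the pbp approach next to the stronger one.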

It is probably true that a more accurate assessment can be made using a compromise between the UZR approach and the one we used. It would require a careful analysis of third base conversion rates down the line and in the hole. It is difficult to do this using retrosheet data alone.

So, for those of us who are interested in the objective side of this question, it seems to me that the challenge is to define what is routine for a third baseman. I expect that this will require long hours in front of video, or at the park, with a stopwatch, measuring time from contact to the 2nd-3rd baseline, plus 3 zones (at normal 3rd base positioning, left and right).

I hope to do some of this, but it is a long-term project.
_David Pinto - Thursday, February 03 2005 @ 11:39 AM EST (#1793) #
STATS, Inc. has kept this information for years. Every ball is rated 1-4 on how difficult a play it was; 1 means anyone should have fielded it cleanly, 4 means it's a fantastic play. Unfortunately, I don't have access to STATS data.
_tangotiger - Thursday, February 03 2005 @ 11:43 AM EST (#1794) #
No question that hang time is the one thing that should be measured, at once. Contact to baseline (or fielder touching). I'd also count the number of times it hits the ground before getting to the baseline or fielder.

And, as usual, mark if the fielder backhanded the play, charged the play, double-pumped, ball slips from his hand, throw in the dirt, pulled the 1B off the bag, etc.... all that stuff.

I can't reiterate enough (maybe I can actually) that the job of the data recorder is to make a record of his observations with as much accuracy and consistency as possible.

This will make the job of the data analyst far easier.
_tangotiger - Thursday, February 03 2005 @ 11:46 AM EST (#1795) #

That's comforting to hear! MGL's been getting STATS data for several years, so I wonder why he's not including it. Maybe he can shed some light if he stops by...
_studes - Thursday, February 03 2005 @ 03:18 PM EST (#1796) #
Tango, count me in.
_Jeff K. - Thursday, February 03 2005 @ 03:29 PM EST (#1797) #
I'll certainly do it, but someone has to post a reminder, or I'm likely to completely forget about it in my excitement over opening weekend.
_tangotiger - Thursday, February 03 2005 @ 03:44 PM EST (#1798) #
I envision a very simple user interface. Something like this:

Drop downs for:
team, inning, home/away, batting spot (1 to 9), degreeOfDifficulty(1 to 5), batterResult (out, reach on error, reach on force, reach on hit)

You click submit, and it gets logged on my site (and I'll timestamp it).

I won't do any error checking, and I'll:
- report on the results
- release the logs every week


I'm thinking of "degree of difficulty" as
1: routine play, even for Manny (out 95%+ of the time)
2: requires some effort to get the out (out 75% of the time)
3: 50/50 play
4: requires a lot of effort to get the out (out 25% of the time)
5: impossible play, even for Ozzie (out less than 5% of the time)
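
As a sketch of what one logged entry might look like, here is a small record type built from the dropdowns and the 1 to 5 scale above. The class and field names are mine, and the checks simply mirror the dropdown limits. Levels 1 and 5 are given as "95%+" and "less than 5%", so the .95 and .05 used here are stand-in assumptions, and "TOR" is just a placeholder team code.

```python
from dataclasses import dataclass

# Nominal out rate for each degree of difficulty (1 and 5 are stand-ins).
NOMINAL_OUT_RATE = {1: 0.95, 2: 0.75, 3: 0.50, 4: 0.25, 5: 0.05}
BATTER_RESULTS = ("out", "reach on error", "reach on force", "reach on hit")


@dataclass(frozen=True)
class BallInPlay:
    team: str
    inning: int
    home: bool
    batting_spot: int          # 1 to 9
    degree_of_difficulty: int  # 1 to 5
    batter_result: str         # one of BATTER_RESULTS

    def __post_init__(self):
        if not 1 <= self.batting_spot <= 9:
            raise ValueError("batting_spot must be 1 to 9")
        if self.degree_of_difficulty not in NOMINAL_OUT_RATE:
            raise ValueError("degree_of_difficulty must be 1 to 5")
        if self.batter_result not in BATTER_RESULTS:
            raise ValueError("unknown batter_result")


def expected_outs(plays):
    """Sum the nominal out rates implied by the difficulty ratings."""
    return sum(NOMINAL_OUT_RATE[p.degree_of_difficulty] for p in plays)


def recorded_outs(plays):
    """Count batter outs only; how to score 'reach on force' is left open."""
    return sum(1 for p in plays if p.batter_result == "out")


log = [
    BallInPlay("TOR", 1, True, 3, 1, "out"),
    BallInPlay("TOR", 1, True, 4, 3, "reach on hit"),
    BallInPlay("TOR", 2, True, 7, 4, "out"),
]
print(f"{expected_outs(log):.2f}", recorded_outs(log))  # 1.70 2
```

Comparing recorded outs to expected outs over many entries would show whether the observers' ratings line up with the out rates attached to each level.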


Now, can we do more? Yes, but that's going to be a lot of work. I would prefer to link up the results from here to a PBP outfit. If I can get a limited dump from BIS, or whoever, then I can link fielders, zones, etc to the "degree of difficulty", and get a far better picture.

If I can't do that, then I'll have to include more dropdowns for fielding position, zone, ground/fly, hang time, runners on base, number of outs.

I mean, there's really no limit to the interface, except:
- the more fields to enter, the fewer people will enter data, and the more validation I'd have to do

I'm not too interested in becoming a PBP source.

_MGL - Thursday, February 03 2005 @ 04:30 PM EST (#1799) #
I have been getting STATS PBP data (not all of it) for years. This year, I have access to the raw data (through my MLB team). I don't have any "degree of difficulty" data, only "speed of batted ball (1-3)". I can find out though...
_Jabonoso - Thursday, February 03 2005 @ 05:13 PM EST (#1800) #
This year's Caribbean Series Hall of Fame inductees were: Willie Mays, Minnie Miñoso, Rod Carew and Juan Navarrete. In the film about Carew, his second base fielding was amazing, with range bar none. His reputation and numbers in MLB say that, as a second baseman, he fielded like a passable first baseman (no range, not an overall solid contribution).
I mention this just to point out that fielding metrics archaeology is very much a metaphysical affair.
( if all or part of the above is totally illegible, sorry, but Cerveza Pacifico is very good )
_Lefty - Thursday, February 03 2005 @ 10:33 PM EST (#1801) #
Maybe one of the experts posting on this thread can answer this for me.

Tango uses Cameron for demonstration purposes. He also outlines the criteria for degree of difficulty and includes an error as a possible result on the play.

I am wondering why outfielders rarely receive errors on what appear to me to be routine plays, even for Manny?

I just don't get that. A shortstop fumbles a screamer through the infield and the pitcher's burden is relieved, but Manny fumbles a fly ball on a trot and it goes against the pitcher's record.
_Masticore317 - Friday, February 04 2005 @ 10:24 AM EST (#1802) #
I'd be willing to volunteer my assistance, but I would also need a reminder...
