Batter's Box Interactive Magazine
This article results from the joint efforts of Jonny German and Mike Green. It began with Mike's Hall watch series on shortstops and the search for more reliable objective measures of Barry Larkin's defence than were otherwise available.

Barry Larkin had a reputation as a good but not great defender during his prime. We decided to check whether his reputation was well-earned by doing a play by play analysis of his defence during 1991, his age 27 season. For the piece, we have relied on information contained in the events files at Retrosheet. This time, we are looking at Larkin's ability to turn ground balls into outs.

The Data

The Retrosheet event file for Cincinnati's 1991 season contains a pitch by pitch and play by play account of every game of the season. Players, umpires, weather conditions and even noteworthy radio calls of unusual plays are recorded. It is a fabulous resource. Most importantly for us, it contains the location of every batted ball, using the Project Scoresheet Hit location diagram. The infield portion of the diagram is reproduced below; the full diagram can be seen here.

The events file records a play using the notation: "standard numerical account/hit location"
To understand the way it works, here are a couple of examples. If a hitter grounds out on a ball hit directly at the shortstop, the events file will record "63/G6". The first half "63" means that it is an ordinary 6-3 play, fielded by the shortstop who throws on to first for the putout. The second half "G6" means that it is a ground ball fielded in the "6" zone in the diagram above.

If the hitter grounds out to the shortstop on a ball up the middle on the shortstop side of the bag, the events file will record "63/G6M". If the hitter grounds out to the shortstop on a ball in the hole, the events file will record "63/G56". A ground single through the hole will be recorded as "S7/G56D".
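The notation can be split mechanically. Here is a minimal sketch in Python that handles only the simple forms quoted above; real event strings carry many more modifiers. The function names, the mapping of the three location codes to areas, and the treatment of a trailing "D" as a depth marker that can be ignored are our assumptions for illustration, not part of the events file format itself.

```python
# Location codes from the examples above, mapped to the three shortstop areas.
# This mapping is an illustrative assumption covering only the codes quoted.
AREAS = {"6": "at him", "6M": "up the middle", "56": "in the hole"}


def parse_play(event):
    """Split a simple event such as '63/G56' into (play, location)."""
    play, _, location = event.partition("/")
    return play, location


def shortstop_area(location):
    """Return the shortstop area for a ground-ball location, or None."""
    if not location.startswith("G"):
        return None  # not a ground ball
    # Assumption: a trailing 'D' is a depth marker and does not change the area.
    zone = location[1:].rstrip("D")
    return AREAS.get(zone)


for event in ["63/G6", "63/G6M", "63/G56", "S7/G56D"]:
    play, location = parse_play(event)
    print(event, "->", play, shortstop_area(location))
```

A 53/G5 play returns None here, since the "5" zone is not one of the three shortstop areas.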

One important point about the hit location diagram concerns the evaluation of third and first basemen. The hit location diagram does not distinguish between ground balls directly at the third and first basemen and those closer to the line. All ground ball outs to the third baseman, whether a routine ground ball directly at him or a diving stop of a shot down the line, are recorded as 53/G5. We'll come back to that point later.

The Method for evaluating Barry Larkin

To evaluate Barry Larkin's groundball out conversion efficiency for 1991, we counted plays made and total opportunities for Larkin, other Cincinnati shortstops and the opposition in three areas: up the middle, in the hole and at him. We then calculated the expected number of plays made for Larkin in each area, by multiplying his opportunities by the opposition's conversion rate in each area. This generates a park-neutral expectation for Larkin.

We consider the difference between his actual number of plays made and the expected number to be a good estimate of his groundball conversion efficiency. We then considered the effect of third base and second base positioning.

Here are the basic results:
                      Larkin    Other Cin. shortstops    Opposition

Up the middle
  Plays made            131              40                 137
  Opportunities         181              67                 206
  Conv. rate           72.4%           59.7%               66.5%

In the hole
  Plays made             16               5                  23
  Opportunities          59              18                  92
  Conv. rate           27.1%           27.8%               25.0%

At him
  Plays made            165              66                 236
  Opportunities         175              75                 250
  Conv. rate           94.3%           88.0%               94.4%

Total conv. rate       75.2%           69.4%               72.3%

Positioning ratios
  3rd base              .880           1.316               1.009
  2nd base              .775            .806                .786

The numbers are self-explanatory, save for the positioning. To calculate the third base positioning number, we used the following ratio: hits down the third base line (S7 or D7/G5) + third baseman conversion of ground balls in the hole (53/G56) divided by outs down the third base line (53/G5) + ground balls through the hole (S7/G56). The higher the number, the likelier it is that the third baseman is playing closer to the line. The reader is encouraged to review the hit location diagram above for the codes.

Similarly, for the second baseman, we used the ratio: ground ball hits between first and second (S9/G34) + outs on grounders up the middle on the second base side of the bag (43/G4M) divided by 4-3 putouts on ground balls between first and second (43/G34) + ground singles up the middle (S8/G4M). The higher the second baseman's ratio, the likelier it is that he is playing closer to the first base line.
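Both positioning ratios have the same shape, so they can be sketched with a single helper. The Python below is only an illustration: the function name is ours, and the counts in the example call are made up to show the arithmetic, since the underlying counts are not listed in this article.

```python
def positioning_ratio(hits_on_line_side, outs_on_far_side,
                      outs_on_line_side, hits_on_far_side):
    """Ratio described above for a corner-adjacent infielder.

    Third base:  (S7 or D7/G5 hits + 53/G56 outs) / (53/G5 outs + S7/G56 hits)
    Second base: (S9/G34 hits + 43/G4M outs) / (43/G34 outs + S8/G4M hits)
    """
    numerator = hits_on_line_side + outs_on_far_side
    denominator = outs_on_line_side + hits_on_far_side
    return numerator / denominator


# Hypothetical counts, chosen only to illustrate the calculation.
example = positioning_ratio(hits_on_line_side=5, outs_on_far_side=7,
                            outs_on_line_side=8, hits_on_far_side=4)
print(f"{example:.3f}")
```

The argument names follow the wording of the two definitions: hits on the line side and outs on the far side go in the numerator, outs on the line side and hits on the far side in the denominator.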

Our interpretation

In his 119 games played in 1991, Barry Larkin converted 12 more ground balls into outs than would have been expected, 11 of them on balls up the middle and 1 on balls in the hole. The calculation for expected conversion of balls up the middle would be 66.5% (Opposition conversion rate) X 181 (Larkin's opportunities) = 120. Larkin made 131 plays, so recorded 11 more conversions than expected.

Larkin converted 1 more ball in the hole into an out than would be expected. The calculation for expected conversion of balls in the hole would be 25% (Opposition conversion rate) X 59 (Larkin's opportunities) = 15. Larkin converted 16 of his opportunities. Larkin performed precisely as expected for balls at him: 94.4% (Opposition conversion rate) X 175 opportunities = 165 expected conversions. Larkin recorded 165 conversions.
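The arithmetic for all three areas can be reproduced from the table above. The sketch below is in Python, with names of our own choosing, and it uses the opposition's unrounded rates (plays made divided by opportunities), so the per-area figures differ slightly from the whole-number figures quoted in the text.

```python
# (plays made, opportunities) for each area, taken from the table above.
LARKIN = {"up the middle": (131, 181), "in the hole": (16, 59), "at him": (165, 175)}
OPPOSITION = {"up the middle": (137, 206), "in the hole": (23, 92), "at him": (236, 250)}


def plays_above_expected(player, baseline):
    """Actual plays made minus opportunities times the baseline conversion rate."""
    result = {}
    for area, (made, opps) in player.items():
        base_made, base_opps = baseline[area]
        expected = opps * base_made / base_opps
        result[area] = made - expected
    return result


above = plays_above_expected(LARKIN, OPPOSITION)
total = sum(above.values())
for area, diff in above.items():
    print(f"{area}: {diff:+.1f}")
print(f"total: {total:+.1f}")
```

Rounded to whole plays, the total comes to the 12 plays above expectation reported here.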

This is a conservative view of Larkin's efficiency, due to the positioning of his neighbouring infielders. His third baseman, Chris Sabo, seems to have played close to the line when Larkin was in the game and off the line when a replacement was in the game. A similar, but much smaller, effect is noted for his second baseman Bill Doran. Larkin recorded three 6-3 putouts on groundballs on the second base side of the bag; the opposition recorded none.

We were reluctant to make adjustments to reflect the positioning of Sabo, despite the large differences between his ratio and the opposition third basemen's ratio. As mentioned earlier, the hit location diagram does not distinguish between ground balls directly at the third baseman and those down the line. Consequently, we were reluctant to infer that Larkin's efficiency on ground balls in the hole was better than it appears due to Sabo's positioning. This point could certainly be argued.

Comparison with another evaluation method

One common approach to evaluating a shortstop's defence is simply to look at assists/9IP. I thought that it might be worthwhile to compare results. Larkin had 372 assists in 1,032 IP (IP information courtesy of the Stats 1992 Major League Handbook). In the league, shortstops had 5,797 assists in 17,387 innings. Pro-rating the league average over 1,032 IP gives a league-average figure of 344 assists, meaning that Larkin had 28 more assists than expected using this method. Not all assists result from groundballs to the shortstop: double plays with the shortstop at the bag, relays and rundowns can all result in shortstop assists. This approach also takes no account of park factors (turf vs. grass, for instance) or of the pitching staff (K rates, for instance), whereas by comparing Larkin with his opposition we obtain a park-neutral and essentially pitching-staff-neutral result. We will come back to this topic after the double play efficiency review.
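For readers who want to check the pro-rating, here is a short Python sketch using the figures quoted above. The function name is ours, chosen for illustration.

```python
def prorated_assists(league_assists, league_innings, player_innings):
    """League assists per inning, scaled to the player's innings."""
    return league_assists * player_innings / league_innings


# Figures quoted above: league shortstops and Larkin's 1991 innings.
expected = prorated_assists(5797, 17387, 1032)
surplus = 372 - expected  # Larkin's actual assists minus the pro-rated figure

print(f"expected: {expected:.1f}, surplus: {surplus:.1f}")
```

Rounded to whole assists, this gives the 344 expected and the 28 surplus cited in the comparison.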

We noticed some other points from our data. The conversion rate for both Larkin and the opposition on balls directly at the shortstop was over 94%, higher than we expected. The conversion rate in the hole was roughly 25%, perhaps a little lower than expected. At least for shortstops, the hit location, rather than batted ball speed, does seem to be paramount in ground ball out conversion rates.

We also noted that both Larkin and opposition shortstops had more than twice as many opportunities up the middle as in the hole. Larkin had more than 3 times as many. This does lead us to think again about the relative importance of speed/quickness and arm strength in shortstop defence. It also raises questions about the reasons for the differing proportions of the opportunities for Larkin and opposition shortstops. Did Larkin get more opportunities up the middle relative to the opposition because of the positioning of neighbouring infielders, or perhaps because of the nature of the Cincinnati pitching staff, or because of their poor fielding ability, or for some other reason?

In our next installment, we will look at Larkin's 1991 double play efficiency.

"The information used here was obtained free of
charge from and is copyrighted by Retrosheet. Interested
parties may contact Retrosheet at 20 Sunset Rd.,
Newark, DE 19711."
Barry Larkin's 1991 defensive efficiency- groundballs- a play by play analysis | 10 comments | Create New Account
The following comments are owned by whomever posted them. This site is not responsible for what they say.
_Magpie - Wednesday, January 26 2005 @ 02:04 PM EST (#2031) #
We also noted that both Larkin and opposition shortstops had more than twice as many opportunities up the middle as in the hole.

Wouldn't you expect that?

You'd probably have to carry out this same process for the other three infielders to get a true picture (and please! not on my account, anyway! this is fascinating enough already!), but aren't there simply many more balls hit up the middle?

I think you'd expect shortstops to get to more balls up the middle - everybody has a little more range to their left - that's where the glove is, and I just have the vague notion that it's a more natural movement. Maybe for that reason.

And to make more plays that way as well, because it's a shorter throw...

Random memory, the thoughts of two corner infielders.

Keith Hernandez saying "No one in this league hits the ball right down the line. Why would I play there?"

Graig Nettles saying "I hate guarding the line. Most balls are hit up the middle. If I get way off the line, the shortstop can move over..."

Something like that, anyway.
Mike Green - Wednesday, January 26 2005 @ 02:20 PM EST (#2032) #
It was no surprise that the conversion rate was higher for balls up the middle than for those in the hole. But, 2 or 3 times as many opportunities for a shortstop up the middle compared with in the hole? Opportunities means not only balls the shortstop gets his glove on, but also ones that get past.

A cause for the difference is likely the superior ability of a third baseman to get the ball in the hole, compared with the average pitcher's ability to snag the ball up the middle.

Anyways, when we discuss the relative merits of Russ Adams and Aaron Hill as defensive players, for instance, it's useful to know how often opportunities occur in both places.
Gerry - Wednesday, January 26 2005 @ 04:56 PM EST (#2033) #
Excellent work guys. Very interesting. The difference in outs intuitively seems low, as your alternative method shows. I would expect a premium shortstop, if in fact Larkin was that, to make 30 or so more plays than an average shortstop.
Mike Green - Wednesday, January 26 2005 @ 05:40 PM EST (#2034) #
Thanks, Gerry. We'll come back to Larkin's overall package after the DPs.

It is certainly possible that a premium shortstop could make 30 or so more plays than an average shortstop over a 150 game season. That would equate to 23 Runs Saved Above Average, which is a lot. Anyways, Larkin only played 119 games in 1991, so you'd expect the figure might be a little lower than 30, if he was indeed a premium defensive shortstop.
_tangotiger - Friday, January 28 2005 @ 12:44 AM EST (#2035) #
Yes, fabulous work! And great presentation too. I've got to re-read it a few times.
Mike Green - Friday, January 28 2005 @ 09:52 AM EST (#2036) #
I would like to correct an error. The 3b positioning ratios (.880 when Larkin was in the game, 1.316 when other shortstops were in the game and 1.009 for the opposition) suggest that Sabo was playing further off the line, rather than closer to the line, when Larkin was in the game. This could suggest that Larkin's decreased opportunities in the hole, and perhaps some of his increased efficiency up the middle, resulted from Sabo's positioning.
_MGL - Monday, January 31 2005 @ 03:51 PM EST (#2037) #
Nice job and excellent presentation. A couple of questions. Why do you say that comparing Larkin's numbers to his opposition creates a pitcher-neutral situation? Do you mean Larkin to his teammates? Of course, because of sample size issues, using his teammates' numbers as any kind of baseline is worthless.

Also, although, as you say, using the opposition as the baseline creates a park-neutral situation, because of sample size issues, don't you think you might be better off using league-wide numbers as the baseline and either not having any park adjustments or calculating a park adjustment separately and then applying it? My 15 years of research doing exactly what you are doing (UZR), other than the work on positioning (which is interesting), suggests that IF park factors are not very significant. You can even limit your baseline to artificial turf fields if you like, as there were plenty back then, and most if not all of them used Astroturf.

I see that you want to be as clear, "clean," and as "simple" as possible, but just keep in mind the potential sample size issues you get when you are comparing a player's numbers to a baseline which is only based on 23 to 236 opps, especially when you break it down the way you do (into 3 "zones.") Certainly Larkin's numbers compared to the opposition numbers when the opposition only has 23 opps is sort of meaningless. For example, at the very least look at the league-wide conversion rate in the hole. If it is 21% or 29%, then you know that the 25% for the opposition is kind of a worthless number to compare Larkin to. Of course if you do that, you are going to end up wanting to use league-wide numbers. When you start breaking everything up into small pieces (for example, in UZR, there are dozens of zones/speed of ball/etc.), you end up having to use many years of league-wide data to compare a player to. Again, nice work!
Mike Green - Monday, January 31 2005 @ 04:25 PM EST (#2038) #
Thanks, MGL and Tango.

I commented that this pbp efficiency was essentially pitching-staff-neutral in the context of a comparison with the rough assists/inning method. By that, I meant that this measure would be essentially unaffected by the K rate or G/F tendencies of Larkin's pitching staff as compared with assists/inning method.

One could certainly use league averages, to obtain the more reliable baseline data for each zone, and then adjust for park. Another possibility is to use opposition data over a period of years. I would love to see Larkin's 1991 UZR for comparison purposes, especially with a breakdown by zone.

One of the difficulties I had is that the Retrosheet events file does not have speed of ball information. It is a significant issue for shortstops; for third basemen, it is key (due to the shorter reaction time and the absence of discrimination in the file between the ground ball down the line and the ball directly at the third baseman if the play is made).
_MGL - Monday, January 31 2005 @ 07:32 PM EST (#2039) #
Got it! UZR has Larkin -3 in 1991 in 121 defensive games. I don't have the breakdowns by zone handy.

While the speed of the batted ball is good to know, it is not going to affect the results all that much, at any position. Ditto for outs, baserunners, handedness of the batter (both for positioning and as a proxy for the speed of the batted ball), etc.

According to UZR, Larkin was roughly an average SS during his career, although over the last few years, he has been pretty bad, as expected...
Mike Green - Monday, January 31 2005 @ 08:45 PM EST (#2040) #
That's interesting. If you find the breakdowns by zone for Larkin's 1991, I'd be very grateful if you could e-mail them to me, by clicking on my name.