Clay Davenport discusses regression towards the mean as it applies to a team's true talent level. My heart was sent aflutter. There is no more important topic in analyzing baseball performance numbers than sample size and regression towards the mean.

It's a great stab by Clay, but he goes slightly wrong. Here's an "open letter" to Clay that I hope will foster additional discussions among readers.

Clay,

There are two issues with regard to regression towards the mean:

1 - Regressing the sample performance to reflect the likely true rate

2 - Determining the error range around that true rate

Sample size is common to both. The more games you have, the less you regress, and the narrower your error range.

I put out a study on clutch hitting that dealt with both of these issues:

http://www.tangotiger.net/clutch.html

On a thread at Primate Studies last year, we went into detail on exactly how to capture the above two issues. I would have to say that an error range of 1 SD = .100 around a true rate, after 140 games, is flat-out wrong. Just taking a guess, I'd say it should be under .050, and more likely something like .030. When I get a chance, I'll work out the exact numbers.

Thanks, Tom
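
To make the two steps in the letter concrete, here is a minimal sketch of one common way to do them: a normal approximation in which the observed winning percentage is shrunk towards the league mean, and the error range is the spread left over after combining the sample with the prior. This is not the method from the Primate Studies thread or from Clay's article. The function name `regress_win_pct`, the league mean of .500, and especially the assumed spread of true talent across teams (`talent_sd = 0.060`) are illustrative assumptions, not figures from the letter.

```python
import math


def regress_win_pct(wins, games, league_mean=0.500, talent_sd=0.060):
    """Return (regressed true-rate estimate, 1-SD error range) for a team.

    ASSUMPTIONS (not from the letter): talent_sd is a guessed spread of
    true talent across teams, and game outcomes are treated as independent
    so that sampling noise is binomial.
    """
    observed = wins / games

    # Sampling (luck) variance of a winning percentage over `games` games,
    # evaluated at the league mean.
    noise_var = league_mean * (1 - league_mean) / games
    talent_var = talent_sd ** 2

    # Issue 1: fraction of the observed deviation to keep.
    # More games -> smaller noise_var -> weight closer to 1 -> less regression.
    weight = talent_var / (talent_var + noise_var)
    estimate = league_mean + weight * (observed - league_mean)

    # Issue 2: 1-SD error range around that estimate (normal-normal model).
    # More games -> narrower range.
    error_sd = math.sqrt(talent_var * noise_var / (talent_var + noise_var))
    return estimate, error_sd


if __name__ == "__main__":
    # Hypothetical team: 84 wins in 140 games (.600 observed).
    est, sd = regress_win_pct(84, 140)
    print(f"true-rate estimate: {est:.3f}, 1 SD: {sd:.3f}")
```

Under these assumptions, a .600 team after 140 games regresses to roughly .567 with a 1 SD error range of roughly .035. That is well under .100 and in the neighborhood of the guess in the letter, though the exact figure depends entirely on the assumed talent spread.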
