Batter's Box Interactive Magazine
I have released the 2005 forecasts for all 1800 players who played MLB last year. They are based on the Marcel the Monkey Forecasting System, which is the most basic forecasting system you should expect.

The 2005 Marcels

I would hope that other forecasting systems are much better than this one, but the jury is still out on that. I've also left in the player ID that you can find in the Lahman database, which should make it much easier for you to link the forecasts to that database or to any other forecasting system. (Hint: if someone provides you with forecasts, ask them to include the playerid. It makes a quick comparison so much easier.)

The 2005 Marcels | 18 comments
The following comments are owned by whomever posted them. This site is not responsible for what they say.
Gerry - Tuesday, January 25 2005 @ 04:52 PM EST (#2142) #
Great news about a book, I will be a buyer for sure.
_Jordan - Tuesday, January 25 2005 @ 05:53 PM EST (#2143) #
A book co-written by MGL and Tango? Wow. I'm not sure I'd understand it, but I'll definitely want to read it.
Craig B - Tuesday, January 25 2005 @ 10:43 PM EST (#2144) #
Having more concrete news is great. Maybe you guys should call the book "The Book About Baseball" instead? :)

Anyway, the Marcels are actually pretty darned useful.
_studes - Wednesday, January 26 2005 @ 09:53 AM EST (#2145) #
Tango, this is great. May I ask -- how do you calculate your reliability score?
_Jim - Wednesday, January 26 2005 @ 03:24 PM EST (#2146) #
A. Soriano is 29 not 27. Don't know if that changes his Marcel.
_Michael - Thursday, January 27 2005 @ 03:51 AM EST (#2147) #
In case Tango doesn't stop by for a while, here's what I recall (IIRC) from primer studies days:

Jim - It does. IMHO Tango's Marcel system is misnamed, as it is actually pretty complex (as in, a number of steps and factors are incorporated). It is *not* the "most basic forecasting system you should expect" if you are an average fan and by "expect" you mean expect for free. A projection system that says either that the player will do what he did last year, or that he will do what he has done in his career to date (just a straight average, no adjustments for anything), is way, way simpler and gets you a good (i.e., useful) chunk of the way to what Marcel gives you. Age adjustments are one such factor, and 29 is worse than 27. I think 27 gives you a slight improvement, or zero, over your established baseline, while 29 gives you a slight decline from it. The peak relative to the baseline comes a little later than the usual "peak" year. If the true "peak" year is 26, then when you are 27 and past your peak, your Marcel baseline is based on your age-24, 25, and 26 seasons (plus more, including regression to the league average). As a result, even if you're expected to decline from last year, you may still be expected to improve on your "established baseline", which is based on all three years.

Studes - reliability is in essence a measure of how much regression to the mean was needed. The more playing time in the past, the less regression to the mean is needed and the greater the reliability score. I don't remember Tango giving a way to translate reliability scores plus projections into percentiles or anything similar to what PECOTA does.
_tangotiger - Thursday, January 27 2005 @ 07:33 AM EST (#2148) #
reliability is 1 minus regression towards the mean.
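As a rough sketch of that relationship (not Tango's actual spreadsheet): a reliability score can be read as the weight placed on the player's own record, with the remainder placed on the league average. The function names, the `ballast_pa` parameter, and the "playing time over playing time plus ballast" form are assumptions made for illustration only.

```python
def reliability_from_pa(weighted_pa, ballast_pa):
    """Hypothetical reliability: more playing time means a higher score.

    ballast_pa is an assumed amount of league-average playing time
    mixed into the forecast; it is not a value taken from this thread.
    """
    return weighted_pa / (weighted_pa + ballast_pa)


def regress_to_mean(observed_rate, league_rate, reliability):
    """Blend a player's rate with the league rate.

    reliability = 1 - regression, so regression is the share of the
    forecast that comes from the league average.
    """
    regression = 1.0 - reliability
    return reliability * observed_rate + regression * league_rate


# Made-up numbers: equal parts player record and league ballast.
r = reliability_from_pa(weighted_pa=1000, ballast_pa=1000)
forecast = regress_to_mean(observed_rate=0.300, league_rate=0.260, reliability=r)
print(r, round(forecast, 3))
```

With equal playing time and ballast the reliability is 0.5, so the forecast lands halfway between the player's rate and the league rate.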
_Jim - Thursday, January 27 2005 @ 08:23 AM EST (#2149) #
It's interesting that neither PECOTA nor Marcel predicts anyone will hit 40 home runs this season. If that ever happened, the anti-steroid writers would have a field day.
Gitz - Thursday, January 27 2005 @ 12:46 PM EST (#2150) #
It's pretty hard to take PECOTA's counting numbers seriously. To use just one example, Mark Teixeira is projected to hit 29 home runs. He hit 38 last year, and while he may not get better than that, he'll likely end up with at least 35.

Where PECOTA gets it right, when they get it right, is in the peripherals: slugging, OBP, etc. But it's unreliable, at best, for counting numbers.
_Jonny German - Thursday, January 27 2005 @ 01:12 PM EST (#2151) #
But it's unreliable, at best, for counting numbers

I think you'd find that if you take a couple thousand PECOTA projections for a counting stat, say home runs, it would have similar accuracy to what it has for rate stats. Maybe Teixeira is a safe bet to blow away the 29 HR prediction, but he'll be balanced by another player who projects to hit 20 HR but gets injured in April and ends up hitting 4 HR for the season. The possibility of serious injury or sudden unexplained ineffectiveness is built into PECOTA.
_Jim - Thursday, January 27 2005 @ 01:21 PM EST (#2152) #
It just seems that PECOTA is too conservative across the board - almost too much regression. Then of course there is the projection for Dustin Pedroia.
Gitz - Thursday, January 27 2005 @ 01:22 PM EST (#2153) #
I should be clear: I like PECOTA, I really do. Any system that gives Jeremy Giambi a 33 percent chance to collapse -- as PECOTA did a few years ago -- is worthwhile. But notwithstanding your valid point, I don't pay much attention to its raw stats.
_tangotiger - Thursday, January 27 2005 @ 10:43 PM EST (#2154) #
Marcel's also got Teixeira at 29 HRs. And, Marcel's regression towards the mean is exactly perfect.

Look at all the guys forecasted for 28 to 30 HRs: Beltre, Sheffield, Delgado, Teixeira, Andruw, Soriano, Tejada, Helton, Berkman, Konerko, Palmeiro, Burnitz, Beltran.

Half of those guys will hit more than 29 HR, and half will hit less. But, you, me, and everyone else has no idea who will hit 30 or 35 HR. Bad luck, good luck, injuries, whatever... everything plays a role in this.

Marcel's best guess is that those 13 hitters will average 29 HRs.

If you wanted Marcel to forecast the number of HR without attaching names to it, that'd be a lot easier, and the range would be wider.

Think of these forecasts as over/unders.
_Sky - Saturday, January 29 2005 @ 04:52 PM EST (#2155) #
These projection systems are NOT claiming that nobody will hit 40 HRs. In fact, I bet both Marcel and PECOTA would put the probability that someone does upwards of 90%. They're claiming that if you pick any one specific player, he has less than a 50/50 chance of hitting 40 HRs (sort of).

Think of them as expected values. Or, what makes sense in my mind: if the season were played 1000 times, what's the average number of HRs that Mark Teixeira would hit? 29's a pretty decent number.
_tangotiger - Monday, January 31 2005 @ 08:14 AM EST (#2156) #
Right, good explanation.

I'm 90% sure that SOMEONE will win the lottery tonight. However, I'd give the chances of each of the 1 million players virtually 0% chance to win the lottery.
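The lottery arithmetic can be sketched in a few lines. This assumes every player has the same independent chance of winning, which is a simplification for illustration; the function names are hypothetical.

```python
def prob_at_least_one(p_each, n):
    """Chance that at least one of n independent players wins."""
    return 1.0 - (1.0 - p_each) ** n


def p_each_for_target(target, n):
    """Per-player chance that makes 'someone wins' equal to target."""
    return 1.0 - (1.0 - target) ** (1.0 / n)


n_players = 1_000_000
p = p_each_for_target(0.90, n_players)

# Each individual chance is tiny, yet the group chance is 90%.
print(f"per-player chance: {p:.8f}")
print(f"someone wins:      {prob_at_least_one(p, n_players):.2f}")
```

The same logic applies to the 40 HR question: a small chance for each hitter can still add up to a high chance that somebody gets there.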
TangoTiger - Wednesday, January 18 2006 @ 04:49 PM EST (#139953) #
Look at my comments, 3 posts (and 1 year) earlier. Ok, so I decided to find out what those 13 guys I listed did. Here they are:

51 Andruw
43 Teixeira
40 Konerko
36 Soriano
34 Sheffield
33 Delgado
26 Tejada
24 Berkman
24 Burnitz
20 Helton
19 Beltre
18 Palmeiro
16 Beltran

6 guys with more than 30, and 7 with fewer than 28. Average? 29.5.
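For anyone who wants to check the tally, here is a short sketch using the 13 totals listed above. The variable names are just for illustration, and 29 is the forecast figure quoted earlier in the thread.

```python
# Actual 2005 HR totals for the 13 hitters, as listed in this comment.
actual_hr = {
    "Andruw": 51, "Teixeira": 43, "Konerko": 40, "Soriano": 36,
    "Sheffield": 34, "Delgado": 33, "Tejada": 26, "Berkman": 24,
    "Burnitz": 24, "Helton": 20, "Beltre": 19, "Palmeiro": 18,
    "Beltran": 16,
}
forecast = 29  # the group forecast quoted in the earlier comment

totals = list(actual_hr.values())
mean_hr = sum(totals) / len(totals)
over = sum(1 for hr in totals if hr > forecast)
under = sum(1 for hr in totals if hr < forecast)

print(f"average: {mean_hr:.1f}, over: {over}, under: {under}")
```

Treating the forecast as an over/under, the group splits almost evenly while the average sits right on the number.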

For those on the fence about forecasting systems, I hope this little exercise makes it clearer.

Mike Green - Thursday, January 19 2006 @ 09:29 AM EST (#139992) #
Well done. Tango, I believe that your book with MGL can now be ordered. Could you let interested Bauxites know where to go to order?
TangoTiger - Thursday, January 19 2006 @ 09:40 AM EST (#139995) #
I'll have an announcement on my site soon. Thanks for the interest!
