27 May 2007, 22:42 (Ref:1922417) | #1 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
Bootstrapping F1 (stats)
Bootstrap is an interesting(?) statistical procedure for extracting information from a sample. It consists of resampling a sample, sampling it many times to construct more samples (that was simply a silly sample of words!).
Basically, it takes a sample and sees how it could have turned out in other circumstances, keeping the same pattern. So, applied to a driver's scores through a season, it can give the scores he could get if the championship were repeated virtually. I have done this calculation, repeating each of the 2004, 2005 and 2006 seasons 1000 times and, perhaps more relevant for us these days, estimating how this 2007 season can end.

So, with 1000 virtual seasons we get:

For 2004:
MS wins the title 97% of times
RB wins the title 3% of times

For 2005:
FA wins 81%
KR wins 19%

For 2006:
FA wins 74%
MS wins 26%

And now for this season, I have randomly replicated the first 5 GPs and "created" the remaining 12 virtual GPs of the year. The total of 17 races is counted and the "title" assigned. Doing it 1000 times, the approximate results are:

FA wins the title 53%
LH wins the title 44%
FM wins the title 3%
KR wins the title <1%

It is surely surprising that Massa has so few chances for the title. Most of this percentage comes from his results, not from his (small) difference in points.

As I said, bootstrap works by supposing the pattern continues in the future. If a team makes a breakthrough with the car, it can change. However, historical data tends to show that form usually doesn't change a lot over the season. |
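For anyone who wants to try the idea, here is a minimal Python sketch of the procedure described above. It is only one possible reading of it, not the actual calculation used for the numbers in this post: it assumes each virtual GP is a copy of one whole completed GP (so all drivers' scores from that race stay together), and that ties on points are broken at random. The function name, the parameters and the toy scores are all hypothetical.

```python
import random
from collections import Counter


def bootstrap_title_odds(past_races, races_left, n_seasons=1000, seed=0):
    """Estimate title chances by resampling completed GPs with replacement.

    past_races: list of {driver: points} dicts, one per completed GP
                (every dict must contain the same drivers).
    """
    rng = random.Random(seed)
    drivers = sorted(past_races[0])
    # Points already scored in the real races are kept as they are.
    current = {d: sum(race[d] for race in past_races) for d in drivers}
    titles = Counter()
    for _ in range(n_seasons):
        totals = dict(current)
        for _ in range(races_left):
            # Each virtual GP is a copy of one real GP, drawn at random.
            race = rng.choice(past_races)
            for d in drivers:
                totals[d] += race[d]
        best = max(totals.values())
        leaders = [d for d in drivers if totals[d] == best]
        # Assumption: ties on points are broken at random (no countback on wins).
        titles[rng.choice(leaders)] += 1
    return {d: titles[d] / n_seasons for d in drivers}


# Made-up scores for three hypothetical drivers over 5 GPs (not real data).
toy_races = [
    {"A": 10, "B": 8, "C": 6},
    {"A": 8, "B": 10, "C": 6},
    {"A": 6, "B": 8, "C": 10},
    {"A": 10, "B": 8, "C": 0},
    {"A": 8, "B": 6, "C": 10},
]
print(bootstrap_title_odds(toy_races, races_left=12))
```

Resampling whole races rather than each driver separately keeps the fact that two drivers cannot both win the same GP, which is probably closer to what "replicating" a GP means here.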
||
|
27 May 2007, 22:48 (Ref:1922419) | #2 | ||
Ten-Tenths Hall of Fame
Veteran
Join Date: Apr 2001
Posts: 5,181
|
Interesting. But I suppose if Massa wins in Canada, with Hamilton second (where else would he finish) and Alonso third, then the new bootstrap calculation would change drastically, no?
|
||
__________________
"And the most important thing is that we, the Vettels, the Bernies, whoever, should not destroy our own sport by making stupid comments about the ******* noise." - Niki Lauda |
27 May 2007, 23:26 (Ref:1922439) | #3 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
If Massa wins, LH is 2nd, Alonso is third and (!) Kimi is 4th, it changes considerably, but not so much for "poor" Massa:
LH 60%
FA 24%
FM 17%
KR 0%
(Decimal rounding means they don't add up to 100%) |
||
|
27 May 2007, 23:28 (Ref:1922441) | #4 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
(BTW, I messed up the title. It's "bootstrapping", not "boottstraping")
|
||
|
27 May 2007, 23:31 (Ref:1922444) | #5 | ||
Ten-Tenths Hall of Fame
Veteran
Join Date: Apr 2001
Posts: 5,181
|
Now I am surprised! What does Massa have to do to turn around his terrible season?
|
||
27 May 2007, 23:34 (Ref:1922449) | #6 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
He, he, he
Probably he must learn how to trick people (or at least not be tricked by rookies!) |
||
|
27 May 2007, 23:37 (Ref:1922452) | #7 | ||
Ten-Tenths Hall of Fame
Veteran
Join Date: Apr 2001
Posts: 5,181
|
For sure he has to not put his faith in statistics
|
||
27 May 2007, 23:44 (Ref:1922453) | #8 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
I used the term "decimal rounding", but if I were from Brazil (Copacabana, Ipanema...) "rounded things" would have a very different, non-numerical meaning
|
||
|
28 May 2007, 03:27 (Ref:1922502) | #9 | |||
Veteran
Join Date: Jan 2003
Posts: 1,707
|
Quote:
|
|||
__________________
"If there's anything more important than my ego around, I want it caught and shot now" Douglas Adams. 1952-2001 |
28 May 2007, 04:21 (Ref:1922510) | #10 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
Poor Kimi, nobody likes him
It's surprising how low the probabilities are for Ferrari as a team. |
||
|
28 May 2007, 12:30 (Ref:1922788) | #11 | ||
Veteran
Join Date: Nov 2006
Posts: 561
|
Could you run this analysis for a past season after 5 races, to see how it works out?
For example, the results of the 2003 season after 5 races would be interesting to see. |
||
|
28 May 2007, 13:56 (Ref:1922873) | #12 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
I will do the last season (2006). In the first part of the season the results will be unreliable because of the relative lack of data, but as we approach the end, the percentages will tend to firm up more meaningfully (with the natural fluctuations due to random events).
When I get the results, I'll put them here. |
||
|
28 May 2007, 14:05 (Ref:1922877) | #13 | ||
Race Official
20KPINAL
Join Date: Dec 1999
Posts: 21,606
|
Statistics are always interesting... they give so much to talk about between 2 races.
I like the idea of resampling samples by sampling. |
||
__________________
Show me a man who won't give it to his woman An' I'll show you somebody who will |
28 May 2007, 14:07 (Ref:1922880) | #14 | |||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
Quote:
|
|||
|
28 May 2007, 14:13 (Ref:1922885) | #15 | |
Rookie
Join Date: Oct 2006
Posts: 22
|
Lies, D*mn Lies and Statistics!
Keep them coming, Schummy. Thanks. |
|
|
28 May 2007, 14:33 (Ref:1922898) | #16 | ||
Retired
20KPINAL
Join Date: Aug 2004
Posts: 22,897
|
Quote:
|
||
|
28 May 2007, 22:50 (Ref:1923313) | #17 | ||
Veteran
Join Date: Aug 2005
Posts: 1,964
|
Schummy, you never cease to amaze me
|
||
__________________
Hah! |
28 May 2007, 23:22 (Ref:1923323) | #18 | |||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
Quote:
|
|||
|
28 May 2007, 23:55 (Ref:1923334) | #19 | |
Retired
20KPINAL
Join Date: Aug 2004
Posts: 22,897
|
|
|
|
30 May 2007, 03:58 (Ref:1924322) | #20 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
Here are the results of applying the bootstrap successively after each GP in 2006. The only drivers with a significant probability of winning the title were FA and MS (Alonso started the season very strongly and "eliminated" many contenders early).

After GP:
1 - 11: FA 100%, MS 0%
12 GER: FA 97%, MS 3%
13 HUN: FA 92%, MS 8%
14 TUR: FA 97%, MS 3%
15 ITA: FA 66%, MS 34%
16 CHI: FA 56%, MS 44%
17 JAP: FA 100%, MS 0%

After Hockenheim, Schuey was a possible (albeit remote) contender. After Monza, MS appeared as a strong contender for the first time (34%). The win in China put Schumacher even higher (44%), essentially equaling Alonso. Suzuka was a rude turn of events and Michael lost all his chances.
Logically (and intuitively), an important result in one of the very last races can completely turn the probabilities for the title. |
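A rough Python sketch of how a table like this could be produced: rerun the bootstrap after each GP, each time using only the races completed so far and simulating the rest. This is a guess at the mechanics, not the code behind the numbers above. The function names are hypothetical, each virtual GP is assumed to be a copy of one whole completed GP, and ties on points simply go to the first listed driver.

```python
import random


def title_odds(races, races_left, n=1000, rng=None):
    """Share of virtual seasons won by each driver, given the races so far."""
    rng = rng or random.Random(0)
    drivers = list(races[0])
    wins = dict.fromkeys(drivers, 0)
    for _ in range(n):
        # Real races so far plus randomly replicated ones for the rest.
        virtual = races + [rng.choice(races) for _ in range(races_left)]
        totals = {d: sum(r[d] for r in virtual) for d in drivers}
        # Assumption: on equal points the first listed driver takes the title.
        wins[max(drivers, key=totals.get)] += 1
    return {d: wins[d] / n for d in drivers}


def odds_after_each_gp(season, n=1000, seed=0):
    """One row of title odds per GP, using only the results up to that GP."""
    rng = random.Random(seed)
    return [
        title_odds(season[:k], len(season) - k, n, rng)
        for k in range(1, len(season) + 1)
    ]
```

Here `season` would be a list of `{driver: points}` dicts, one per GP in calendar order. The last row always gives 100% to the actual champion, since no races are left to simulate.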
||
|
30 May 2007, 13:32 (Ref:1924644) | #21 | |||
Team Crouton
20KPINAL
Join Date: Oct 2001
Posts: 39,934
|
Quote:
Nah. He's just unbelievably and mind-numbingly dull....... |
|||
__________________
280 days...... |
30 May 2007, 18:22 (Ref:1924847) | #22 | |||
Veteran
Join Date: Nov 2006
Posts: 561
|
Quote:
The jump after Turkey is simply so much more than it should be at that stage of the season; that a swing can be that big based on 1 result makes all previous results appear relatively meaningless. |
|||
|
30 May 2007, 19:51 (Ref:1924902) | #23 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
"Slowly moving closer to 100%" is generally not possible, because the last races can have a big impact on the final standings.
Look at Japan. Before it, FA and MS had 116 points each and approximately equal probabilities. After it, FA as WDC was practically the only possible outcome; but if MS had won in Japan and FA had retired, then the only possible outcome would have been MS as champion. As you can see, in one race the probability can go from 0% to 100%, depending just on that particular GP.

At Monza, the race after Turkey, a maximum result happened: MS won and FA retired. It impacted the probabilities greatly, but not decisively. That event was unlikely to happen, so it was not "forecast" in the earlier probabilities.

In the first races, bootstraps are unstable because there is little data; in the last races they are unstable because the *real* probabilities being estimated are unstable themselves. The only way one can safely and smoothly forecast a WDC is if he gradually builds an advantage over the season and is safe in the last races.

On a different (but related) note, by simulating seasons with simulated data under the current points system, one can see that the "best" driver/car gets the title less than 50% of the time, due to random circumstances. That means predictions usually cannot be too precise. |
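The last point, about simulated seasons, can be tried with a small Python sketch like the one below. The model is entirely an assumption for illustration: each driver has a fixed "true strength", race-day performance is that strength plus Gaussian noise, and the finishing order is scored with the 10-8-6-5-4-3-2-1 points system. The function name, the strengths and the noise level are hypothetical, and the share you get depends heavily on how close the strengths are.

```python
import random

# Points for places 1 to 8 under the points system in use in 2007.
POINTS = [10, 8, 6, 5, 4, 3, 2, 1]


def best_driver_title_share(strengths, n_races=17, n_seasons=2000,
                            noise=1.0, seed=0):
    """Fraction of simulated seasons in which the strongest driver is champion."""
    rng = random.Random(seed)
    best = max(range(len(strengths)), key=lambda i: strengths[i])
    wins = 0
    for _ in range(n_seasons):
        totals = [0] * len(strengths)
        for _ in range(n_races):
            # Race-day performance = true strength + random noise (assumption).
            perf = [s + rng.gauss(0, noise) for s in strengths]
            order = sorted(range(len(strengths)),
                           key=lambda i: perf[i], reverse=True)
            for place, i in enumerate(order[:len(POINTS)]):
                totals[i] += POINTS[place]
        top = max(totals)
        leaders = [i for i, t in enumerate(totals) if t == top]
        # Assumption: ties on points are broken at random.
        if rng.choice(leaders) == best:
            wins += 1
    return wins / n_seasons


# Hypothetical grid: one slightly stronger driver and several close rivals.
print(best_driver_title_share([1.2, 1.0, 1.0, 0.9, 0.8, 0.5, 0.5, 0.3, 0.2, 0.0]))
```

With `noise=0.0` the strongest driver wins every race and every title; as the noise grows relative to the gaps in strength, his share of titles drops.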
||
|
31 May 2007, 11:13 (Ref:1925339) | #24 | ||
Veteran
Join Date: Jul 2005
Posts: 2,195
|
Hey Schummy, how come you don't play the F1 Prediction Comp? With all your statistical ruminatin', should be a slam dunk (theoretically). No?
|
||
__________________
Give me a drink don't be talking so much you're a pain in the butt - Mick |
31 May 2007, 11:48 (Ref:1925354) | #25 | ||
Subscriber
Veteran
Join Date: Jul 2001
Posts: 3,281
|
He, he, he
The simple answer is "no", I would not win it. In fact, I run a game in the Bike Forum and, well, results are... irregular. I never calculate for games (except if there is money involved); it takes away part of the fun of planning the guesses. Moreover, calculations could only (if at all) improve the probability of winning; they are not a guarantee. What is fun is that I use a fictional "bot" in the game, which plays according to an automatic procedure. This bot (affectionately called BOTTY) is often one of the best |
||