According to the statistical study, Juan Manuel Fangio is F1's best of all time. By Unknown - Museo Juan Manuel Fangio, reprinted in "La fotografía en la historia argentina", Tomo I, Clarín, ISBN 950-782-643-2, Public Domain, https://commons.wikimedia.org/w/index.php?curid=3934090
The first few on the study's all-time driver ranking - Juan Manuel Fangio top, followed by Alain Prost, Fernando Alonso, Jim Clark and Jackie Stewart - are hardly hideous. Michael Schumacher appeared low in ninth, but when only his career up to his first retirement in 2006 was considered he shot up to third, which again looked fair enough. (As an aside, a curiosity about the reporting of this study is that there is more than one list floating about: in addition to the pre/post Schumi lists, the one presented in the academic paper has Alonso sixth rather than his widely-reported placing of third, and the Schumi shift apparently doesn't explain it, as the Daily Mail article at least shows Alonso fourth even when only Schumi's career to 2006 is considered.)
But then it gets patchy. Stirling Moss is but 35th in the ranking (some 12 places behind Marc Surer) while the likes of Niki Lauda, Nigel Mansell, Alberto Ascari, Jochen Rindt and Gilles Villeneuve are simply nowhere to be seen in the top 50. It all gets, um, a little more interesting too as Christian Fittipaldi is in the elevated position of 12th best driver ever while the luminary that is Louis Rosier is placed 19th. And unless this pair were, against just about all assessments, in fact secret F1 geniuses never given their break in a good car - and the drivers listed above all vastly over-rated, again contrary to most assessments - it would seem the study has some shortcomings.
So what's going on? Having read the headlines, and heard the ire in response, I sought out the fuller academic paper on this study, called Formula for success: Multilevel modelling of Formula One Driver and Constructor performance, 1950-2014 - which is available online. Back in my university days I did for a brief spell similar sorts of statistical modelling (though about politics, rather than about F1 - and as with many things done at uni it seems rather distant now some years on), and this, I thought, combined with my F1 knowledge (stop laughing at the back), would give me a good chance of understanding what they had actually done and, by extension, how they had come up with their ranking.
In the paper's own words the scientific model aims to "find out which driver, controlling for the team that they drive for, is the greatest of all time". It also seeks to judge how the influence of the team, as opposed to the driver alone, on F1 success has varied over time (it's grown, unsurprisingly), as well as, in a slightly curious detour, who is the best wet-weather driver (they conclude it's still Fangio) and the best on specific types of track such as permanent and street circuits.
Yet the first thing to keep in mind in all of this is that for all of its merits, and the sometimes sheer reverence outlined, science cannot do magic. Any scientific model is only as good as the data fed into it and the assumptions applied to it. Garbage in; garbage out, as the phrase goes.
And there's some of that here. You don't have to read far into the paper to find our first potential sticking point. If you thought the study had access to some mysterious data we'd never clapped eyes on before then you're wrong: the 'outcome' they use as their measure of greatness in their model - the 'dependent variable', as scientists call it - is the championship points each driver has scored in F1.
Points are standardised to a single scoring scheme (10-6-4-3-2-1) with fractions used for finishing positions lower than sixth, and the size of the overall field, which has varied over time, is controlled for. All of which is fair enough. Using points and finishes as your measure also has the benefit of completeness, given all F1 world championship results since the start of the championship in 1950 are easily accessed.
Yet some thinking about it all uncovers problems. As I outlined in a recent article for Grand Prix Times, even finishing positions and points totals can be a crude, perhaps misleading, measure of driving quality, given the influence of dumb luck. As I demonstrated too in an article for Vital F1 last year, the proportion of F1 cars that reach the end of races has increased over time, as cars' reliability and general robustness have improved, while perhaps too modern circuits don't punish error as ruthlessly. So it's not a consistent measure on this front either.
And the paper explains also that failures to finish are treated pitilessly by the model, in effect as low finishing places. If you retired first from a race then, for the purposes of this study, you finished last.
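To make that concrete, here's a minimal sketch in Python of the outcome measure as I read the paper: the old 10-6-4-3-2-1 scale, fractional points below sixth (the exact fractions aren't spelled out, so the halving rule below is my own invention purely for illustration), and DNFs simply ranked behind every classified finisher.

```python
# Sketch of the study's dependent variable: standardised points per race.
# The fractional values below sixth place are assumed, not from the paper.
TOP_SIX_POINTS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 2, 6: 1}

def standardised_points(position):
    """Points for a final classification, with DNFs already ranked last."""
    if position in TOP_SIX_POINTS:
        return TOP_SIX_POINTS[position]
    return 1.0 / 2 ** (position - 6)  # assumed halving rule below sixth

def classify(finishers, retirements_in_order):
    """Retirements go behind every finisher, the earliest retiree last."""
    return finishers + list(reversed(retirements_in_order))

# A driver who retires first in a 20-car race is classified 20th and scores
# a near-zero fraction - exactly as if they'd crawled home stone last.
order = classify([f"P{i}" for i in range(1, 16)], [f"R{i}" for i in range(1, 6)])
print(order[-1], standardised_points(20))
```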
The academics in the paper claim to account for this though. "We do not need to treat driver failures and team/car failures differently" it says, "the model will automatically apportion the latter into the team or team-year levels so they will not unfairly penalise a driver who suffers such failing".
I'll explain more of this later but here, in other words, the model seems to assume that all cars within the same team have the same reliability, so it will only punish a driver if they have more DNFs than their stable mate, on the basis that this must be something the driver is doing wrong. It's better than nothing I suppose, but it doesn't necessarily cover everything, given for example that for much of the sport's past it was demonstrable that the 'number 2' machine in a team would get less care and attention and therefore would make the finish less often. This applies especially to teams with lower budgets, but even at the front, and even in the modern age, some of us have been given cause to muse "why does it always happen to Rubens? To Felipe? To Kimi?" Then there's the sheer random chance of mechanical woe we've mentioned too. Whatever the case, the cruise-and-collect pilots in this study are rather generously rewarded.
The unlikely figure of Louis Rosier is 19th in the ranking. By Noske, J.D. / Anefo [CC BY-SA 3.0 nl (http://creativecommons.org/licenses/by-sa/3.0/nl/deed.en)], via Wikimedia Commons
But in terms of what the model does: statistical models like these work on the basis of the ability of a certain piece of data to predict an outcome of interest. You take the outcome (in this case, a driver's points) and throw various types of data, or 'variables', that you think are related to that outcome into the model. Then you measure the significance of each individual variable by its ability to predict the outcome from just knowing that data, holding all other types of data in the model constant.
Still with me? Good. And from what I can tell the main 'types of data' in this model are the driver's points, the team's points (i.e. what the team has done in F1 since the dawn of time, and we can argue too as to the extent that's a helpful measure) and what the authors of this study call the 'team-year', which is the points the team got in the particular years that the driver was there. So, to take the example of Fernando Alonso, the model measures how many points Alonso 'should' have in his F1 career just from knowing the teams he drove for as well as what those teams did generally in the years he drove for them. Whatever Nando's actually got in reality over and above that is taken as his personal contribution. And therefore his measure of greatness.
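Purely as illustration, and emphatically not the authors' actual code, a model of this shape can be sketched in Python with statsmodels: points as the outcome, with crossed random effects for driver, team and team-year. The data file and column names here are hypothetical assumptions of mine.

```python
# A sketch of a multilevel (mixed-effects) model of the kind the paper
# describes. Column names (points, driver, team, team_year) are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("race_results.csv")  # hypothetical file of standardised points
df["all"] = 1  # one dummy group, so the variance components below are crossed

model = smf.mixedlm(
    "points ~ 1",      # no fixed predictors; everything sits in the levels
    df,
    groups="all",
    re_formula="0",    # no plain random intercept, variance components only
    vc_formula={
        "driver": "0 + C(driver)",        # a driver's career-long effect
        "team": "0 + C(team)",            # the team's effect across all seasons
        "team_year": "0 + C(team_year)",  # the team's form in that one season
    },
)
result = model.fit()
print(result.summary())  # the variance at each level shows its influence
```

The driver-level effects from a fit like this are, in essence, the ranking: a driver's points over and above what team and team-year predict.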
All sounds reasonable enough, but there's a problem. Which is the rather titchy base size of all this. The model would likely be more valuable if several people drove any given F1 car in a season, but as we know for the most part there are only two, and the 'team-year' part therefore is in effect a comparison of just two pilots. As the paper acknowledges almost apologetically, "the model really tells us how drivers perform against their team mates".
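A deliberately crude back-of-the-envelope version of that same idea makes the point, shown below; again the column names are my own assumptions, and with two drivers per team the 'excess' is just half the gap to the team mate.

```python
# Crude illustration of what the model boils down to: score each driver by
# their average points gap to the team-year baseline (the team-mate yardstick).
import pandas as pd

df = pd.read_csv("race_results.csv")  # hypothetical: driver, team, year, points

# The team-year baseline: what the team's cars scored that season on average.
df["team_year_avg"] = df.groupby(["team", "year"])["points"].transform("mean")

# A driver's 'contribution': how far above or below that baseline they score.
df["excess"] = df["points"] - df["team_year_avg"]
ranking = df.groupby("driver")["excess"].mean().sort_values(ascending=False)
print(ranking.head(10))
```

And this crude version exposes the flaw immediately: beat a slow team mate in a slow car and your 'excess' looks just as flattering as beating a quick one in a quick car.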
The study essentially measures how good drivers such as Fernando Alonso are at beating their team mates. Photo: Octane Photography
Further complicating matters, team mates are sometimes chosen pretty much explicitly on the grounds that they won't challenge the top driver - reflected in the sport's maxim about the follies of putting two roosters in the same hen-house. It all rather muddies the waters given this study purports to control for the effect of the team. In other words, those paired up with an idiot, perhaps at a tail-end team, start with an advantage as far as this study's rankings are concerned.
The paper indeed suggests this is why Christian Fittipaldi ended up in the haughty position that he did in their ranking, as he "consistently outperformed his team mates, and because he never raced for a 'good' team, the standard required to get a high ranking is lower. More specifically C. Fittipaldi's team mates had relatively high rates of retirement: he gains his high ranking by being able to successfully keep a relatively poor car on the track". They might have added, as I outlined above, that teams towards the back may be less likely to prepare two cars to the same mechanical standard, which may have aided Fittipaldi to "outperform his team mates" and "keep a relatively poor car on the track", working on the premise that he often was his team's 'number one'.
Another problem is that the playing field between you and your team mate will not necessarily be level - plenty of F1 drivers, most of the great ones among them, have benefited from strict number one treatment after all.
Moss's low positioning perhaps indicates another of the study's flaws. I dare say that being paired with Fangio for a year in 1955 didn't help him here, nor that he later spent two years in fast but unreliable Vanwalls alongside strong team mates. But there's a bigger possible problem. Indeed the paper hints at it ever so gently when explaining its auxiliary study of how the relative influence of driver and team has changed over time, for which it used data from 1979 onwards only. "The reason for this" it says, "is that, prior to this date, the team-structure of F1 was less clearly defined". But it hasn't taken this to its logical conclusion.
Stirling Moss has not been well rewarded in this study. By AngMoKio - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=37061774
We may also be able to get the beginnings of an explanation of why some drivers routinely considered great, such as Lauda and Mansell, haven't shown up well in this particular study. In addition to having long spells in their careers when they weren't trouncing their team mates, I'd imagine the overall team legacy variable hit them, given they both drove for teams at hardly the most auspicious periods of their existences (in Nigel's case Lotus, in Niki's BRM and Brabham).
There is too a general problem of low bases, which may explain the rather volatile outcomes. As mentioned already, drivers for the most part have only one team mate at a time, and perhaps too Grand Prix careers aren't sufficiently long to give us a robust base of data. After all, our man at the top Fangio started only 51 Grands Prix; Alberto Ascari, who missed out on the top 50 altogether, only 32. Even the very longest F1 career so far runs to only 300-odd races. I dare say there aren't many statistical analyses published that are based on only 51 cases. Or even on only 300.
As for team-year, as my recent Grand Prix Times article outlines, a single season isn't enough time for bad luck, with mechanical unreliability and otherwise, to even out. Bear in mind too that the first F1 season had just six races in it...
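To put a rough number on it, here's a toy simulation (every figure in it invented for illustration) of two drivers of identical ability separated only by random mechanical retirements. Over six races, or even 51, luck alone opens gaps that only hundreds of races begin to wash out.

```python
# Toy simulation: two equally good drivers, identical points per finish, but
# a random 15% DNF chance per race (a made-up figure). How big a gap can
# sheer luck open over careers of different lengths?
import numpy as np

rng = np.random.default_rng(0)

def career_average(n_races, dnf_rate=0.15, points_per_finish=6.0):
    finishes = rng.random(n_races) > dnf_rate  # True where the car survives
    return np.where(finishes, points_per_finish, 0.0).mean()

for n in (6, 51, 300):  # the first season's length, Fangio's starts, a long career
    gaps = [abs(career_average(n) - career_average(n)) for _ in range(10_000)]
    print(f"{n:>3} races: typical luck-only gap = {np.mean(gaps):.2f} pts/race")
```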
Indeed. The biggest problem this study has is that its subject matter of F1 is extremely complicated, with plenty of things influencing it - known and unknown - that for the most part we cannot possibly begin to hope to measure. The further you delve into the past the harder it becomes too. Given all of this, when judging quality we rely on qualitative assessments rather than only on the quantitative.