Improving Wikipedia's Tour de France Prize Money Plot
The Tour de France is the most important bike race of the year, and it is therefore the race with the most prize money awarded. Wikipedia has this plot showing how that prize money has grown over the years:
The plot is pretty good, at least at first glance! It is (appropriately) a log plot.[1] It labels all of its pieces. It even has gaps for when the race was not held. But the plot also has a few problems:
- The X-axis is wrong: the race did not start before 1900, and the two gaps are from the World Wars, which did not happen in 1898 and 1921.
- The text is too small to read easily at Wikipedia’s default 200px image size.
- The axis labels are redundant and tick labels have a lot of zeroes.
I decided to fix it up using data from Bike Race Info, like I did last time when I fixed the Hour Record Plot.
Improvements
Here is my version:
The code that generated the improved plots can be found here (rendered on GitHub). The data is here, and the code that cleaned the data is here (rendered on GitHub).
I fixed the X-axis so that the dates are now correct! The first race is in 1903 as expected. I have also removed the axis label because I think the tick labels make it clear what is plotted. I have added my (patent pending 😛) grey stripes to the background to indicate each decade.
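The decade stripes are easy to compute. The following is a minimal Python sketch of the idea, not the code linked above. The function name `decade_stripes` and the choice to shade even-numbered decades (1900s, 1920s, ...) are assumptions made for illustration.

```python
def decade_stripes(first_year, last_year):
    """Return (start, end) year spans for every other decade in the range.

    Which decades get shaded is an assumption; here it is the
    even-numbered ones (1900s, 1920s, 1940s, ...).
    """
    # Round down to the first year of the decade containing first_year.
    start = first_year - first_year % 10
    stripes = []
    for decade in range(start, last_year + 1, 10):
        if (decade // 10) % 2 == 0:
            stripes.append((decade, decade + 10))
    return stripes
```

Each span could then be drawn behind the data, for example with matplotlib's `Axes.axvspan(start, end, color="grey", alpha=0.2)`.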
I changed the Y-axis to be more readable by abbreviating the numbers using K and M. I also removed the label and replaced it with the euro symbol (€) on each tick.
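Tick labels like these can be produced by a small formatting function. This is a hedged sketch in Python rather than the code linked above. The name `format_euros` and the placement of the € symbol before the number are assumptions.

```python
def format_euros(value, pos=None):
    """Abbreviate a euro amount for a tick label.

    Thousands get a K suffix and millions get an M suffix.
    Putting the euro sign first is an assumption about the style.
    """
    for threshold, suffix in ((1_000_000, "M"), (1_000, "K")):
        if value >= threshold:
            scaled = value / threshold
            # The 'g' format drops trailing zeroes: 20.0 -> '20', 2.5 -> '2.5'.
            return f"€{scaled:g}{suffix}"
    return f"€{value:g}"
```

The optional `pos` argument is there so the function matches the `(value, position)` signature that matplotlib's `FuncFormatter` expects, although the function works fine on its own.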
I made all the text larger and the lines thicker to improve legibility when the plot is downscaled. I have also changed from a line plot to a step plot because the amount of prize money changes at specific moments in time, not continuously.
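To make the difference concrete, here is a small Python sketch (an illustration, not the code linked above) that expands yearly values into the vertices of a step curve. The function name `step_points` is hypothetical, and holding each value until the next year's prize is announced is an assumption about how the steps are drawn.

```python
def step_points(years, prizes):
    """Expand (year, prize) samples into the vertices of a step curve.

    Each prize is held constant until the next year, then jumps,
    instead of being interpolated along a sloped line.
    """
    points = []
    for i, (year, prize) in enumerate(zip(years, prizes)):
        if i > 0:
            # Horizontal segment: keep the previous prize up to this year.
            points.append((year, prizes[i - 1]))
        # Vertical jump (or first point): the prize that applies from this year.
        points.append((year, prize))
    return points
```

With made-up inputs `step_points([1, 2, 3], [10, 20, 15])`, the value stays at 10 until year 2 and only then jumps to 20, whereas a line plot would imply a prize of 15 at year 1.5. matplotlib's `Axes.step(years, prizes, where="post")` draws the same shape directly.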
Finally, I have cleaned up the data a bit. The original plot used uncorrected euros even though the original prizes were in old francs, new francs, and euros depending on the year. I have normalized all values to 2013 euros. I have included this information in the subtitle so that it survives even if the plot is separated from its caption on Wikipedia.
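One way such a normalization could work is sketched below in Python. This is not necessarily how the linked cleaning code does it. The sketch assumes the standard conversions (100 old francs to 1 new franc from the 1960 redenomination, and 6.55957 francs to 1 euro) and a caller-supplied price index expressed in consistent units across all years. The function names and the index values in the usage note are hypothetical.

```python
OLD_FRANCS_PER_NEW_FRANC = 100  # 1960 redenomination
FRANCS_PER_EURO = 6.55957       # fixed rate used when France adopted the euro


def to_nominal_euros(amount, currency):
    """Convert an amount to euros without any inflation adjustment."""
    if currency == "old_franc":
        return amount / OLD_FRANCS_PER_NEW_FRANC / FRANCS_PER_EURO
    if currency == "new_franc":
        return amount / FRANCS_PER_EURO
    if currency == "euro":
        return amount
    raise ValueError(f"unknown currency: {currency!r}")


def to_2013_euros(amount, currency, year, price_index, base_year=2013):
    """Convert an amount to euros, then rescale to base-year prices.

    price_index maps year -> price level and must be supplied by the
    caller; this sketch does not include any real inflation data.
    """
    nominal = to_nominal_euros(amount, currency)
    return nominal * price_index[base_year] / price_index[year]
```

For example, with a made-up index of `{2002: 80.0, 2013: 100.0}`, a 100 euro prize from 2002 comes out as 125 euros in 2013 terms.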
Overall I think it is an improvement, so I have contributed it back to the community here.
[1] Inflation is exponential, so steady growth shows up as a straight line on a log scale.