Friday, September 10, 2010

Good Data, Bad Conclusions


As someone with a background in science and mathematics, I often read reports and studies and question the conclusions that are reached. In all fairness, true studies tend to be pretty careful, but we typically hear about these reports via a synopsis in the public media, which inevitably leaves out the details and often draws obvious but bad conclusions from the study results.

So today I came across Allstate's annual Best Drivers Report. The full list of the city-by-city breakdown can be found here. I initially read about it here, where the article's author stated "Once again, DC boasts the country's worst drivers." Even Allstate themselves state that the purpose of the report is to "identify which cities have the safest drivers." They further state that a goal of the report is to "facilitate an on-going dialog on safe driving."

So what's the problem? I can find a number of problems with the methods used in this report, such as the assumption that Allstate claims data is representative of all accidents, which may or may not be true. It is also possible that, because of rate differences in each state, safer drivers flock to Allstate in some states but have less of an incentive to do so in others. This would certainly skew the results. That said, I realize that they wanted to put numbers together based on the data they had, so I'll ignore this problem.
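
To make that selection-bias point concrete before setting it aside, here is a toy simulation. This is not Allstate's methodology, and all the rates are made-up numbers: two cities whose drivers on the road are statistically identical can still show different per-customer claim frequencies if one city's insured pool happens to skew toward safer drivers.

    import random

    random.seed(0)

    # Toy numbers (assumptions, not from the report): two driver types with
    # different annual accident odds. Both cities have the same mix of drivers
    # on the road; only the insurer's customer mix differs.
    SAFE_RATE, RISKY_RATE = 0.05, 0.20
    N_CUSTOMERS = 100_000

    def claims_per_customer(share_safe_in_pool):
        """Simulated claims frequency when a given share of the insured pool
        happens to be 'safe' drivers."""
        claims = 0
        for _ in range(N_CUSTOMERS):
            rate = SAFE_RATE if random.random() < share_safe_in_pool else RISKY_RATE
            claims += random.random() < rate
        return claims / N_CUSTOMERS

    # City A: pricing happens to attract mostly safe drivers to this insurer.
    # City B: the insurer gets an even cross-section of the same population.
    print("City A claims frequency:", claims_per_customer(0.80))
    print("City B claims frequency:", claims_per_customer(0.50))

The ranking in this sketch reflects who buys the insurance, not how well the city drives.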

The real problem is that what this report actually tells us is "in which city is it safest to drive" and not which city has the safest drivers. Here's why:
  • The report does show that some cities have a higher incidence of accidents than others, but it emphatically DOES NOT tell us why. The report assumes that the only (or primary) cause is the quality of the drivers, but other possibilities include traffic patterns, road conditions, density of cars on the road, etc., which the report completely and conveniently ignores.
  • By this report one would believe that bad drivers love big cities, since no city with a population over 1 million falls below the 50th percentile. I find this hard to believe. More likely, cities with lower density simply offer fewer opportunities for collisions and greater margins for error than cities with higher vehicular densities (there's a toy sketch of this point after the list).
  • The report suggests that cities on the lower half of the list should "fix" something. Although this may be true, it doesn't necessarily follow. As long as humans continue to drive, there will be accidents. We could take draconian measures to lower accidents to virtually zero by erecting barriers between lanes, instituting a nationwide 5 mph speed limit, and installing stop lights at every corner of every street that allow only one car to proceed at a time. Although this would certainly reduce accidents, the cost of doing so would be counterproductive, as the nation would come to a standstill. So just because the cities in this report could improve doesn't mean that they necessarily should.
  • It also seems to me that road conditions such as lane size, traffic lights, stop signs, visibility at intersections, and other factors are more likely to yield improvements in safety than simply telling drivers they need to drive better. This report seems better targeted at city transportation departments than at individual drivers.
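Here is the toy sketch promised above (again, made-up numbers, purely illustrative): give every driver in two cities exactly the same chance of a crash per close encounter with another car, so driver quality is identical by construction, and vary only the traffic density. The denser city still racks up more accidents.

    import random

    random.seed(1)

    # Toy numbers (assumptions): every driver in both cities has the *same*
    # chance of a crash per close encounter with another car, i.e. identical
    # driver quality. Only the traffic density differs between cities.
    P_PER_ENCOUNTER = 0.0005   # hypothetical crash odds per close encounter
    TRIPS_PER_DRIVER = 200
    N_DRIVERS = 5_000

    def share_of_drivers_with_accident(encounters_per_trip):
        """Fraction of drivers who have at least one accident, given how many
        close encounters a typical trip involves in that city."""
        # Probability that a single trip ends in a crash.
        p_trip_crash = 1 - (1 - P_PER_ENCOUNTER) ** encounters_per_trip
        count = 0
        for _ in range(N_DRIVERS):
            for _ in range(TRIPS_PER_DRIVER):
                if random.random() < p_trip_crash:
                    count += 1
                    break
        return count / N_DRIVERS

    # Same drivers, different density: sparse city vs. dense city.
    print("Sparse city (2 encounters/trip):", share_of_drivers_with_accident(2))
    print("Dense city (10 encounters/trip):", share_of_drivers_with_accident(10))

If a report like this one ranked these two toy cities, the dense city would look like it was full of "bad drivers" even though, by construction, the drivers are identical.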
That said, I like the report in that it does tell us the relative safety of driving in various cities. It also gives us some clue as to which cities might want to evaluate whether changes are in order. But it definitely does not tell us which cities have bad versus good drivers.
