Accredited official statistics

Proposed changes to drink-drive methodology: technical note

Published 31 July 2025

Overview

The road safety statistics team has been reviewing the methodology used to estimate the number of drink-drive collisions and casualties. This document gives an overview of the current method and details a new statistical model that has been built to improve estimates of drink-drive collisions which result in a deceased driver.

We are using this document to consult expert users of the statistics and other statisticians to gain their insight on whether to implement these proposed changes into the process for producing the drink-drive statistics.

Subject to feedback received, we intend to implement the new method in the production of the 2024 drink-drive statistics which are scheduled for publication in July 2026.

We welcome any comments from users of the statistics via our short .

Current methodology

The current methodology is detailed in the methodology note. In summary, due to missing data, the statistics calculated from the data available need to be scaled up to estimate the actual number of drink-drive collisions and casualties.

There are two types of data that are used:

  • Toxicology data for deceased drivers or riders

  • Breath test data for drivers who did not die in the collision

Toxicology data can be missing because samples are unsuitable (for example, when they are collected more than 12 hours after the collision) or because results are not returned by coroners or procurators fiscal. Breath test data can be missing for drivers who left the scene of the collision before a police officer attended.

To scale the statistics up, we calculate scaling factors and apply them to the relevant collisions and to the casualties in those collisions.

  • One scaling factor is calculated for collisions with a deceased driver who was detected to have a blood alcohol level above the legal limit.

  • Three separate scaling factors are calculated for collisions with a failed or refused breath test and no deceased drivers or riders over the limit, one for each severity classification of the collisions, fatal, serious or slight.
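
As a rough illustration of what scaling up means, the sketch below assumes a scaling factor is the ratio of all drivers in a group to those with a usable result. This ratio, the function names and the numbers are all assumptions made for illustration; the exact calculation is set out in the methodology note and is not reproduced here.

```python
def scaling_factor(total_drivers: int, drivers_with_result: int) -> float:
    """Ratio used to scale known results up to all drivers (illustrative assumption)."""
    if drivers_with_result <= 0:
        raise ValueError("need at least one driver with a usable result")
    return total_drivers / drivers_with_result


def scaled_estimate(observed_over_limit: int, total_drivers: int,
                    drivers_with_result: int) -> float:
    """Scale the observed over-limit count up to cover drivers with missing data."""
    return observed_over_limit * scaling_factor(total_drivers, drivers_with_result)


# Made-up numbers: 1,000 deceased drivers, 800 with a usable toxicology result,
# 120 of whom were over the limit.
estimate = scaled_estimate(observed_over_limit=120, total_drivers=1000,
                           drivers_with_result=800)
print(estimate)  # 150.0
```

A single factor of this kind treats every driver with missing data as equally likely to be over the limit, whatever the circumstances of the collision.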

Scope for improvement

Due to the lack of data available for drivers who left the scene, it is difficult to improve that part of the scaling methodology.

However, there is data available for the deceased drivers for whom we did not receive a suitable blood alcohol value. By using information about the driver and the collision, we can estimate the probability that each of these drivers was over the drink-drive limit. This gives better estimates of the overall number of drink-drive casualties than applying a single scaling factor, which does not account for the known characteristics of the driver.

Proposed method

We identified logistic regression as an appropriate statistical method for predicting the probability of a deceased driver being over the drink-drive limit, because the response variable is binary (over or under the limit). Logistic regression can also include multiple explanatory variables, which helps account for the characteristics of the driver and the collision.
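
For reference, the standard form of a logistic regression model can be written as below. The notation is ours rather than taken from the model documentation: p_i is the probability that driver i is over the limit, x_ij is the value of explanatory variable j for that driver (with categorical variables entered as indicator variables), and the beta terms are the coefficients to be estimated.

```latex
\log\!\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij},
\qquad
p_i = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{j=1}^{k} \beta_j x_{ij}\right)}
```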

The STATS19 database has a lot of data for each collision so, to keep the model manageable, we identified a collection of variables which seemed relevant to whether a driver would be more likely to be over the limit. These were:

  • Sex of the driver
  • Age of the driver
  • Road user type
  • Number of vehicles in collision
  • Number of casualties in collision
  • Road class (for example motorway or A road)
  • Road type (for example roundabout or single carriageway)
  • Light conditions
  • Weather conditions
  • Road conditions
  • Number of pedestrian casualties
  • Hour of the day
  • Day of the week
  • Day of the month
  • Month
  • Year
  • Region

We trained a model on the drivers for whom we received suitable blood alcohol data, using all of the variables above. We then conducted backward selection to retain only significant variables. This was initially done using chi-squared tests. Once all remaining variables were significant, analysis of deviance was used to identify whether each variable significantly improved the fit of the model. Any variables that did not significantly improve the fit were also removed.
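
The analysis of deviance step can be sketched as follows. This is a minimal illustration, not the code used to build the model: the outcomes and fitted probabilities are made up, the function names are hypothetical, and the p-value formula assumes the removed variable uses a single degree of freedom (a categorical variable with several levels would use more).

```python
import math


def deviance(outcomes, fitted_probs):
    """Binomial deviance: -2 times the log-likelihood of 0/1 outcomes."""
    log_lik = 0.0
    for y, p in zip(outcomes, fitted_probs):
        log_lik += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * log_lik


def p_value_one_df(deviance_drop):
    """Upper tail of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(deviance_drop, 0.0) / 2.0))


# Made-up outcomes (1 = over the limit) and illustrative fitted probabilities
# standing in for two nested models, with and without one variable.
outcomes = [1, 0, 0, 1, 0, 0, 1, 0]
probs_with_variable = [0.8, 0.2, 0.1, 0.7, 0.3, 0.2, 0.6, 0.1]
probs_without_variable = [0.375] * 8  # intercept-only fit: the observed proportion

drop = deviance(outcomes, probs_without_variable) - deviance(outcomes, probs_with_variable)
p = p_value_one_df(drop)
print(f"deviance drop = {drop:.2f}, p = {p:.4f}")
```

A small p-value suggests that the variable improves the fit and should be kept; a large one suggests it can be removed without materially worsening the model.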

This resulted in a model which included:

  • Hour of the day
  • Number of vehicles in the collision
  • Day of the week
  • Age of the driver
  • Sex of the driver
  • Region
  • Road class
  • Road type
  • Road conditions
  • Day of the month

Finally, to test if we could use a simpler model that was easier to interpret, we removed the least significant variables from the model. The effect of this on the results was tested and seen to make a marginal difference. Therefore, the model was simplified to only account for:

  • Hour of the day
  • Number of vehicles in collision
  • Day of the week
  • Age of driver
  • Sex of driver
  • Region
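
One way a fitted model of this kind could feed into the estimates is sketched below: each deceased driver without a suitable result contributes their predicted probability of being over the limit, and these are added to the count of drivers known to be over the limit. This aggregation step is our assumption about how the probabilities are used. The coefficients, the simplified indicator variables and all of the numbers are hypothetical and are not the fitted values from the model described here.

```python
import math


def probability_over_limit(intercept, coefficients, features):
    """Inverse-logit of the linear predictor for one driver."""
    linear = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients on the log-odds scale; NOT the fitted values from the note.
INTERCEPT = -2.0
COEFFICIENTS = {"night_time": 1.5, "single_vehicle": 1.0, "male": 0.5}

# Deceased drivers with no suitable blood alcohol result (made-up indicator values).
drivers_missing_result = [
    {"night_time": 1, "single_vehicle": 1, "male": 1},
    {"night_time": 0, "single_vehicle": 0, "male": 1},
    {"night_time": 1, "single_vehicle": 0, "male": 0},
]

known_over_limit = 120  # made-up count of drivers with a result above the limit

# Expected number over the limit among the drivers with missing data.
expected_missing = sum(
    probability_over_limit(INTERCEPT, COEFFICIENTS, d) for d in drivers_missing_result
)
estimated_total = known_over_limit + expected_missing
print(round(estimated_total, 2))
```

Unlike a single scaling factor, this gives a driver in a late-night single-vehicle collision a higher contribution than a driver in a daytime multi-vehicle collision.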

Model assumptions

  • The binary nature of the dependent variable is satisfied by the nature of the data; drivers can be over or under the limit.
  • The data are assumed to be independent because we would not expect a driver's blood alcohol level to be affected by that of another driver involved in the same collision.
  • Multicollinearity was checked using variance inflation factors, which did not identify any issues with the model.
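
The variance inflation factor check can be illustrated with the sketch below. It is a simplified example under stated assumptions: the predictors are made-up numeric values, the function names are hypothetical, and the calculation uses the identity that each variance inflation factor is the corresponding diagonal element of the inverse correlation matrix. Categorical variables such as region would normally need a generalised form of the statistic, which is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF for each predictor: the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[1.0 if a == b else pearson(columns[a], columns[b]) for b in names]
            for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Made-up numeric predictors for six drivers.
predictors = {
    "age": [22, 35, 47, 51, 63, 29],
    "hour": [23, 2, 14, 9, 17, 1],
    "vehicles": [1, 1, 2, 3, 2, 1],
}
for name, vif in variance_inflation_factors(predictors).items():
    print(f"{name}: {vif:.2f}")
```

Values close to 1 indicate little overlap between a predictor and the others; commonly quoted rules of thumb treat values above about 5 or 10 as a sign of problematic multicollinearity.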

Results

Chart 1 below compares the fatality estimates using the current method, the full significant model and the simplified model. There is little difference between the two models, which suggests that we should use the simplified model because it is easier to interpret.

However, both models result in estimates that are different from the current method, most notably in 2015, where there is currently an unexplained dip in the estimated number of fatalities. Both models seem to correct for this anomaly and therefore could be improvements on the current method.

Chart 1: Estimated number of drink-drive fatalities in Great Britain 2013 to 2023 using different methods

Conclusion and next steps

This note outlines the technical details of a proposed change to the methodology for calculating the number of fatalities in drink-drive collisions. While the overall impact of the change is relatively small, we believe it represents an improvement on the current approach. It uses more of the available information to estimate cases where a blood alcohol reading was not available for a deceased driver, resulting in a more consistent trend over time.

We intend to make this change alongside publication of the 2024 statistics in July 2026, including revisions to previously published figures, but welcome any feedback from users of the statistics or other interested stakeholders.

Contact details

Road safety statistics