The link between geography and life expectancy can be traced back to the seventeenth century and the publication in 1662 of John Graunt’s ‘Natural and Political Observations…Made upon the Bills of Mortality’, which established different levels of mortality for rural and city dwellers.
The bills of mortality referred to by Graunt were weekly lists of parish birth and death records which had been published in England since the mid-sixteenth century, but today’s researchers have a more comprehensive set of tools.
The Longevity Index for England (LIFE Index), developed by a team led by Professor Andrew Cairns of Heriot-Watt University, uses modern data science techniques, specifically the random forest machine learning algorithm, to analyse life expectancy at the neighbourhood level.
Despite the use of different technologies, the end result was remarkably similar to Graunt’s findings: a distinct difference in urban and rural mortality levels.
The LIFE Index was first published in 2021 and was updated in 2024 to reflect the impact of changing parliamentary boundaries. Cairns says that while the project was initially intended for pensions and insurance providers, it now has broader utility.
“The starting point seven or eight years ago was that we were looking to see how much variation there was in mortality across the population and was aimed, in particular, at pension applications,” he said.
“And I then realised there was this very rich data set produced by the Office for National Statistics (ONS). That led me off in a slightly different direction in terms of thinking about the inequalities across the population and changed it from a purely pensions perspective to a more general socio-economic project,” he added.
Cairns’ research looked at life expectancy in the 33,000 Lower Layer Super Output Areas (LSOAs) of England and Wales, which are the smallest subdivision used by the UK’s Office for National Statistics and form the basis of the Index of Multiple Deprivation.
The analysis was based on 12 separate variables, such as the proportion of people above age 65 with no qualifications, the crime rate, the average number of bedrooms, and the urban/rural class. Of the 12, two turned out to be of much greater importance.
“Income deprivation and employment deprivation are probably picking up between 80 and 90% of the variation that you see,” he says.
The other important variable is the split between urban and rural populations, which Cairns says can account for roughly the remaining 10% of life expectancy differences, depending on how many variables are analysed.
“Other variables, such as the number of bedrooms in a property, can be used as a proxy for urban/rural, so if you use less than 12 variables then urban/rural is important but this diminishes if you use 12,” he says.
According to Cairns, applying machine learning was a critical part of analysing the data because it allowed non-linear trends to emerge. Medical trials, for example, typically use Cox regression analysis, which has a purely linear approach.
“The random forest algorithm is able to pick up non-linearities in the data, compared with more traditional methods. In general, it gives a better fit across all of the neighbourhoods, whereas the linear methods fit well in some parts of the data and then don’t fit well in other areas,” says Cairns.
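To make that contrast concrete, here is a minimal sketch on invented data. It does not reproduce the LIFE Index model: every number is hypothetical, and a single-split regression tree (a "stump") stands in for a random forest, which averages many deeper trees of this kind. The function names `make_data`, `fit_line`, `fit_stump` and `mse` are illustrative only. The sketch shows how a tree-style split can capture an abrupt change that a straight line can only smooth over.

```python
# Illustrative only: every number below is invented, not LIFE Index or ONS data.

def make_data():
    """Hypothetical neighbourhoods: x is a percentage-type variable,
    y is a life expectancy that shifts abruptly at x = 70."""
    xs = list(range(40, 101))                      # 40, 41, ..., 100
    ys = [82.0 if x < 70 else 80.0 for x in xs]    # step change at the threshold
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda x: a + b * x

def fit_stump(xs, ys):
    """Single-split regression tree: try every threshold and keep the one
    with the smallest squared error. Returns (threshold, predictor)."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return t, (lambda x: lm if x < t else rm)

def mse(predict, xs, ys):
    """Mean squared error of a predictor over the data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = make_data()
line = fit_line(xs, ys)
threshold, stump = fit_stump(xs, ys)
print("linear MSE:", round(mse(line, xs, ys), 3))
print("stump MSE: ", round(mse(stump, xs, ys), 3), "split at", threshold)
```

On this toy data the stump recovers the step exactly, while the straight line is forced to average across it. Real neighbourhood data is noisier and has many more variables, which is where averaging over many trees becomes useful.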
The non-linear approach brought one key trend to light, the ‘healthy immigrant effect’: once the proportion of UK-born people in a neighbourhood falls below 70% of the total population, there is a noticeable increase in life expectancy.
“If you were using a more traditional, linear model, it wouldn’t be able to pick that up. Whereas by using machine learning methods as you move across the scale in terms of unemployment, or other factors, what you can find is that one particular predictive variable might be important in one part of that data set, but not important in another.
“And one example that we’ve picked up on where non-linearity is important is in relation to the proportion of people that were born in the UK,” says Cairns.
“Data from Canada, where they have much higher levels of immigration, shows that when people migrate to your country, they are healthy and are coming to take up work. And that good health seems to last,” he adds.
Cairns emphasises that further research is needed to understand the reasons for this trend, but one plausible explanation is the importance of employment deprivation in life expectancy, a variable which is typically low among immigrants.
People born outside the UK did have lower incomes in retirement, a data point which Cairns says may be helpful to policy makers.
“A possible inference from this is that a lot of immigrants were perhaps coming into lower paid work. And, while they are less likely to be unemployed during their working lives, when they retire, they might feature in the income support variable.
“So that is maybe different from the average person in the UK, where there’s a stronger association between being unemployed and then requiring income support in retirement,” Cairns says.
One limitation of the LIFE Index is that the ONS data only covers England and Wales, with Scotland and Northern Ireland having slightly different approaches to analysing deprivation. Cairns says his team is looking at how to incorporate additional data from the two countries, but he does not expect this to change the outcomes.
He also says that despite the project’s shift from its initial focus to a more expansive remit that provides additional tools for policymakers, the research still has significant value for industry experts.
“The index is useful for policymakers because they are spending scarce resources to improve health outcomes. And our LIFE App provides a great tool that pinpoints the areas that need the most effort.
“In the pensions, or life insurance, context it has a slightly different use because these organisations have their own data and resources and they will understand the characteristics of the pensioners, or policyholders, in detail. Nevertheless, we believe that the LIFE Index can be used alongside this other data to enhance valuations and pricing,” he says.