Driven by technological progress, human life expectancy has increased greatly since the nineteenth century. Demographic evidence has revealed an ongoing reduction in old-age mortality and a rise of the maximum age at death, which may gradually extend human longevity (refs. 1, 2). Together with observations that lifespan in various animal species is flexible and can be increased by genetic or pharmaceutical intervention, these results have led to suggestions that longevity may not be subject to strict, species-specific genetic constraints. Here, by analysing global demographic data, we show that improvements in survival with age tend to decline after age 100, and that the age at death of the world’s oldest person has not increased since the 1990s. Our results strongly suggest that the maximum lifespan of humans is fixed and subject to natural constraints.
Evidence for a limit to human lifespan
Publons score (from 5 scores)
4.7 | Quality
7.4 | Significance
There is an interesting writeup on this paper, including an interview with the reviewers, online: https://www.nrc.nl/nieuws/2016/12/09/how-weak-science-slipped-past-through-review-and-landed-in-a-top-journal-a1535637
Reanalysis of the evidence for a limit to human lifespan
In this analysis I look specifically at figure 2, which argues that the maximum age at death has plateaued.
I downloaded the data from the International Database on Longevity at the Max Planck Institute for Demographic Research. The terms of the data access do not permit third party sharing so the raw data is not uploaded to GitHub but you can download it yourself if you want to rerun the following analyses.
First I load the data into R, tidy up some of the columns, and subset to the same individuals used in the paper. (Not sure why they didn’t just use all 668 rather than just 534.) Here is the breakdown by country:
Now let’s recreate figure 2A.
The authors of the paper fitted two separate regression lines to this data arguing that after 1995 there was a change in the trend (a seemingly arbitrary choice of breakpoint - the choice of a broken vs linear trend has been analysed elsewhere).
You can see from the confidence intervals on the regression lines that the gradient for the second segment is actually consistent with being the same as the first segment. In the paper the authors calculate a p-value of 0.27 for the gradient of the second segment (null hypothesis = 0) and conclude “no further increases were observed”. They apply the same reasoning in a reply to a post-publication review on Publons: “The latter is not significant, so we conclude that the MRAD is essentially flat”. However, you cannot accept the null hypothesis based on p > 0.05; you can only reject it or fail to reject it. In this case a p-value greater than 0.05 suggests that there is not enough data to conclude that the gradient is different from 0 (perhaps the null hypothesis should really be that the gradient is the same as the first segment, although that p-value is still non-significant). The 95% confidence interval for the second segment gradient is −0.83 to +0.20, which includes the first segment's point estimate of 0.15 (using non-rounded age values here).
CI: (2.5 %, 97.5 %): (-0.827, 0.196)
First segment point estimate: 0.153
P-value when H0 = 0.1533: 0.068
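To make the test against a non-zero null hypothesis concrete, here is a minimal Python sketch (the original analysis was done in R, and this is not the code from the repository). It fits an ordinary least-squares line, then tests the slope against an arbitrary `null_slope` with a two-sided t-test. The function names are my own, and the t-distribution tail is integrated numerically only to keep the example free of third-party libraries.

```python
import math

def ols_slope(x, y):
    """Return (slope, standard error of slope, degrees of freedom) for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se = math.sqrt(sse / df / sxx)
    return slope, se, df

def t_two_sided_p(t, df, steps=4000):
    """Two-sided p-value from Student's t, via Simpson integration of the pdf."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # total * h / 3 approximates the area between 0 and |t|
    return max(0.0, 1 - 2 * total * h / 3)

def slope_p_value(x, y, null_slope=0.0):
    """P-value for H0: slope == null_slope (which need not be zero)."""
    slope, se, df = ols_slope(x, y)
    return t_two_sided_p((slope - null_slope) / se, df)
```

Calling `slope_p_value(years, ages, null_slope=0.1533)` on the second segment would correspond to the comparison against the first segment's gradient reported above, whereas the default `null_slope=0.0` corresponds to the test the authors report.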
However this analysis is quite sensitive to the choice of breakpoint. In the above-mentioned review response the authors re-analysed the data and found that a breakpoint of 1999 was a better fit. Although the package used (“segmented”) fits a continuous piecewise regression, I will continue using the method above to illustrate a different choice of date anyway. Replotting the regression lines using this breakpoint shows the confidence intervals more clearly supporting the downward trend, and the p-value (with H0 = 1st segment gradient) is now significant at the 0.05 threshold. The upper bound of the 95% confidence interval is now −0.2, suggesting a downward trend (rather than a plateau).
First segment gradient point estimate: 0.193
CI: (2.5 %, 97.5 %): (0.095, 0.290)
Second segment gradient point estimate & confidence intervals: -0.687
CI: (2.5 %, 97.5 %): (-1.16, -0.21)
P-value for second segment when H0 = 0.193: 0.0040
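The sensitivity to the breakpoint can be sketched in a few lines of Python. This is an illustration only: it fits two separate (discontinuous) least-squares lines either side of a candidate year, and I assume here that the breakpoint year itself belongs to the second segment, which the paper does not specify. The `toy` series is made up and is not the IDL data.

```python
def fit_line(points):
    """Least-squares (slope, intercept) for a list of (year, age) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def segment_slopes(points, breakpoint):
    """Slopes of separate lines fitted before, and from, the breakpoint year."""
    first = [p for p in points if p[0] < breakpoint]
    second = [p for p in points if p[0] >= breakpoint]
    return fit_line(first)[0], fit_line(second)[0]

# Made-up series for illustration: steady growth plus a repeating wobble.
toy = [(1970 + i, 110 + 0.1 * i + [0.0, 0.8, -0.6, 0.3][i % 4]) for i in range(38)]
for bp in (1995, 1997, 1999):
    s1, s2 = segment_slopes(toy, bp)
    print(bp, round(s1, 3), round(s2, 3))
```

Even on a series whose underlying trend never changes, the second-segment slope moves around as the breakpoint shifts, because so few points fall after it.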
Higher order maximums (2nd, 3rd etc)
The authors note that because each of these data points is just a single individual, the apparent plateau they observe could be due to random fluctuation. To strengthen their argument they looked at the 2nd highest reported age at death, 3rd highest etc and claimed that these series showed the same pattern. However, the data points were only plotted for the 1st MRAD, and only cubic smoothing splines were shown for the remaining series. Fitting a cubic spline could be misleading / overfitting, and each series should probably be processed in the same manner as figure 2A if one is to conclude that they show the same pattern. Below I plot each series individually so the actual data is visible. The cubic splines show downward trends towards the end, although with increasing uncertainty and linearity. Similarly with the linear regressions, the gradients of the second segments are lower than those of the first segments, although with increasing consistency between the two (note variable y-axis).
Mean age of death
In another alternate approach the authors looked at all individuals in the dataset to calculate mean age of death and concluded that the annual average age of supercentenarians had not increased since 1968 (the start of the dataset). I recreate their plot below but with the addition of error bars representing the standard error of the mean for each point in order to visualise the uncertainty in the values.
You can see that for the earlier points there are no error bars; this is because there is only a single data point for those years. It is therefore quite misleading to give each mean equal weighting by fitting a cubic spline to point estimates of the means alone.
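For reference, the per-year mean and standard error of the mean can be computed as in the following Python sketch (my own helper, not the repository code). The standard error is the sample standard deviation divided by the square root of the number of deaths that year, so it is undefined for a year with a single record and is returned as `None`.

```python
import math
import statistics
from collections import defaultdict

def yearly_mean_and_sem(records):
    """records: iterable of (year_of_death, age_at_death) pairs.

    Returns {year: (mean, sem, n)}; sem is None when the year has one record.
    """
    by_year = defaultdict(list)
    for year, age in records:
        by_year[year].append(age)
    out = {}
    for year, ages in sorted(by_year.items()):
        n = len(ages)
        sem = statistics.stdev(ages) / math.sqrt(n) if n > 1 else None
        out[year] = (statistics.mean(ages), sem, n)
    return out
```

Keeping `n` alongside each mean makes it easy to weight the points, or at least to flag the single-record years, before fitting any smooth curve through them.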
A perhaps fairer approach is to recreate the graphs but using the whole dataset (note the dataset does not include anyone who died younger than 110). In this form the uncertainty in the first and last few years is much clearer, and the dip pattern fitted above is much less convincing. I would argue that a linear regression fits the data just as well and this gives an increase of ~ 0.04 years per year.
In the study the authors analysed maximum reported age of death (MRAD) over different years but the data for each year was from a different combination of countries and hence the sample size varies. One therefore might expect that the MRAD could change solely due to variation in the sample size (we are more likely to see high maximums when there is more data). Here I investigate the effect of using different sample sizes on the MRAD.
To get an equation for the distribution of age at death we can fit a generalised extreme value distribution to data from the UK Office for National Statistics (which fits much better than a normal distribution).
We will also need to estimate the sample size (number of deaths) for each year in each country. For this I multiplied the World Bank crude death rate by population size. We can then see how the total sample size varies over time in the original paper's analysis.
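As a rough sketch of that estimate in Python (not the repository code): the World Bank crude death rate is expressed per 1,000 people, so the rate has to be divided by 1,000 before multiplying by the population. The function names and the row layout are my own assumptions.

```python
def estimated_deaths(crude_death_rate_per_1000, population):
    """Approximate number of deaths in one country in one year."""
    return crude_death_rate_per_1000 * population / 1000

def total_sample_size(rows):
    """rows: iterable of (year, crude_death_rate_per_1000, population).

    Sums estimated deaths over all countries contributing data in each year.
    """
    totals = {}
    for year, rate, population in rows:
        totals[year] = totals.get(year, 0.0) + estimated_deaths(rate, population)
    return totals
```

Only the countries that actually contribute records in a given year should be passed in for that year, since the set of countries changes over the study period.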
The trend is similar to the regression lines they fit and so any bias from sample size would result in an overestimate in their favour for the gradient of both of the regression lines. However the effect of sample size on MRAD is probably not linear - maybe the population sizes used are large enough that the MRAD is effectively independent. With the sample size and an equation for the distribution of age at death we can now calculate the probability distribution of MRAD (more formally the nth order statistic) for different sample sizes. First let us look at the distributions of MRAD for the estimated minimum and maximum sample size used in the study.
This shows we might expect a difference of over a year in the MRAD due to the change in sample size alone (dashed lines indicate mode). We can also look at how the modal MRAD changes over many different sample sizes.
The modal MRAD increases sharply at first and then starts to plateau once the sample size increases to millions of deaths. The dashed lines indicate the estimated minimum and maximum sample sizes used in the study. A double log distribution fits this curve well for reasonable sample sizes (>20).
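The dependence of the maximum on sample size follows from the order-statistic identity P(max of n deaths ≤ x) = F(x)^n, where F is the fitted distribution of age at death. The Python sketch below illustrates this with a generalised extreme value distribution. Two caveats: the parameters `MU`, `SIGMA` and `XI` are hypothetical placeholders, not the values fitted to the ONS data, and for simplicity I track the median of the maximum (which has a closed form) rather than the mode used in the plots.

```python
import math

# Hypothetical GEV parameters, for illustration only (not fitted values).
MU, SIGMA, XI = 80.0, 12.0, -0.3

def gev_cdf(x, mu=MU, sigma=SIGMA, xi=XI):
    """CDF of a generalised extreme value distribution (xi != 0)."""
    z = 1 + xi * (x - mu) / sigma
    if z <= 0:
        # Outside the support: above the upper bound (xi < 0) or below the lower bound.
        return 1.0 if xi < 0 else 0.0
    return math.exp(-z ** (-1 / xi))

def max_cdf(x, n, mu=MU, sigma=SIGMA, xi=XI):
    """P(maximum of n independent draws <= x) = F(x) ** n."""
    return gev_cdf(x, mu, sigma, xi) ** n

def median_of_max(n, mu=MU, sigma=SIGMA, xi=XI):
    """Solve F(x) ** n = 0.5 using the closed-form GEV quantile function."""
    return mu + sigma / xi * ((math.log(2) / n) ** (-xi) - 1)

for n in (10**4, 10**5, 10**6, 10**7):
    print(n, round(median_of_max(n), 2))
```

With a negative shape parameter the distribution has a finite upper bound (here MU - SIGMA / XI = 120), so the typical maximum climbs quickly for small n and then flattens as n reaches the millions, which is the qualitative behaviour described above.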
We can also plot the difference from the mean MRAD for each year in the study based on changing sample size alone.
Hence the sample sizes used would probably have a noticeable although small effect on the MRAD, and a correction would slightly weaken the authors' conclusions by reducing the gradient of both regression lines. Even though the effect is moderate it would have been nice to see an analysis of this type reported in the study.
Whether or not there is a genuine limit rather than a temporary fluctuation would be clearer if there were more than 7-10 years of data beyond the breakpoint. Given that it is now a decade on, perhaps there is new data available, for example from the USA Death Master File.
This analysis used readily available data and the code used is available at https://github.com/daniel-wells/human-lifespan-limit
A cohort is not representative of humanity
In the freshly published research letter, Dong, Milholland, and Vijg (DMV) reported that they found strong evidence for a limit to human lifespan. Analyzing data from the International Database on Longevity, they found that the yearly maximum reported age at death (MRAD, i.e. the age at death of the oldest person who died in a given year) stopped increasing in the mid-1990s, reaching a plateau at around 115 years. Even though the authors acknowledge that the data on “the supercentenarians <…> are still noisy and made of small samples”, they feel safe to conclude that “the results strongly suggest that the human lifespan has a natural limit”. I argue that the results and conclusions of the study are likely to be caused by a mere data artifact, and that they can hardly be generalized to humanity as a whole.
The authors chose to divide the study period at the year 1995, which is an arbitrary decision. Yet this decision had a strong effect on the results and conclusions. The conclusions rest primarily on linear regression trends for the two sub-periods (figure 1, lines 1 and 2). The main conclusion is derived from the negative slope for the second sub-period (figure 1, line 2), which is largely explained by the high outliers (1997 and 1999) and the low outliers in the first and the last two years (1995, 2006 and 2007). When those outliers are omitted, the slope for the second sub-period flattens greatly (regression coefficients for lines 2 and 3 are -0.36 and -0.11).
Figure 1. Reported age at death of supercentenarians
NOTE. The yearly maximum reported age at death (MRAD). The lines represent the fitted linear regressions. All data were collected from the IDL database (all 15 countries included, 1968–2007, n = 668). Unlike DMV, I use all the death records from IDL available on 9 October 2016, not just the data for France, Japan, UK, and US. Unlike DMV, I take MRAD values as age at death in days divided by 365.25 to convert into years, rather than the values rounded to whole years. These data decisions of DMV are not explained or justified in the paper and, in my view, are not optimal. The outliers are identified with Cook’s distance in two steps. Cook’s distances are 0.57 and 0.10 for the observations 1997 and 1999 in model 2. After the removal of the high outliers, Cook’s distances for the observations 1995, 2006, and 2007 are 1.50, 0.19, and 0.46, respectively.
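For readers unfamiliar with Cook’s distance, the following Python sketch shows the standard formula for a simple linear regression, D_i = e_i² / (p·s²) · h_i / (1 − h_i)², where e_i is the residual, h_i the leverage, s² the residual variance and p = 2 the number of fitted parameters. It is an illustration written for this text, not the R code from the repository.

```python
def cooks_distances(x, y):
    """Cook's distance of every observation in the regression y = a + b*x."""
    n, p = len(x), 2  # p counts the fitted parameters: intercept and slope
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance
    out = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - mx) ** 2 / sxx  # leverage of this observation
        out.append(e * e / (p * s2) * h / (1 - h) ** 2)
    return out
```

Observations at the edges of a short period have high leverage, which is why the first and last years of the second sub-period can dominate its slope.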
With the outliers omitted, there is only a tiny difference between the trend lines for the first sub-period and the whole period (figure 1, lines 1 and 4, regression coefficients are 0.15 and 0.12). Seeing how volatile the data are, it seems too hasty to draw humanity-wide conclusions from this type of analysis. Imagine that we had these data only up to 1991. A similarly arbitrary division of the study period at the year 1981 would have shown that the growth in MRAD had stopped and even reversed (figure 1, line 5). We would then have concluded that there was strong evidence for “the limit of human lifespan” at around 113.5 years. Yet the following decade and a half would have proven that we had misinterpreted the development of MRAD.
However, the haste of conclusions based on highly vulnerable linear regression estimates is not the only problem with the paper. I believe it is essentially incorrect to draw conclusions about such a universal concept as the human lifespan limit from such sporadic and erratic data. Most likely, the slowdown in MRAD in the 2000s is a cohort effect. Namely, what lies behind the top-survivors’ data analyzed here is the increase in old-age cohort age-specific mortality rates that took place in the United States in the cohorts born in the 1880s as compared to the cohorts born in the 1870s (figure 2).
Figure 2: Cohort mortality rates in the United States
NOTE. Data are cohort age-specific mortality rates (CASMR) from the Human Mortality Database. The lines represent the average CASMR over two groups of birth cohorts, ten 1-year birth cohorts each: born in the 1870s and the 1880s. See the similar graphs for France, Great Britain, Japan, and Sweden in Extended Data Figure 2.
The US had exceptionally low old-age mortality in the cohorts of the 1870s. That is why the majority of MRAD supercentenarians in the 1980s and 1990s were from the US (see Extended Data Table 1). For some reasons (which are beyond the scope of the present letter), the maximal longevity of the US 1880s birth cohorts turned out worse than that of the 1870s birth cohorts (with an average difference in mortality rates of 0.05 at ages 100-110; p-value < 0.0001). Higher mortality in succeeding cohorts as compared to the preceding cohorts is a relatively rare case; as we can see from Extended Data Figure 2, a similar rise in old-age mortality did not happen in France, Great Britain, Japan, or Sweden. So a quite local effect, the increase in US cohort mortality, was elevated by DMV into evidence for a natural limit to human longevity.
Summing up, the presented evidence does not clearly “suggest that the maximum lifespan of humans is fixed and subject to natural constraints”. I believe we had better put our faith in the results of demographers (e.g. Vaupel or Vallin & Meslé), who draw much more optimistic projections based on population-wide analyses. The dynamics of human mortality show that increasing percentages of human populations reach ever older ages; eventually the 115-year “limit” will fall, and, in some years, we will likely speculate over the 120-year limit. And so on.
- Dong, X., Milholland, B. & Vijg, J. Evidence for a limit to human lifespan. Nature (2016). doi:10.1038/nature19793
- Maier, H. et al. Supercentenarians (Springer, 2010). doi:10.1007/978-3-642-11520-2_2
- The Human Mortality Database (http://www.mortality.org, 2016).
- Vaupel, J. W. Biodemography of human ageing. Nature 464, 536–542 (2010).
- Vallin, J. & Meslé, F. The Segmented Trend Line of Highest Life Expectancies. Population and Development Review 35, 159–187 (2009).
Extended Data Figure 2: Cohort mortality rates in France, Great Britain, Japan, and Sweden
NOTE. Data are cohort age-specific mortality rates (CASMR) from the Human Mortality Database (<http://www.mortality.org>). The lines represent the average CASMR over two groups of birth cohorts, ten 1-year birth cohorts each: born in the 1870s and the 1880s.
Extended Data Table 1. Yearly maximal reported age at death of supercentenarians
NOTE. The table provides the information on yearly maximum reported age at death (MRAD) extracted from the whole International Database on Longevity (<http://www.supercentenarians.org>, all 15 countries included, 1968–2007, n = 668), obtained on 9 October 2016. The background color separates the 1870s birth cohorts from the 1880s birth cohorts.
INFORMATION ON THIS TEXT
This letter was submitted to Nature as a Brief Communications Arising on 2016-10-12. It was rejected on 2016-10-26. I hope that better-justified critiques from the scientific community will be published in Nature soon. Meanwhile, I encourage the authors to publish their comments on my letter.
The R code to reproduce the analysis and figures presented here can be found at: https://github.com/ikashnitsky/a-cohort-is-not-representative-of-humanity
Dear Brandon Milholland,
Thank you for your interesting comments on your paper "Evidence for a limit to human life span", Dong et al., Nature 2016.
May I ask some questions regarding your Figure 1d, which presents the relationship between calendar year and the age that experiences the most rapid gains in survival:
How do you measure the gains in survival in this particular case: as a difference between the numbers of survivors over time, or as a difference between the LOGARITHMS of the numbers of survivors over time? Or something else?
What time intervals do you use, when you measure the gains in survival over time: just one year time interval (data for two close calendar years, say years 1981 and 1982, for example)? Or do you use longer time intervals?
Please advise. Thank you!
P.S.: By the way, you may enjoy our related published study, which explains why the chances for longevity records are much smaller than they were assumed earlier: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4342683/
Comments about Vijg Letter and Olshansky commentary in Nature 5 October 2016, JWV
This publication is another travesty in a century-long saga of asserted looming limits to average and maximum human lifespan. It is disheartening how many times the same mistake can be made in science and published in respectable journals.
A century ago it was believed that average lifespan—life expectancy—would never exceed 65. As evidence to the contrary poured in, the limit was raised and raised again. Olshansky pegged it at 85. Japanese women today, however, can expect to live more than 87 years.
A century ago the maximum span of life was believed to be about 105. Again this limit was increased as people exceeded it. Vijg and Olshansky set it at 115 even though the current record holder, Jeanne Calment, lived 122.45 years: she is dismissed as an “outlier”.
In this sorry saga, those convinced that there are looming limits did not apply demography and statistics to test hypotheses about lifespan limits—instead they exploited rhetoric, deficient methods and pretty graphics to attempt to prove their gut feelings. The publications are essentially propaganda, not scholarly research.
Vijg’s travesty and Olshansky’s commentary on it in the same issue of Nature are further dismal examples. The material was published and is getting publicity because it seems plausible to many people that average and maximum lifespans cannot increase much more. The main evidence is summarized in colorful graphs that are problematic.
- It is claimed that life expectancy is plateauing, approaching a looming limit, but the Figures in Vijg, including Fig. 1a for France and subsequent Figures for Japan, Italy and other large countries with high life expectancies, do not support this. They show a continuing rise in life expectancy albeit, in some cases, at a somewhat slower rate than in some earlier periods. There is no evidence that the slower rate will become an even slower rate and then zero.
- The age at which the most rapid progress is being made in increasing survival is shown to be high—above 100 in recent years—and rising to higher and higher ages. It is claimed that this age plateaued after 1980 but again this is not supported by the graphs. The most important country for the analysis is Japan, a country with a large population and the world’s life expectancy leader. In Japan there is no plateau. Nor is there one for France and Italy, two other countries with large populations and high life expectancies, although there is some deceleration in the rate of increase. Again, there is no evidence that there will be further deceleration leading in the near future to a plateau.
- Data on the maximum recorded age at death are simplistically and without any statistical justification fit by two lines—a rising line and after 1995 a declining line. More powerful methods, including methods from Extreme Value Theory, should have been used to test whether the data imply a decline in maximum lifespan.
Like analogous, disproven publications over the past 100 years, Vijg et al. and Olshansky add nothing to scientific knowledge about how long we will live. The publications are advocates’ arguments based on selective use of data, with one-sided conclusions not supported by the data.