How Should I Think About

A guided tour to the Gompertz law of mortality

Here is a question for you:

What are your chances of dying this year?

This is not an easy question, and you may want to avoid it altogether. But regardless of how we might feel about it, our death is (at the minimum) consequential to us and our loved ones. So let’s be fearless and explore the subject with a cool head.

Fortunately, a sensible answer (mathematically speaking) to our chances of dying this year isn’t that hard to guess.

Choose a sample of people born in the same year and, as time goes by, take note of their age at death. Some people will unfortunately die during infancy. Most will face death much older. There is an expression, known as the Gompertz law of mortality, that gives us a reasonable estimate of how many people in the sample will die each year. The law is usually stated as

$$h(x) = B e^{C x}$$

where $h(x)$ is, loosely speaking, the chance of death at age $x$, and $B$ and $C$ are constants obtained from experience.

This essay reviews how Benjamin Gompertz discovered this expression. Our tour goes back in history, examines the original papers and data, steps into his shoes, and rederives things from scratch.

If you are short on time and would rather know your chances of death right now, scroll down directly to the interactive charts and grab the estimates there.

The first tabulations of human longevity were published in the late 1600s in Europe

Mortality has bewildered humankind for aeons, but it was only in the 17th century that mortality data by age was tabulated for the first time.

At the time, the so-called tables of mortality were built to calculate the price of life annuities.

(For those of you who, like me, didn’t know what a life annuity is: it is an insurance product in which the purchaser pays in advance for a sequence of future payments that continue for as long as he or she is alive.)

When an insurance company sells a life annuity to someone, it must come up with an expectation for how long the purchaser will live. Insurers are therefore very interested in predicting their customers’ lifespans as accurately as possible. And that’s why insurers spurred the creation of mortality tables.

Early mortality tables consisted of births and deaths grouped by age for some European localities during the 1600s, 1700s, and 1800s. Here is a famous one, published by Joshua Milne in 1815:

Milne, Joshua. 1815. A Treatise on the Valuation of Annuities and Assurances on Lives and Survivorships: On the Construction of Tables of Mortality and on the Probabilities and Expectations of Life, Volume 1, 405. London: Longman, Hurst, Rees, Orme, and Brown.

Data from several early tables has been nicely summarized by David Forfar. I plot four of them in the chart below.

For each of the following data points $(x, y)$, the $y$-value represents the number of survivors aged $x$ in years. This kind of series is also known as a survival curve. To make ours easier to compare, each dataset was rescaled (i.e., divided by the number of people in it) such that $y(0) = 1$ for all of them:

[Interactive chart: survival curves from four early mortality tables, rescaled so that $y(0) = 1$]

A quick look at the chart reveals a few common features among these series:

  • A sharp decline between $0$ and $5$ years,
  • A roughly linear decay from $10$ to $55$ y.o., and
  • A somewhat accelerated decay afterwards

But beware of jumping to conclusions too fast!

These series were among the first ever tabulated, but they had serious issues. Their sample size, especially for older ages, was quite small. There were methodological and data collection issues as well.1

So the early data was rough. But, as soon as the first mortality tables were published, work related to them blossomed among European mathematicians. Besides its practical use in life annuities, the prospect of finding mathematical regularities in human mortality was (and still is) too alluring!

de Moivre investigated two basic mortality models in 1725

In 1725, the mathematician Abraham de Moivre published a book — Annuities Upon Lives — with his incursions into the problem of estimating human lifespans.2

In his book, de Moivre deliberately converted real lives into fictitious ones. By abstracting from real-world data, he was free to investigate models that were within reach of the Mathematics of his day. Fictitious lives are more amenable to simpler Mathematics!

He applied two mortality models to his fictitious populations. They had either:

  1. An equal number of deaths per year, or
  2. Equal probabilities of death at each year of age

The first hypothesis is known as de Moivre’s law. An equal number of deaths per year results in an arithmetic progression, and in survival curves following a line. From the previous chart, this simple model seems to fit parts of the early data. In fact, we have just observed a roughly linear decay (from $10$ to $55$ y.o.).

But you should also recall that the early data was inaccurate. So at the end of the day, his first model has not been very fruitful.

de Moivre’s second model did not reflect real human mortality either. We have later discovered though that it does describe several other phenomena found in nature. So it is useful to explore it further and sharpen our intuition about models.

What does “equal probabilities of death at each year of age” actually mean?

Here’s a way of thinking about it. This hypothesis is as if death were completely independent of age. In such a model, both the young and old die at the same rate. It does not matter if you are young (and usually healthy) or old (and typically frail), your chances of dying on any given day are always the same.

In such a hypothetical world, what does the human survival data look like?

It decreases at a constant and age-independent rate. Something very similar to the geometrical progression that happens to atoms in radioactive decay.
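As a rough sketch of de Moivre's two hypotheses, the Python snippet below builds one survival curve for each. The limiting age of 86 years and the 30-year time constant are illustrative assumptions, not values read from the data discussed here.

```python
import math

def linear_survival(age, limiting_age=86.0):
    """Model 1: equal number of deaths per year (a straight line).

    limiting_age is an assumed age at which nobody is left alive.
    """
    return max(0.0, 1.0 - age / limiting_age)

def constant_hazard_survival(age, time_constant=30.0):
    """Model 2: equal probability of death at every age (geometric decay).

    time_constant is an assumed value, in years.
    """
    return math.exp(-age / time_constant)

# Fraction of the initial cohort still alive at a few ages, for both models.
for age in (0, 20, 40, 60):
    print(age,
          round(linear_survival(age), 3),
          round(constant_hazard_survival(age), 3))
```

In the first model the yearly drop in survivors is the same at every age; in the second it is the yearly ratio of survivors that stays the same.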

To make it visual, let’s consider two example curves of geometrical progressions with constant rates, $e^{-\frac{x}{30}}$ and $e^{-\frac{x}{50}}$, and plot them over the data points that we have seen before:

[Interactive chart: the two example exponential curves plotted over the early survival data]

What to make out of this?

Well, de Moivre did invite Mathematics to the party. But it took a century before the data became a bit more acceptable and could be used as a basis for better models.

Gompertz advanced foundational ideas in the 1820s

One hundred years after de Moivre’s work, Benjamin Gompertz made the founding contributions to the quest for a formula for mortality.

Gompertz was a British self-educated mathematician who worked for an insurance company. He published seminal work about a “law of human mortality”. The most famous of his papers came out in 1825. Because of those contributions, he is generally recognized as one of the pioneers of Actuarial science.3

His papers touch on different aspects of pricing life annuities. In a way, Gompertz intended to make his job easier. Because calculations were labor intensive at the time, he tried to find general formulae that could fit the data from early mortality tables well enough.

Did he succeed? Not really. Most likely because the data at hand was still of low quality in the 1820s.

Nonetheless, Gompertz ended up making two foundational contributions to the quest of modeling mortality mathematically:

  • a conceptual framework for the causes of death, and
  • a conjecture for how to account for the effects of aging in mortality

For the framework, here’s what he wrote in 1825:

It is possible that death may be the consequence of two generally co-existing causes; the one, chance, without previous disposition to death or deterioration; the other, a deterioration, or an increased inability to withstand destruction.

He didn’t elucidate what exactly he meant by “chance”, but he did put forward a working model for “deterioration”:

If mankind be continually gaining seeds of indisposition, or in other words, an increased liability to death […] it would follow that the number of living out of a given number of persons at a given age, at equal successive increments of age, would decrease in a greater ratio than the geometrical progression.

In other words, deterioration would increase the susceptibility of humans to death and, because of that, survival curves should decline faster at older ages. That rings true to our present-day intuition: the old do seem to die more frequently than the young.

Gompertz went further and proposed a quantitative model to account for his conjecture of an increasing rate of death as we grow older:

If the average exhaustions of a man’s power to avoid death were such that at the end of equal infinitely small intervals of time, he lost equal portions of his remaining power to oppose destruction which he had at the commencement of those intervals, then at the age $x$ his power to avoid death, or the intensity of his mortality might be denoted by $Be^{x}$, $B$ being a constant.

It is fair to say that this is the paragraph that made Gompertz famous. So it surely bears closer examination!

In Gompertz’s model of deterioration, humans hold two opposing characteristics: the “power to avoid death” and its inverse, the “intensity of mortality.” That is, as one decreases, the other increases. Moreover, the “power to avoid death” would decay in “average exhaustions” over “infinitely small intervals of time.”

The way he mentions these terms suggests that Gompertz intended to cast his concept of “power to avoid death” into what is now known as a continuous random variable. Probability theory was yet to be fully formalized at his time, but he surely was familiar with Calculus and its uses to study rates of change. In fact, in his 1825 paper, Gompertz made extensive use of fluxions (Isaac Newton’s notation for the time derivative) to analyze the rate of change of survival curves.

Gompertz’s paragraph then ends with the punchline. He (seemingly out of nothing!) conjectured that intensity of mortality could be expressed by $Be^{x}$. Later in this essay, we will see that further progress showed that taking $Be^{x}$ for the intensity of mortality works remarkably well. But where did Gompertz get this nice and tidy closed-form expression from?

Because he offered no explanation, it’s up to us to step into his shoes (and mind), fill in the gaps and rederive it. Let’s do it now!

Recreating the steps that Gompertz might have followed

Gompertz was onto something with his analyses of rates of change of survival curves. But, as I have just remarked, there were gaps in his exposition. He did not provide a clear definition for what he meant by “intensity of mortality”. Nor did he justify why he picked $Be^{x}$ for its analytic form.

If I were to fill the gaps and recreate step by step what Gompertz might have thought, I would reason as follows.

(I will think out loud step by step. Anyone familiar with basic Calculus should follow easily, but bear in mind that I am no mathematician, so I won’t be 100% precise.)

First, let $y(k)$ be the number of persons alive at age $k$ (similarly to the plots of survival curves that we have visited before).

To investigate the rate of change of $y$, we start by computing the absolute decrease of $y(k)$ over $10$-year intervals:

$$\Delta y = y(k + 10) - y(k)$$

To get a sense of the rate of change on a “more natural” timescale than $10$-year intervals, we then compute the average annual decrease over $10$-year intervals:

$$\frac{\Delta y}{\Delta k} = \frac{y(k + 10) - y(k)}{(k + 10) - k} = \frac{y(k + 10) - y(k)}{10}$$

Having realized that $10$ years is too coarse of an interval (there certainly are tons of “action” happening in-between), we decide to compute the average over intervals of, say, $1$ year:

$$\frac{\Delta y}{\Delta k} = \frac{y(k + 1) - y(k)}{(k + 1) - k} = \frac{y(k + 1) - y(k)}{1}$$

We then recognize that it would be even better if we could generalize this average ratio from $1$ year to any $\Delta k > 0$. So we write:

$$\frac{\Delta y}{\Delta k} = \frac{y(k + \Delta k) - y(k)}{(k + \Delta k) - k} = \frac{y(k + \Delta k) - y(k)}{\Delta k} \tag{eq. 1}$$

(To make it easier to refer to it later on, let’s call this expression rate of mortality.)

We now pause and think for a few moments about what rate of mortality actually means in plain English. (Do pause and think for yourself!)

We arrive at the following: it represents “the amount of deaths per unit of time.” Does it sound like something useful?

It surely does, but then we realize that this metric is not that great for studying human mortality over time. It still has a serious flaw.

As fewer and fewer people survive to older ages, we might be misled to think that there are fewer deaths per unit of time and therefore things are “improving”, while in fact the opposite is true. If there are only $10$ persons alive and all $10$ die in a year, things seem relatively much worse than if $10$ out of $1000$ persons alive die in a year.

So we reflect a bit more and realize that “the amount of deaths per unit of time considering only the ones that are still alive at the beginning of each unit of time” is a good fix for the flaw we have identified.

We then try to stress test this new expression. Could this metric have gone through Gompertz’s mind back in the day?

If we were Gompertz, working at an insurance company, we would be concerned about life annuities. But, from the point of view of an insurer, what are annuities if not: “Given the information that I have about a purchaser (like their age), how much longer should we expect them to live and thus demand to be paid?”

Looking at the problem of mortality through this metric sounds like a win-win. Not only would it fix the flaw we have identified, but it would also fit the problem of life annuities nicely, because it “contains” the important information of how long the purchaser has already lived.

So we wonder about how to translate this new quantity from words to Mathematics. Shortly after, we notice that a division by $y(k)$ seems to express well the conditioning of “considering only the ones that are still alive at the beginning of each unit of time”.

So, while there are persons alive (that is, $y \neq 0$) we have:

$$\frac{1}{y} \cdot \frac{\Delta y}{\Delta k} = \frac{1}{y(k)} \cdot \frac{y(k + \Delta k) - y(k)}{\Delta k}$$

Now, because by definition $y(k + \Delta k) \leq y(k)$ for any survival curve, we suddenly become aware of the fact that (for any $y$ and $k$) the above expression assumes only non-positive values.

So we decide (for convenience) to multiply it by $(-1)$ so that it assumes only positive values or zero (which are more intuitive for most uses):

$$(-1) \cdot \frac{1}{y} \cdot \frac{\Delta y}{\Delta k} = - \frac{1}{y(k)} \cdot \frac{y(k + \Delta k) - y(k)}{\Delta k}$$

Having reached a seemingly useful expression, we meditate on a name for it, and settle on discrete hazard rate $h(k)$:

$$h(k) = - \frac{1}{y(k)} \cdot \frac{y(k + \Delta k) - y(k)}{\Delta k} \tag{eq. 2}$$
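To make the discrete hazard rate concrete, here is a minimal Python sketch that applies eq. 2 to consecutive entries of a survival table. The function name and the small cohort are made up for illustration; the numbers do not come from any historical table.

```python
def discrete_hazard(survivors, step=1.0):
    """Apply eq. 2 to consecutive entries of a survival table.

    survivors[i] is the number of persons alive at age i * step.
    """
    rates = []
    for now, later in zip(survivors, survivors[1:]):
        if now == 0:
            break  # eq. 2 is undefined once nobody is alive
        rates.append(-(later - now) / (now * step))
    return rates

# Made-up cohort, one entry per year of age.
cohort = [1000, 990, 970, 930]
print(discrete_hazard(cohort))

# The two situations compared earlier: 10 deaths out of 10 alive
# versus 10 deaths out of 1000 alive.
print(discrete_hazard([10, 0]), discrete_hazard([1000, 990]))
```

Both situations in the last line have the same number of deaths per year, yet the hazard rate is 1 in the first and 0.01 in the second, which is exactly the distinction the division by $y(k)$ was meant to capture.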

Delighted by this result we decide to go even further.

We have already noted that Gompertz mentioned “infinitely small intervals of time” in his conjecture. So we leave the discrete and jump into the realm of limits and differentials: Calculus, after all.

Thus, in present-day notation:

$$h(t) = \lim_{\Delta t \to 0} \Bigg( - \frac{1}{y(t)} \cdot \frac{y(t + \Delta t) - y(t)}{\Delta t} \Bigg)$$

Or, to put it more succinctly,

$$h(t) = - \frac{1}{y} \cdot \frac{dy}{dt} \tag{eq. 3}$$

And call it continuous hazard rate $h(t)$.
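One way to get a feel for the limit is to shrink the interval numerically. The sketch below assumes an illustrative survival curve, $y(t) = e^{-t/30}$, chosen only because its continuous hazard rate is known to be exactly $1/30$ at every age; the function names are hypothetical.

```python
import math

def survival(t):
    # Illustrative curve with a known answer: y(t) = exp(-t / 30),
    # whose continuous hazard rate is 1/30 at every age.
    return math.exp(-t / 30.0)

def hazard_estimate(y, t, dt):
    """Discrete hazard rate (eq. 2) evaluated with an interval of size dt."""
    return -(y(t + dt) - y(t)) / (y(t) * dt)

# As dt shrinks, the discrete estimate approaches the continuous value.
for dt in (10.0, 1.0, 0.1, 0.001):
    print(dt, hazard_estimate(survival, 40.0, dt), 1 / 30)
```

The estimates move toward $1/30$ as the interval gets smaller, which is the discrete hazard rate turning into the continuous one.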

Exploring if our hazard rate is the same as Gompertz’s intensity of mortality

I don’t know about you, but I am quite happy with the tidy results that we have just achieved.

There is still a remaining question, though. How can we tell if the hazard rate $h(t)$ that we have found is the same concept as Gompertz’s “intensity of mortality”?

A way to investigate this is to calculate the hazard rate from the data points available to Gompertz, and then plot them to check if they resemble the exponential $Be^{x}$ of his 1825 paper. If they do, that would be good evidence that he had our hazard rate in mind, and not something else.

So, from eq. 2 above, the discrete hazard rate $h(k)$ is

$$h(k) = - \frac{1}{y(k)} \cdot \frac{y(k + \Delta k) - y(k)}{\Delta k} \tag{eq. 2}$$

Using it to calculate and plot the hazard rate found in early mortality tables:

[Interactive chart: discrete hazard rate computed from the early mortality tables]

So? What do you think? Does it resemble an exponential?

Personally, I cannot fit an exponential with my eyes only.

To make our eyeballing a bit easier, let’s take advantage of the fact that any exponential curve becomes a straight line on a logarithmic scale. Let’s switch the $y$-axis to $\log$ scale. And now that we are there, let’s also overlay an exponential function on top of the data points:

[Interactive chart: hazard rate from the early mortality tables on a logarithmic $y$-axis, with an exponential overlaid]

Now we can see it! From $25$ to $75$ years it is visually plausible that the hazard rate from early mortality tables resembles an exponential, $Be^{x}$.

Of course this is still eyeballing (feel free to download the data and work out the details yourself), but it is now more believable that we have indeed recreated what Gompertz had in mind with intensity of mortality.

Incidentally, the name intensity of mortality did not stick. In fact, today that same function has multiple names depending on the field you come from. In Statistics, it is hazard rate (or hazard function). In Actuarial science, it is known as force of mortality. In Engineering, it is called failure rate. If anything, so many synonyms attest to the wide applicability of the concept.

Pioneers wrestled with data issues for decades throughout the 19th century

The pioneers continued to wrestle with mortality data for several decades over the 19th century. The inadequacy and poor quality of datasets led not only to dead ends in analyses (garbage in, garbage out), but also to significant financial losses to insurers.

Case in point: the losses the British government took after the Life Annuity Act of 1808. The Act, passed in the midst of the Napoleonic Wars, aimed to help fund Britain’s war efforts, but it ended up being a big mistake.

Casey Rothschild describes the colorful sequence of events in a 2009 paper:

Prior to the Life Annuity Act of 1808, British government debt consisted almost exclusively of Consols — coupon bonds with infinite maturity. The explicit goal of the Act was to replace them with finite-lived debt by allowing individuals to exchange Consols for life annuities. Since Consols were tradable assets, the act effectively opened a life-annuity market.

Annuities sold under this act made twice-yearly tax-exempt payments. The size of these payments depended on the interest rate (the market Consol price) and the age of the annuitant (the nominees). Prices were designed to be actuarially fair; to that end, they were priced to be 2% more expensive than the actuarially fair price implied by the Northampton life table.

Shortly after passage of the Act, there appears to have been a recognition that the use of the Northampton tables was leading to large government losses. Ray D. Murphy writes [in Sale of Annuities by Governments] that it “was wholly unsuitable as a measure of the lower rates of mortality experienced by a self-selected group of annuitants. It was not long before this shortcoming was brought to the attention of the Exchequer.”

You, astute reader, will remember that we have already seen data from the Northampton life table before. (In any case, we will revisit it in a moment.)

Back to Rothschild to learn how the story ended:

In 1823, Parliament finally took active steps to address this perceived mispricing by commissioning John Finlaison to study the mortality experience of the early annuitants. His 1829 report developed a new set of gender-specific life tables based on the observed mortality experience of these nominees. After some debate and a brief suspension of the life annuity program, Parliament determined to resume it with gender-specific pricing based on these new tables.4

In effect, here is a visual comparison between Finlaison’s data on Government Annuitants and Richard Price’s Northampton table:

[Interactive chart: Finlaison’s Government Annuitants data compared with the Northampton table]

As Craig Turnbull writes in his A History of British Actuarial Thought:

Finlaison’s analysis evidenced the profound differences between the government annuity mortality experience and the mortality rates assumed in the Northampton table: for example, at age 60, the Northampton table was implying a mortality rate that was fully double that experienced by the government annuitants!

Beyond the impact it had on government policy, Finlaison’s report was an important milestone for actuarial thought. [… T]he scale and rigour of Finlaison’s analysis was of a different order to these earlier works.

Fast forward to the 2010s. Although there are still limitations in mortality data for ages above $100$, we have access to data of much higher quality than the pioneers ever had.

The Gompertzian exponential does fit present-day mortality remarkably well. The fit encompasses an age period of approximately $70$ years and a range of hazard rates spanning three orders of magnitude, from $0.01\%$ to $10\%$.

Recall that the Gompertz hazard function is

$$h(x) = B e^{C x}$$

Here are the constants $B$ and $C$, and age periods ($x$ values) that fit the Gompertzian function to recent mortality data from the U.S., England & Wales, and Brazil:

| Country | Sex | Age period | $B$ | $C$ |
|---|---|---|---|---|
| USA | M | $25$-$100$ y.o. | $0.0001240$ | $0.07759$ |
| USA | F | $25$-$100$ y.o. | $0.0000471$ | $0.08588$ |
| E&W | M | $25$-$100$ y.o. | $0.0000340$ | $0.09348$ |
| E&W | F | $25$-$100$ y.o. | $0.0000154$ | $0.09947$ |
| Brazil | M | $25$-$80$ y.o. | $0.0002798$ | $0.06562$ |
| Brazil | F | $25$-$80$ y.o. | $0.0000621$ | $0.08105$ |

You may ask: Where did this table come from? Or, more specifically, how did I find the exponentials that fit the data?

I ran a simple linear regression over recent data provided by the national bureaus of statistics (or equivalent) of these three countries.5
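The essay does not spell out how the regression was set up. One plausible reading, sketched below, uses the fact that taking logarithms of $h(x) = B e^{C x}$ gives the straight line $\ln h = \ln B + C x$, so an ordinary least-squares line through the points $(x, \ln h)$ yields $C$ as the slope and $\ln B$ as the intercept. The function name `fit_gompertz` and the synthetic input are assumptions made for this illustration; the hazards are generated from known constants only to check that the fit recovers them.

```python
import math

def fit_gompertz(ages, hazards):
    """Least-squares line through (age, ln hazard); returns (B, C)."""
    logs = [math.log(h) for h in hazards]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    slope = sxy / sxx                      # estimate of C
    intercept = mean_y - slope * mean_x    # estimate of ln B
    return math.exp(intercept), slope

# Synthetic hazards from known constants (not real mortality data).
true_b, true_c = 0.00003, 0.09
ages = list(range(25, 101))
hazards = [true_b * math.exp(true_c * x) for x in ages]

b, c = fit_gompertz(ages, hazards)
print(b, c)
```

With real data the points scatter around the line, and the choice of age period matters, since the fit is only expected to hold over part of the lifespan.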

To convince you that these Gompertzian curves do fit quite well the best data that we have available, let’s plot the official data points and overlay the fitted lines. For males we have the following:

[Interactive chart: official mortality data for males with the fitted Gompertz curves overlaid]

So, suppose you are a $40$-year-old English man.

The “official” data indicates that your chances of dying at this age are $0.1474\% \approx 0.15\%$. That is, according to death statistics in 2010-2012 in England and Wales, roughly $15$ in every $10{,}000$ English and Welsh men aged $40$ died at that age.

The Gompertz law that we fitted to the data predicts that

$$0.0000340 \, e^{0.09348 \times 40} = 0.001430$$

which is off from the “actual” number by

$$\frac{0.1474\% - 0.1430\%}{0.1474\%} \approx 3\%$$

I don’t know about you, but getting a $3\%$ error from a quick and simple 2-parameter model seems great to me!
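The same arithmetic can be reproduced in a few lines of Python. The constants and the observed rate are the ones quoted in this essay for English and Welsh men; the function and variable names are just for illustration.

```python
import math

def gompertz_hazard(age, b, c):
    """Gompertz hazard h(x) = B * exp(C * x)."""
    return b * math.exp(c * age)

# Constants quoted in the essay for E&W males.
B_EW_MALE, C_EW_MALE = 0.0000340, 0.09348
# Observed rate at age 40 quoted in the essay (0.1474%).
observed = 0.001474

predicted = gompertz_hazard(40, B_EW_MALE, C_EW_MALE)
relative_error = (observed - predicted) / observed

print(f"predicted: {predicted:.6f}")
print(f"relative error: {relative_error:.1%}")
```

Swapping in another row of constants and another age gives the prediction for a different group, keeping in mind that the fit only covers the stated age period.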

Here is the equivalent chart for females:

[Interactive chart: official mortality data for females with the fitted Gompertz curves overlaid]

If you are a $26$-year-old American woman, your probability of dying before your $27$th birthday is $0.0006456 \approx 0.065\%$. In other words, according to U.S. death statistics in 2017, approximately $65$ in every $100{,}000$ American women aged $26$ died.

The prediction of our Gompertzian curve is

$$0.0000471 \, e^{0.08588 \times 26} = 0.0004392$$

And the error is

$$\frac{0.06456\% - 0.04392\%}{0.06456\%} \approx 32\%$$

Why is the error much higher for this second example?

The explanation is clear. You can spot in both charts that the fit is much worse near the start and the end of the age periods, i.e., around $25$-$35$ y.o. and above the age of $95$.

I will explore these two edge cases in future essays. There is a lot of very intriguing stuff going on both edges!

For instance, youngsters are particularly affected by what is known as “extrinsic” causes of mortality. These are infections, accidents and crime — things that come from “outside”. In fact, if we remove the deaths from extrinsic causes and leave just the ones due to “intrinsic” issues, it seems that the fit to the Gompertz law expands to younger ages very nicely!

As for the mortality of the eldest, due to sparse data,6 it is still not entirely clear to this day if the hazard rate flattens out — or perhaps it even decreases. (I don’t have a strong opinion on the matter yet. I still have several papers on the topic to read and think about.)

I hope you will join me in future essays on these and related topics.

Finally, if you haven’t yet, feel free to scroll up, find your age and eyeball your chances of dying this year. You may not be from any of these three countries. In that case, the estimate will of course be rougher, but probably still of the correct order of magnitude.

Further reading

Appendix A — The famous modification proposed by Makeham in the 1860s

In the decades following Gompertz’s original papers, it was clear to everyone involved in Actuarial science that the Gompertzian exponential did not fit the data over the entire human lifespan. Despite that, the lure of finding the actual law of mortality continued to motivate the proposal of several other expressions.

In a 1953 paper, M. E. Ogborn listed some of the attempts that were published in the late 1800s:

Gompertz, himself, suggested in 1860 a formula based on an amalgamation of several of his curves with different constants, and various combinations have been suggested from time to time,7 together with others of a different type.

Perhaps the best illustration of the point of view is given by the mathematical formula proposed in 1871 by the Danish mathematician, Thorvald Thiele, to express the rate of mortality throughout the whole of life.

In Thiele’s [7-parameter] formula,

$$h(x) = A e^{-Bx} + C e^{-D(x-E)^2} + F G^x$$

the last term, $F G^x$, is a Gompertz curve to represent old-age mortality and the first, $A e^{-Bx}$, a decreasing Gompertz curve to represent the mortality of infancy. The middle term, $C e^{-D(x-E)^2}$, is a form of the normal curve of error [likely to account for the hump in mortality experienced by young adults].
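Thiele’s formula, as quoted above, can be written down as a small Python function that returns its three terms separately. Every parameter value below is made up for illustration only; none of them comes from Thiele or Ogborn.

```python
import math

def thiele_terms(x, a, b, c, d, e, f, g):
    """Return the three terms of Thiele's formula at age x."""
    infancy = a * math.exp(-b * x)            # decreasing term for infancy
    hump = c * math.exp(-d * (x - e) ** 2)    # bell-shaped term centered at e
    old_age = f * g ** x                      # Gompertz term for old age
    return infancy, hump, old_age

def thiele_hazard(x, a, b, c, d, e, f, g):
    """Hazard rate as the sum of the three terms."""
    return sum(thiele_terms(x, a, b, c, d, e, f, g))

# Hypothetical parameters, chosen only so that each term is visible.
PARAMS = dict(a=0.02, b=1.0, c=0.001, d=0.01, e=22.0, f=0.00003, g=1.1)

for age in (0, 22, 90):
    print(age, thiele_terms(age, **PARAMS), thiele_hazard(age, **PARAMS))
```

With these made-up values the first term dominates at birth, the middle term is the largest around age 22, and the last term dominates at age 90, which mirrors the division of labor described in the quote.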

The contribution that ended up being the most famous was proposed by William Makeham in the 1860s. He wrote in an 1866 paper:

Mr. Gompertz’s theory of the law of mortality is, that the vital power, or the “power to oppose destruction,” loses equal proportions in equal times; and consequently that the intensity of mortality, which is inversely proportional to this power, is represented by a series in geometrical progression.

It has, however, invariably been found that the ratio of progression, instead of remaining constant throughout the whole period of life, as the theory supposes, is, on the contrary, subject to a slow but continued increase with age; in consequence of which it has been found necessary to change the constants at least once, but generally twice, in the construction of a complete table of mortality.

I venture to think, in a more scientific manner, by supposing the intensity of mortality to be represented by a series not purely geometrical, but consisting of the sum of two terms, the one a constant quantity and the other geometrical. That is, instead of representing the intensity of mortality by an expression of the form $Be^x$ we represent it by one of the form $A + Be^x$.

He then offered the following rationale for his proposal:

From the age of 15, or thereabouts, the normal law of mortality, of which we are in search, is characterized by an increasing progression throughout; the rate of increase, however, being at first very slow, and gradually gaining in rapidity with increased age.

It is this characteristic which renders the formula before described (consisting of a constant combined with an increasing geometrical series) singularly well adapted to represent the law in question from adolescence to extreme old age — a satisfactory proof of which assertion I hope to give on a future occasion, when I propose also to examine the results of an extension of the formula to all periods of life.

Makeham’s “satisfactory proof” on “a future occasion” came out the next year. In 1867 he published another paper expanding on his hypothesis.

Makeham used data from five U.K. mortality series from the 1800s to support his argument. To check it out visually, let’s plot the data points from all five data series as presented by him:

[Interactive chart: data points from the five mortality series presented by Makeham]

Now, let’s overlay two curves on top of his data points: a Gompertz exponential (in light gray) and Makeham’s proposed modification (in dark gray).

From the chart below, it is pretty clear that Makeham’s modification did improve the fit to the data available at his time:

[Interactive chart: Makeham’s data with a Gompertz exponential (light gray) and Makeham’s modification (dark gray) overlaid]

Part of the appeal of Makeham’s modification is that it adds just one additional parameter to the expression (for a total of three). From a modeler’s standpoint, too many parameters are not a good thing: given enough parameters, any data set can be fitted.8
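A short Python sketch can show what the extra constant does. It writes the Gompertz term as $B e^{C x}$, as in the fitted law earlier in this essay, and adds a constant $A$; the three values used are hypothetical and were not fitted to any data.

```python
import math

def gompertz_hazard(x, b, c):
    return b * math.exp(c * x)

def makeham_hazard(x, a, b, c):
    """Gompertz hazard plus an age-independent constant a."""
    return a + gompertz_hazard(x, b, c)

# Hypothetical constants, for illustration only.
A, B, C = 0.005, 0.00003, 0.09

def yearly_ratio(hazard, x):
    """Ratio of the hazard at age x + 1 to the hazard at age x."""
    return hazard(x + 1) / hazard(x)

for age in (30, 55, 80):
    g = yearly_ratio(lambda x: gompertz_hazard(x, B, C), age)
    m = yearly_ratio(lambda x: makeham_hazard(x, A, B, C), age)
    print(age, round(g, 4), round(m, 4))
```

For the pure exponential the year-to-year ratio is the same at every age, while with the added constant it starts close to 1 and creeps up with age. That is the “slow but continued increase” in the ratio of progression that Makeham describes.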

As much as it was an improvement, the Gompertz-Makeham law is not the ultimate law of mortality. Makeham avoided infancy and childhood altogether in his 1867 paper. In fact, he wrote at the time:

I must postpone to a future opportunity the examination of so interesting and important a subject as the mortality of infancy and childhood.

Appendix B — Survival function $y(t)$ in terms of hazard rate $h(t)$

We already have $h(t)$ in terms of $y(t)$ from eq. 3 (restated below). Let’s work out a few more steps to express $y(t)$ in terms of $h(t)$, and come full circle.

Recall that

$$h(t) = - \frac{1}{y} \cdot \frac{dy}{dt} \tag{eq. 3}$$

By the Fundamental theorem of Calculus, we can write

$$\int h(t) \cdot dt = - \int \Bigg( \frac{1}{y} \cdot \frac{dy}{dt} \cdot dt \Bigg) + c$$

When thinking about how to solve the integral on the right-hand side, we recollect that the natural logarithm somehow resembles it.

Moreover, because we defined $y > 0$, we can summon the Chain rule as:

$$\frac{d}{dt} \ln y = \frac{d}{dy} \ln y \cdot \frac{dy}{dt} = \frac{1}{y} \cdot \frac{dy}{dt}$$

And insert this result into the previous expression

$$\int h(t) \cdot dt = - \int \Bigg( \frac{1}{y} \cdot \frac{dy}{dt} \cdot dt \Bigg) + c = - \int \Bigg( \frac{d}{dt} \ln y \cdot dt \Bigg) + c$$

By the Fundamental theorem of Calculus again, we have

$$\int \Bigg( \frac{d}{dt} \ln y \cdot dt \Bigg) = \ln y + c$$

so that, absorbing all constants into a single $c$,

$$\int h(t) \cdot dt = - \ln y + c$$

Finally, by the definition of logarithms

$$y(t) = e^{- \int h(t) \cdot dt + c} \tag{eq. 4}$$

Note that the constant sits in the exponent, so it acts as a multiplicative factor $e^{c}$, which is fixed by the starting value of the survival curve (for instance, $y(0) = 1$ for the rescaled curves).

There it is, $y(t)$ expressed in terms of $h(t)$.
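As a sanity check, here is a hedged Python sketch that turns a hazard rate into a survival curve in two ways: by numerically integrating the hazard from age $0$ with $y(0) = 1$, and by using the closed form that the same integral gives for a Gompertz hazard, $y(x) = \exp\big(-\tfrac{B}{C}(e^{Cx} - 1)\big)$. The constants are the England & Wales male values quoted in this essay; extrapolating them down to age $0$ is done here only for illustration, since they were fitted to ages $25$ and above.

```python
import math

def gompertz_hazard(x, b, c):
    return b * math.exp(c * x)

def survival_numeric(hazard, x, steps=10_000):
    """exp(-integral of hazard from 0 to x), trapezoidal rule, y(0) = 1."""
    width = x / steps
    total = 0.5 * (hazard(0.0) + hazard(x))
    for i in range(1, steps):
        total += hazard(i * width)
    return math.exp(-total * width)

def survival_closed_form(x, b, c):
    """Closed form for a Gompertz hazard with y(0) = 1."""
    return math.exp(-(b / c) * (math.exp(c * x) - 1.0))

# E&W male constants quoted in the essay (illustrative use from age 0).
B, C = 0.0000340, 0.09348

def hazard_ew_male(x):
    return gompertz_hazard(x, B, C)

for age in (40, 60, 80):
    print(age,
          survival_numeric(hazard_ew_male, age),
          survival_closed_form(age, B, C))
```

The two columns should agree closely, which suggests that the numerical route can be reused for hazard rates that have no tidy closed form.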

This is especially useful for when the hazard rate is known and we intend to find its corresponding survival curve. That is exactly what we will do in a future interactive essay (soon!).

  1. One can read commentary about issues with the data in Makeham 1860, Makeham 1866, Makeham 1867, and Sutton 1884

  2. de Moivre based his calculations on a table assembled by Edmond Halley, the astronomer of comet fame. Halley used mortality data from Wrocław, Poland, and published his table in a 1693 paper, An Estimate of the Degrees of the Mortality of Mankind

  3. A good, short, readable paper with the historical context of early Actuarial science is Ogborn 1953

  4. I couldn’t get my hands on the full Finlaison tables. They were originally published in 1829 as a Report of John Finlaison, Actuary of the National Debt, on the Evidence and Elementary Facts on which the Tables of Life Annuities are Founded: Ordered by the House of Commons, to be Printed 31 March 1829. The 10-volume set History of Actuarial Science, edited by Steven Haberman and Trevor A. Sibbett and published in 1995, also has the tables (and much more). The set is the historical vade mecum of the field; your local library might have it. Look for the 1829 Finlaison tables in Volume 2

  5. Our data sources are:

  6. For instance, in the paper about ELT 17 Methodology (PDF) the authors write: “Graduation (smoothing) is particularly important at the oldest ages, where exposure numbers are small and data are sparse.” 

  7. Another example is Thomas Rowe Edmonds’s 1832 book Life Tables, Founded Upon the Discovery of a Numerical Law Regulating the Existence of Every Human Being, Illustrated by a New Theory of the Causes Producing Health and Longevity. Edmonds proposed the use of three exponential curves to account for the hazard rate throughout all ages. Each exponential curve corresponded to one of the “three grand divisions of life”: Infancy (from birth to 8 years), Manhood (from 12 to 55 years), and Old Age (from 55 to end of life). Incidentally, it was Edmonds who first coined the name “force of mortality”, which is used in Actuarial science to this day and replaced Gompertz’s “intensity of mortality”. 

  8. John von Neumann is often quoted as having said: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” For an actual elephant-fitting function, check out this cool 2010 paper, Drawing an elephant with four complex parameters