In this entry, as part of the series on dynamic systems modeling with R and simecol, we'll take a look at population growth models, with our main focus on human population growth models and how they tie into other theoretical frameworks such as demographic transition theory and carrying capacity. These different models are then fitted to data on world population spanning the period from 10,000 BC to 2015, using R and simecol. In this latter part we will finally introduce the FME R package.
Malthus model of population growth
In 1798 the Reverend Thomas Malthus published An Essay on the Principle of Population, sparking a debate about population size and its effects on the wealth of nations. His Essay is framed within the wider context of considerations about the perfectibility of the human condition and it was Malthus' response to the likes of Condorcet and others who argued, in the typical positivist style of the era, that human reason would eventually surmount all obstacles in the way of mankind's progress.
At the core of the Malthusian argument was the observation that populations grow exponentially but the means of physical sustenance can only increase in arithmetic progression at best, and since population growth is commensurate with the availability of means of sustenance, any momentary improvement in the living conditions of the human masses that lifted the restrictions on population growth would inevitably result in long-term famine and an aggravation of man's plight. In modern-day system dynamics terminology, we would say that population growth has a negative feedback mechanism that prevents further population growth, and this idea in itself is not without justification. I mean, the planet cannot sustain exponential human population growth indefinitely, can it?
Malthus was in fact aware of negative feedback mechanisms and cyclical behavior in population growth. To paraphrase Malthus: an increase in population meant a decrease in wages and, at the same time, an increase in food prices (less food for more people). During such times, the difficulties in rearing a family are so great that population growth is at a stand. But a decrease in the price of labor meant that farmers could employ more laborers to work the land and increase food production, thus loosening the restraints on population growth. The same retrograde and progressive movements in human wellbeing occurred again and again, a cycle of human misery and suffering^{1}. Being a clergyman, Malthus saw this situation as "divinely imposed to teach virtuous behavior"^{2}.
At any rate, what is known today as the Malthus model for population growth, given in (1), only considers the exponential aspect of growth, as given by the rate parameter \(r\). No doubt the Malthus model is the simplest population growth model. It characterizes the growth of a population as dependent only upon the reproductive rate \(r\) and the population level at any given time (\(N\)). Note that the negative feedback controls we spoke of earlier are conspicuously missing in this model. In this model, there is only positive feedback: higher population levels beget even higher population levels, and so on to infinity.
\[\frac{dN}{dt}=rN\qquad\qquad(1)\]
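Equation (1) has the closed-form solution \(N(t)=N_0e^{rt}\), which is easy to sketch in a couple of lines of base R (the parameter values below are purely illustrative, not drawn from any data):

```r
# Closed-form solution of the Malthus model (1): N(t) = N0 * exp(r * t)
malthus <- function(t, N0, r) N0 * exp(r * t)

# A hallmark of pure exponential growth is a constant doubling time:
# solving N0 * exp(r * t) = 2 * N0 gives t = log(2) / r
doubling_time <- function(r) log(2) / r

malthus(t = 100, N0 = 1000, r = 0.02)  # ~7389: a century of 2% growth
doubling_time(r = 0.02)                # ~34.7 years per doubling
```

The constant doubling time is exactly the "geometric progression" Malthus had in mind: the population doubles every fixed interval, regardless of how large it already is.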
Verhulst or Logistic Model of Population Growth
Considering the ideas on population growth that Malthus expounded in his Essay, Adolphe Quetelet judged that Malthus failed to provide a mathematical explanation as to how population levels, whose growth was by nature exponential, would top off at some point and not continue to rise to infinity^{3}. Quetelet himself and his pupil, Pierre Verhulst, would assume this task in their effort to apply the methods of physics, which had seen a huge success with Newton's work, to the social sciences, as was characteristic of Positivist thought in the nineteenth century^{4}.
Making an analogy with physics, Quetelet argued that in addition to the principle of geometric growth of the population, there was another principle at work according to which "the resistance or the sum of the obstacles encountered in the growth of a population is proportional to the square of the speed at which the population level is increasing"^{5}. This premise bears resemblance to the mathematical description of an object falling through a dense fluid: the deeper the object descends into this fluid, the greater the density, and hence the resistance to its downward trajectory, until the object reaches a depth beyond which it can't descend further^{6}. By applying this analogy to demographic growth, we can infer that there is a maximum population level. As a result, a population finds the cause of its eventual equilibrium in its own growth^{7}. In modern literature, this model of population growth is given by the following differential equation:
\[\frac{dN}{dt}=r_{max}N\left(1-\frac{N}{K}\right)\qquad\qquad(2)\]
Let us examine this equation in more detail to understand its behavior. For small population levels, when \(N\) is much smaller than \(K\), \(N\) exhibits exponential-like growth, as dictated by the \(r_{max}\) parameter, also known as the intrinsic growth rate. This is because the \(\left(1-N/K\right)\) term is still relatively close to 1. However, as \(N\) becomes larger, this last term starts to matter, because it approaches zero. As \(N(t)\) attains this point, the growth given by the \(dN/dt\) differential peters out and the population level tops off asymptotically at \(K\). This asymptotic level \(K\) is known as the saturation point or the carrying capacity of the population. We can integrate equation (2) to obtain \(N(t)\), whose graph is as follows:
Functions whose curve has this characteristic S-shape, like the Verhulst model of population growth, are called sigmoid functions^{8}. We encounter sigmoid functions in many contexts: as the cumulative distribution function of the normal and Student's t distributions, as the hyperbolic tangent function, and as the activation function of neurons in mathematical neural network models. Then of course there's also the logistic family of functions, which occurs in statistics and machine learning contexts as well as in chemistry, physics, linguistics and sociology (in the latter two contexts the logistic function is used to model language or social innovation/adoption processes). Hence the alternate denomination of the Verhulst model of population growth as the logistic model of population growth. Another feature of this model is that it has an inflection point: once \(N\) reaches \(K/2\), the rate of population growth starts to decelerate until it eventually levels off at zero. At this halfway inflection point, the differential rate of growth is highest and the curve has its "steepest" tangent line. Notice the symmetry of the curve around this inflection point.
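Integrating (2) gives the familiar closed form \(N(t)=K/\left(1+(K/N_0-1)e^{-r_{max}t}\right)\). A quick base-R sketch (with illustrative parameter values of my own choosing) shows the near-exponential start, the asymptotic approach to \(K\), and the inflection at \(K/2\):

```r
# Closed-form solution of the logistic model (2)
logistic <- function(t, N0, K, r) K / (1 + (K / N0 - 1) * exp(-r * t))

t <- seq(0, 100, by = 1)
N <- logistic(t, N0 = 10, K = 1000, r = 0.15)

N[1]                        # starts at N0 = 10
N[length(N)]                # ends just below K = 1000
t[which.min(abs(N - 500))]  # the time at which N crosses K/2, the inflection point
```

Plotting `N` against `t` reproduces the S-shaped curve described above.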
The beauty of the logistic model of population growth lies in its simplicity (only two parameters) and the interpretability of its parameters. The intrinsic growth rate (parameter \(r_{max}\)) is the rate of exponential growth when the population is small, and the carrying capacity parameter \(K\) is simply the maximum population level attainable. One important disadvantage of the logistic model is the fact that its inflection point is exactly half the carrying capacity \(K\), which severely limits the applicability of this model. Nonetheless, Verhulst himself used this model to fit census data in France (1817-1831), Belgium (1815-1833), Essex County (1811-1831) and Russia (1796-1827), all with relative success^{9}.
Variants of the logistic population growth model
In seeking to improve the applicability of the basic logistic model for population growth, many authors have since proposed models with more parameters that still retain the basic sigmoid features of the logistic model and include one inflection point. Given that the basic logistic growth model fixes the inflection point at \(K/2\), Blumberg introduced what he called the hyperlogistic function, whose derivative is given by (3), which we will in turn call the Blumberg model of population growth:
\[\frac{dN}{dt}=r_{max}N^\alpha\left(1-\frac{N}{K}\right)^\gamma\qquad\qquad(3)\]
The Blumberg model introduced two additional parameters to surmount problems raised by treating the intrinsic growth rate as a time-dependent polynomial^{10}, while at the same time contributing some measure of pliancy to the inflection point, given by (4). I would add that equation (3) cannot always be integrated explicitly; Blumberg cataloged the various conditions on \(\alpha\) and \(\gamma\) under which analytic solutions to (3) can be obtained.
\[N_{inf}=\frac{\alpha}{\alpha+\gamma}K\qquad\qquad(4)\]
The Richards growth model, originally developed to fit empirical plant mass data, is given by:
\[\frac{dN}{dt}=r_{max}N\left(1-\left(\frac{N}{K}\right)^\beta\right)\qquad\qquad(5)\]
\[N_{inf}=\left(\frac{1}{1+\beta}\right)^{\tfrac{1}{\beta}}K\qquad\qquad(6)\]
The inflection point in the Richards model is given by (6). Note that for \(\beta=1\), the Richards model is the original logistic growth model with the same inflection point. For extreme cases of \(\beta\), we have \(N_{inf}=e^{-1}K\) when \(\beta\rightarrow 0\) and \(N_{inf}=K\) when \(\beta\rightarrow \infty\).
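These limiting cases are easy to verify numerically with a one-line base-R implementation of equation (6):

```r
# Inflection point of the Richards model (6), normalized so K = 1
richards_inflection <- function(beta, K = 1) (1 / (1 + beta))^(1 / beta) * K

richards_inflection(1)      # 0.5: recovers the logistic inflection at K/2
richards_inflection(1e-8)   # ~0.3679, i.e. exp(-1), the beta -> 0 limit
richards_inflection(1e8)    # ~1, i.e. K, the beta -> Inf limit
```

So by tuning \(\beta\), the Richards model can place the inflection anywhere between roughly 37% and 100% of the carrying capacity.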
A further (and logical) generalization of the Blumberg and Richards models leads us to the so-called generalized logistic growth function with five parameters^{11} (equation 7), where \(\alpha\), \(\beta\) and \(\gamma\) are all positive real numbers. The inflection point for this model is given by (8), which makes sense as long as \(N_0 < N_{inf}\). If \(N_0 > N_{inf}\), then \(N_{inf}\) will never be attained, because the intrinsic growth rate is assumed to be positive.
\[\frac{dN}{dt}=r_{max}N^\alpha\left(1-\left(\frac{N}{K}\right)^\beta\right)^\gamma\qquad\qquad(7)\]
\[N_{inf}=\left(1+\frac{\beta\gamma}{\alpha}\right)^{-\tfrac{1}{\beta}}K\qquad\qquad(8)\]
Yet another growth model similar to the logistic growth model is the Gompertz growth model^{12}. The Gompertz model has been used to describe exponential decay in mortality rates, the return on financial investments^{13} and tumor growth. Its equation takes the following form:
\[\frac{dN}{dt}=-r_{max}N\,\log\left(\frac{N}{K}\right)\qquad\qquad(9)\]
An interesting variant of this model's representation is to use two stock variables, one representing the mass or population that's growing (\(N\)) and the other representing the exponentially diminishing growth rate (\(r\)). So, we would have two differential equations^{14}:
\[\frac{dr}{dt}=-kr\qquad\qquad(10)\]
\[\frac{dN}{dt}=rN\]
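The two formulations are equivalent: integrating \(dr/dt=-kr\) gives \(r(t)=r_0e^{-kt}\), and substituting into \(dN/dt=rN\) yields \(N(t)=N_0\exp\left(\tfrac{r_0}{k}\left(1-e^{-kt}\right)\right)\), so the population levels off at \(K=N_0e^{r_0/k}\). A crude Euler integration in base R (with illustrative values of my own) confirms this:

```r
# Euler integration of the two-stock Gompertz form (10):
# dr/dt = -k*r (decaying growth rate), dN/dt = r*N
k <- 0.1; r <- 0.3; N <- 1; dt <- 0.01
for (i in seq_len(200 / dt)) {
  N <- N + r * N * dt
  r <- r - k * r * dt
}
K_implied <- 1 * exp(0.3 / 0.1)  # closed-form asymptote N0 * exp(r0 / k)
c(N = N, K = K_implied)          # the two values agree closely
```

This makes the role of \(k\) transparent: it controls how quickly the initial exponential burst of growth dies out.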
It must be noted that all the models above imply exponential growth from the very first time instant, when in fact the effective exponential rate of growth is highest, as the population or mass levels are still far from attaining the carrying capacity. In some situations, it would seem that some populations begin in a "dormant" stage, where they do not exhibit this sort of exponential growth. Such may be the case, for example, in societies before the industrial revolution or even before the agricultural revolution, when scarcity was the order of the day and high birth rates were offset by high mortality rates, resulting in population numbers that grew ever so slowly.
While going over the literature in writing this post, I found an interesting two-stage population growth model in Petzoldt (2008). He describes the situation in which a microorganism culture is dormant while it is adapting to its environment, after which it begins to reproduce and increase in numbers according to the logistic growth mechanism outlined above. In other words, in this model, growth is delayed or lagged. Petzoldt's two-stage model for population growth is in fact a two-compartment model (see my entry on epidemiological models, where I discuss compartment models), with one compartment representing the part of the population in the "dormant" stage and the other representing the "active" population:
\[\frac{dN_d}{dt}=-r_1N_d\qquad\qquad(11)\]
\[\frac{dN_a}{dt}=r_1N_d+r_2N_a\left(1-\frac{N_a}{K}\right)\]
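A quick way to see the lag this model produces is to integrate it numerically. The base-R Euler sketch below (parameter values are my own, purely illustrative) shows the dormant pool draining into the active pool, which then grows logistically towards \(K\):

```r
# Euler integration of Petzoldt's two-stage model (11)
r1 <- 0.2; r2 <- 0.5; K <- 100
Nd <- 10; Na <- 0.01; dt <- 0.01
for (i in seq_len(100 / dt)) {
  dNd <- -r1 * Nd
  dNa <- r1 * Nd + r2 * Na * (1 - Na / K)
  Nd <- Nd + dNd * dt
  Na <- Na + dNa * dt
}
c(dormant = Nd, active = Na)  # Nd has emptied out; Na has reached K
```

Early on, almost all of the population sits in \(N_d\), so total numbers barely move; the sigmoid take-off only begins once enough individuals have transferred to the active compartment.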
Hyperbolic models of population growth
Some authors have pointed out that logistic population growth models, although fairly accurate for short-term growth of populations on a scale of centuries, fail to accurately describe the long-term population growth of the entire human race on a scale of thousands of years. Namely, the logistic models do not account for the fact that the world's population has grown from about 2 billion to over 7 billion within the last century, in comparison to which the population growth curve seems almost flat for time periods in the remote past. In a seminal work, Von Foerster, Mora and Amiot^{15} proposed that, according to the data available by 1960, the world's population growth could be described by the following model:
\[\frac{dN}{dt}=\frac{C}{(T_c-t)^2}\qquad\qquad(12)\]
This model has one essential flaw: as \(t\) approaches \(T_c\), the denominator approaches zero and the population shoots up towards infinity. Even when modeling the world's population, this model predicts an ever-increasing population growth rate when in fact, in 1962, the world's relative population growth rate \(1/N\cdot dN/dt\) peaked at 2.1% and has been decreasing ever since^{16}. To avoid the singularity predicted by Von Foerster et al.'s model, the Russian physicist S.P. Kapitsa suggested the following approach:
\[\frac{dN}{dt}=\frac{C}{(T_c-t)^2 + \tau^2}\qquad\qquad(13)\]
The differential equation in (13) can be easily integrated, resulting in:
\[N(t)=\frac{C}{\tau}\operatorname{arccot}\left(\frac{T_c-t}{\tau}\right)\qquad\qquad(14)\]
With (14), Kapitsa fitted his model to the world population data available at that time and found that \(C=(186\pm 1)\cdot 10^9\), \(T_c=2007\pm 1\), and \(\tau=42\pm1\). The parameter \(\tau\) represents, according to Kapitsa, the average lifespan of a human generation.
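Since base R has no built-in arccotangent, we can use the identity \(\operatorname{arccot}(x)=\pi/2-\arctan(x)\) to evaluate (14) with Kapitsa's published estimates (a small sketch of my own, not code from the original paper):

```r
# Kapitsa's world population curve (14), with his fitted parameters:
# C = 186e9, Tc = 2007, tau = 42; arccot(x) = pi/2 - atan(x)
kapitsa <- function(t, C = 186e9, Tc = 2007, tau = 42) {
  (C / tau) * (pi / 2 - atan((Tc - t) / tau))
}

kapitsa(2000) / 1e9  # ~6.2, in line with the roughly 6.1 billion people alive in 2000
```

Note how the \(\tau^2\) term tames the singularity: instead of blowing up at \(T_c\), the curve passes smoothly through an inflection there.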
Models with variable carrying capacity
The logistic models covered so far implicitly assume that the carrying capacity is a constant, presumably dependent on the environment's capacity to sustain a species population. While this might be plausible with different populations of plants and animal species, the human species sets itself apart from the others in one important aspect: its unprecedented and unmatched capacity to significantly alter the environment (and indeed the planet) to suit its purposes.
As a first approach, we could consider that the carrying capacity of the earth is gradually expanded by a growing human population, an idea grounded on the observation that a greater number of people implies greater food productivity and eventually, a greater number of inventions to amplify the productivity of one individual. We would have the following model for population growth:
\[\frac{dN}{dt}=rN\left(1-\frac{N}{K}\right)\qquad\qquad(15)\]
\[\frac{dK}{dt}=\gamma K N\]
In the first equation of (15) we can easily spot our old friend, the Verhulst logistic model. This time however, as the second equation of (15) implies a growing carrying capacity always one step ahead of the population, it doesn't take much to convince ourselves that this model implies a runaway, unbounded population. In fact, it can be shown that prior to the critical time point \(t_c\), this model behaves very much like the hyperbolic model described by equation (12)^{17}. Still, we intuitively know that there must be some physical limits to the earth's population and that human technological innovation cannot push the carrying capacity towards infinity indefinitely. For example, we have already pointed out that since 1962, the relative population growth rate \(1/N\cdot dN/dt\) has been gradually decreasing. This occurred at the same time as the so-called Green Revolution, which brought about significant increases in food productivity and crop yields^{18}. However, the Green Revolution did not bring about a surge in the relative population growth rate; quite the opposite occurred, and the world population's growth seems to be slowing down.
At any rate, there is no reason to suppose that the carrying capacity remains constant throughout human history, and surely, there must be positive and negative influences on the carrying capacity. On the one hand, the idea that each person contributes to the growth in human carrying capacity is not to be discarded entirely. On the other hand, the contribution of each additional person depends on the resources available, and these resources must be shared among more people as the population increases. What is being described here can be summarized in the following population-growth model proposed by Cohen^{19}, which he called the Condorcet-Mill model, after the British philosopher John Stuart Mill (1806-1873), who is cited as foreseeing that a stationary population is both inevitable and desirable.
\[\frac{dN}{dt}=rN\left(1-\frac{N}{K}\right)\qquad\qquad(16)\]
\[\frac{dK}{dt}=\frac{L}{N}\frac{dN}{dt}\]
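Because the second equation of (16) integrates to \(K=K_0+L\ln(N/N_0)\), the carrying capacity grows ever more slowly as the population increases, and the population settles where \(N=K\). A base-R Euler sketch with illustrative values of my own:

```r
# Euler integration of Cohen's Condorcet-Mill model (16): each unit of
# population growth raises the carrying capacity by only L/N
r <- 0.1; L <- 50; N <- 10; K <- 100; dt <- 0.01
for (i in seq_len(2000 / dt)) {
  dN <- r * N * (1 - N / K)
  K <- K + (L / N) * dN * dt
  N <- N + dN * dt
}
c(N = N, K = K)  # N and K converge to a common stationary level
```

Unlike model (15), the diminishing per-capita contribution \(L/N\) prevents the carrying capacity from running away, so the population is bounded.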
Introducing the FME package and some comments on the world population data we'll be using
We have finally come to the part of this blog post in which we will try to fit the world population data to the various population growth models covered. The data was obtained from the ourworldindata.org site^{20} and consists of a tabular csv file with two columns, which I have renamed "time" (the year) and "pop" (population). This data covers a wide timespan, from 10,000 BC until 2015, and to be perfectly honest, I cannot vouch for the validity of the population counts in the remote past, nor do I have any idea as to how they obtained census data for this time period. I'll just assume that the population data for the prehistoric and ancient time periods is fairly accurate. At any rate, this large-timescale population data allows us to compare the family of models based on logistic population growth to the singular approach by Kapitsa.
For the model fitting process, we will use an R package called FME (FME stands for Flexible Modelling Environment). FME introduces added functionality over the simecol package to aid in the process of system dynamics model building:
- FME's modFit incorporates additional nonlinear optimization methods for parameter fitting over simecol's fitOdeModel: "Marq" for the Levenberg-Marquardt algorithm, which is a least squares method, "Pseudo" for a pseudo-random search algorithm, "Newton" for a Newton-type algorithm, and "bobyqa" for derivative-free optimization using quadratic approximations.
- Global sensitivity analysis, which consists of assessing the change of certain model output variables according to changes in parameters over a large area of the parameter space, is done in FME via the sensRange function.
- The sensFun function in the FME package provides a way to compute sensitivity functions in order to perform a local sensitivity analysis, which consists of studying the effects of one parameter as it varies over a very small region near its nominal value.
- The sensitivity functions are also used by FME to estimate the collinearity index of all possible subsets of the parameters. This has to do with the identifiability of the parameters in the model, a concept well worth delving into.
The nonlinear optimization methods used to fit ODE models like the ones I have been dealing with in this series of posts are very different from methods like the ordinary least squares used in linear regression, for example. The ordinary least squares method, on account of being linear in the model parameters, will always come up with a unique estimation of these parameters. The least squares estimation in linear regression is unique because there's only one global minimum in the parameter search space. In the case of nonlinear optimization problems like the ones we deal with in ODE models, however, the parameter search space is much more complex and there may be many local minima. This situation substantially complicates ODE model fitting, particularly when the model has several parameters.
When one is fitting ODE models, different initial "guesses" for the parameter values can result in entirely different parameter estimates and indeed, in entirely different behavior of the model's variables. Determining just how much small changes in the initial parameter values affect the model's variables is the reason why we perform sensitivity analysis on the models. It may occur that some parameters are linearly or almost linearly dependent on others, and just like in the linear regression case, this multicollinearity negatively affects the identifiability of the model's parameters. In particular, the collinearity index given by the FME collin function gives a measure of the parameter set's identifiability based on the data we use to fit the model.
The collin function provides the collinearity index for all possible subsets of a model's parameters. The larger the collinearity index for any given set of parameters, the less identifiable those parameters are. As a rule of thumb, a parameter set with a collinearity index less than 20 is identifiable^{21}.
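Under the hood, the collinearity index of a parameter subset is computed from the model's sensitivity matrix \(S\) (one column per parameter): normalize each column to unit length, then take \(1/\sqrt{\lambda_{min}}\), where \(\lambda_{min}\) is the smallest eigenvalue of \(\tilde{S}^T\tilde{S}\). The sketch below is my own base-R reimplementation of that idea, not FME's actual code:

```r
# Collinearity index of a sensitivity matrix S (columns = parameters):
# normalize columns to unit length, then 1/sqrt of the smallest
# eigenvalue of t(S) %*% S
collinearity_index <- function(S) {
  S_norm <- sweep(S, 2, sqrt(colSums(S^2)), "/")
  1 / sqrt(min(eigen(crossprod(S_norm), symmetric = TRUE)$values))
}

# Toy example: columns 2 and 3 are nearly proportional, so that pair is
# poorly identifiable; columns 1 and 2 are orthogonal, hence index 1
S <- cbind(c(1, 0, 0, 1), c(0, 1, 1, 0), c(0, 1, 1, 0.1))
collinearity_index(S[, c(1, 2)])  # 1: perfectly identifiable pair
collinearity_index(S[, c(2, 3)])  # ~20: right at the rule-of-thumb limit
```

Intuitively, a large index means the data cannot distinguish a change in one parameter from a compensating change in the others.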
Like sands through the hourglass, so are the days of our lives
We now begin the process of fitting various population models to the world population data described above. We will begin with the classic (Verhulst) Logistic growth model with 2 parameters:
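The original fitting code (which uses simecol's odeModel together with FME's modCost and modFit) is not reproduced here. As a stand-in, the base-R sketch below illustrates the same least-squares idea on synthetic data, with optim() taking the place of modFit; the data, parameter values, and noise level are illustrative only:

```r
# Least-squares fitting of the logistic model, sketched with base R:
# generate noisy data from known parameters, then recover them
logistic <- function(t, N0, r, K) K / (1 + (K / N0 - 1) * exp(-r * t))

set.seed(1)
t_obs <- seq(0, 100, by = 5)
N_obs <- logistic(t_obs, N0 = 5, r = 0.1, K = 500) *
  exp(rnorm(length(t_obs), sd = 0.02))  # 2% multiplicative noise

# Sum-of-squares cost, analogous to what FME's modCost computes
ssq <- function(p) sum((logistic(t_obs, N0 = 5, r = p[1], K = p[2]) - N_obs)^2)
fit <- optim(c(r = 0.05, K = 300), ssq)

fit$par    # estimates should land near the true r = 0.1, K = 500
fit$value  # residual sum of squares at the optimum
```

The real fit proceeds analogously, except that the logistic curve is produced by numerically integrating the ODE against the world population data rather than evaluated in closed form.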
The model's output, the parameter estimates, the sum-of-squares measure of goodness of fit, and the collinearity index of the parameter set, is given below, along with the plot of the real population curve (in red) and the fitted model (in cyan). I have not included the source code for generating these plots, but could do so if any reader is interested.
r K
6.065198e04 9.999754e+11
SC : 92.33517
r K N collinearity
1 1 1 2 7.9
Let's interpret the results above. First of all, with a collinearity index of 7.9, the model's parameters are identifiable, so that's not an issue. An inspection of the r and K parameters reveals that the Verhulst logistic growth model predicts a maximum world population of almost \(1\cdot 10^{12}\), or one trillion inhabitants. Considering we're currently at around 7.3 billion inhabitants, we would have a long way to go before we reach that saturation point. One look at the graph reveals that this model is simply not adequate. The model's curve seems to bulge too much above the real population curve, indicating that this model predicts too much population growth at remote times in the past, when we know that the real growth was barely noticeable. All the while, the model predicts a population of 3-4 billion in 2015, when we know the real population level to be over 7.3 billion this year. So we have two major flaws with this model: 1) it predicts the modern, explosive population growth to begin much sooner than it really did, and 2) in modern times (1800-present), the model's predicted growth is simply not fast enough compared to the real data. These flaws are confirmed by the model's high sum-of-squares value of 92.34.
One of the limitations of the classic logistic growth model is the inability to "tweak" the inflection point. In this case, with the model's K value estimated at 1 trillion, the inflection point would be half of that, or 500 billion, much too far into the future. We know for a fact that the increase in the world population growth rate started to slow down during the 1960's, so what we would need is a logistic growth model in which the inflection point can be closer to the maximum population level. We will next attempt to fit the generalized logistic growth model, which would, in theory, afford us the flexibility to adjust the inflection point.
The problem with the generalized logistic growth model is the large number of parameters (5) for just one differential equation. This results in a very complex parameter space with potential problems in parameter identifiability. To explore this situation, I decided to sample one hundred initial parameter guess values (N=100) from a uniform probability distribution with a given range for each parameter, storing these initial values in a data frame (see code snippet below), from which I proceeded to use them as initial values for the model fitting procedure.
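The original sampling snippet is not reproduced here; a base-R reconstruction of what it describes might look like this (the ranges are my own illustrative assumptions, not the values actually used in the post):

```r
# Sample 100 random initial parameter guesses for the generalized
# logistic model; gamma is restricted to integers (see the note below)
set.seed(42)
n <- 100
init_guesses <- data.frame(
  r     = runif(n, min = 1e-6, max = 1e-3),
  K     = runif(n, min = 1e10, max = 1e12),
  alpha = runif(n, min = 0.5,  max = 3),
  beta  = runif(n, min = 0.5,  max = 3),
  gamma = sample(1:3, n, replace = TRUE)
)
head(init_guesses)
```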
I must comment on the reason why I sampled only integer values for the gamma (\(\gamma\)) parameter. Taking a look at model equation (7), we see that if the expression within parentheses on the right-hand side that's raised to the gamma exponent ever becomes negative, this would raise numerical issues, because we'd be raising a negative base to a fractional exponent. While there are possibly ways around this issue, it was simpler for me to just use integer values for gamma. When fitting the generalized logistic model using these 100 different initial parameter sets, I obtained different parameter estimations with different sum-of-squares goodness-of-fit values, which I wrote to a csv file with the following code:
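That fitting loop is likewise not reproduced here. The base-R sketch below captures its structure, one fit per initial guess, collecting the parameter estimates and sum-of-squares (SC) into a table written out as csv, but it runs on synthetic data, with optim() standing in for FME's modFit and only two guesses for brevity:

```r
# Generalized logistic growth (7), integrated with a simple Euler scheme
gen_logistic <- function(p, times, N0 = 5) {
  # p = c(r, K, alpha, beta, gamma)
  N <- numeric(length(times)); N[1] <- N0
  dt <- diff(times)
  for (i in seq_along(dt)) {
    dN <- p[1] * N[i]^p[3] * (1 - (N[i] / p[2])^p[4])^p[5]
    N[i + 1] <- N[i] + dN * dt[i]
  }
  N
}

times <- seq(0, 100, by = 1)
N_obs <- gen_logistic(c(0.1, 500, 1, 1, 1), times)  # synthetic "data"

# Guard against non-finite costs (e.g. a negative base raised to a
# fractional exponent, the gamma issue discussed above)
ssq <- function(p) {
  s <- sum((gen_logistic(p, times) - N_obs)^2)
  if (is.finite(s)) s else 1e12
}

guesses <- list(c(0.05, 300, 1, 1, 1), c(0.2, 800, 1, 1, 1))
results <- do.call(rbind, lapply(guesses, function(g) {
  fit <- optim(g, ssq)
  data.frame(r = fit$par[1], K = fit$par[2], alpha = fit$par[3],
             beta = fit$par[4], gamma = fit$par[5], SC = fit$value)
}))
write.csv(results[order(results$SC), ], tempfile(fileext = ".csv"),
          row.names = FALSE)
results$SC  # one sum-of-squares value per starting guess
```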
Choosing the parameter set with the least sum-of-squares (SC) value, I ran the simulation of the model and obtained the following graph and the following collinearity indexes:
r K alpha beta gamma N_inf SC
1.836653e05 99966063867 1.19538 1.996857 1.009536 60943709007 76.60192
r K alpha beta gamma N collinearity
1 1 1 0 0 0 2 5.8
2 1 0 1 0 0 2 214.0
3 1 0 0 1 0 2 6.2
4 1 0 0 0 1 2 5.8
5 0 1 1 0 0 2 6.0
6 0 1 0 1 0 2 81.3
7 0 1 0 0 1 2 6442.9
8 0 0 1 1 0 2 6.3
9 0 0 1 0 1 2 5.9
10 0 0 0 1 1 2 81.0
11 1 1 1 0 0 3 403.1
12 1 1 0 1 0 3 131.9
13 1 1 0 0 1 3 9122.3
14 1 0 1 1 0 3 425.7
15 1 0 1 0 1 3 403.4
16 1 0 0 1 1 3 133.0
17 0 1 1 1 0 3 133.7
18 0 1 1 0 1 3 9018.9
19 0 1 0 1 1 3 6751.7
20 0 0 1 1 1 3 134.9
21 1 1 1 1 0 4 608.5
22 1 1 1 0 1 4 14594.6
23 1 1 0 1 1 4 11876.4
24 1 0 1 1 1 4 604.7
25 0 1 1 1 1 4 11704.0
26 1 1 1 1 1 5 14777.0