
Tuesday, September 6, 2016

R Puzzle No. 1 - Five Sailors, a Monkey and some Coconuts


In a brief diversion from the simecol dynamic system models, which I promise to resume in the future, I'll start another series of entertaining posts about solving mathematical/logic puzzles using R, of which this post you're reading is the first. This time around, we'll take a crack at the Problem of the Five Sailors, One Monkey and a Pile of Coconuts. Read on to find out what this is all about.


The Puzzle

This problem first appeared in the Saturday Evening Post in the 1920s and was later popularized by Martin Gardner. It goes as follows:

Five sailors arrive at a deserted island that has only coconuts and one monkey. After laboriously collecting all the coconuts into one big pile, the sailors all agree to divide up the coconuts into equal shares the next morning. However, in the middle of the night a sailor wakes up and, not trusting the others, decides to give the monkey one coconut (to buy his silence?) and takes away his 1/5 share of the coconuts, leaving the rest in the pile. The other 4 sailors wake up at different time intervals and each does the same thing: give one coconut to the monkey and stash away his 1/5 share of the coconuts that remain in the pile. In the morning they divide what is left of the pile into equal shares and there is still one coconut left for the monkey. How many coconuts were in the pile the day before?

Readers in the 1920s did not have the benefit of Google or the Internet, so after racking their brains for a while, many gave up in frustration and soon the premises of the newspaper were flooded by an angry mob bearing pitchforks and torches, demanding to know the answer¹. Now, before you fire up the Google search engine for the answer to this conundrum, let's first try a...

Brute-force solution

By brute-force, I really do mean brute as opposed to smart. It would seem that most people think mathematicians arrive at sophisticated answers directly in one step. Not me - I'm a dumb one, so I usually explore the problem situation and try a little brute-force searching first. I don't actually have to do the computations myself; that's what R is for. Only after I understand the problem more thoroughly do I attempt to tackle it using more formal mathematical artifices².

Our brute-force strategy will be to try out successive integer values for the size of the coconut pile, searching for one where we can deal out the coconut pile to the 5 sailors and the monkey under the conditions described by the problem. For each candidate integer value, we will simulate each of the five sailors taking their shares and giving one to the monkey, and in the end, dividing the pile into five equal shares plus one for the monkey. The key here is that the divisions in each step must be exact, since we can't deal with fractions of coconuts - otherwise the problem would be too easy, wouldn't it? Here is my brute-force solution script:

#------------------------------------------------------------------
# Five Sailors, a monkey and a pile of coconuts
# José Romero - 05/09/2016
#------------------------------------------------------------------

n_sailors <- 5
n_iterations <- 6     #five nightly divisions plus the final one at daybreak
max_bound <- 1000000  #upper limit of the brute-force search

verify <- function(pile, verbose=FALSE) {
  #simulates the successive divisions of the pile, recording one row
  #per division in which the monkey gets exactly one coconut
  solution_DF <- data.frame()
  for (i in 1:n_iterations) {
    sailors_share <- pile%/%n_sailors
    for_monkey <- pile%%n_sailors
    if (for_monkey == 1) { 
      solution_DF <- rbind(solution_DF,
                           data.frame("Pile"=pile,
                             "Sailor.s.share"=sailors_share,
                             "For.monkey"=for_monkey))
      pile <- pile - (sailors_share+for_monkey)
    } 
  }
  return(solution_DF)
}

solve_monkey_coconuts <- function(max_bound) {
  #brute-force search: the first candidate passing all 6 divisions wins
  for (i in 1:max_bound) {
    if (nrow(verify(i))==6) return(i)
  }
  return(0)
}

solution <- solve_monkey_coconuts(max_bound)
if (solution>0) {
  cat("Number of coconuts in pile: ",solution,"\n")
  print(verify(solution))
} else sprintf("No solution in [1,%d]",max_bound)
     Number of coconuts in pile:  15621 
        Pile Sailor.s.share For.monkey
     1 15621           3124          1
     2 12496           2499          1
     3  9996           1999          1
     4  7996           1599          1
     5  6396           1279          1
     6  5116           1023          1
There it is - the pile originally had 15621 coconuts and there were 1023 coconuts for each sailor (and one for the monkey) at daybreak. The one thing that strikes us when we run the script above is how agonizingly slow it is. Since we initially have no idea how large the solution is, we brute-search the first one million positive integers using the for loop in solve_monkey_coconuts³.

To check whether an integer candidate is a solution to the problem, we simulate 6 partitionings (one for each of the five sailors and the final division of the coconuts done in the morning). Each partitioning consists of dividing the coconut pile into five equal parts and verifying that the remainder (integer modulo division) is one, which is the coconut for the monkey. If at some point in the process we do not get a remainder of 1, no further rows are recorded, so the candidate cannot complete all 6 iterations. This is what the verify function does.
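As footnote 3 hints, the loop-heavy search can also be vectorized. Here is a minimal sketch of that idea (it assumes the same one-million upper bound): keep a vector of candidate pile sizes and, at each of the six divisions, discard those whose remainder is not one.

cand <- 1:1000000   #candidate sizes of the original pile
pile <- cand        #what is left of each candidate pile
for (i in 1:6) {
  keep <- pile %% 5 == 1          #the monkey must get exactly one coconut
  cand <- cand[keep]
  pile <- pile[keep]
  pile <- pile - pile %/% 5 - 1   #remove one share plus the monkey's coconut
}
cand[1]   #smallest surviving candidate
## [1] 15621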

Finally, note the output shown below the first script, indicating how many coconuts remain in the pile, the share for each sailor and the share for the monkey at each iteration. Could we have somehow guessed this would be the answer? Well, we could have ventured that since we were going to divide the pile by 5 six times, the answer would need to be a multiple of \(5^6=15625\), this being our approximate answer. Of course, we would have known this could not be the exact answer to the puzzle because we had not taken into consideration the part about giving one coconut to the monkey at each step, which complicates things a great deal.

Diophantine equations to the rescue

Now that we know the answer to the problem, let's see if we can deduce it mathematically. Let \(C_0\) be the number of coconuts the five sailors originally collected. When the first sailor comes along in the middle of the night, he divides this pile into five equal parts and gives the remaining coconut to the monkey. So if \(S_1\) is his one fifth share of the coconuts, then the following equation holds: \(C_0=5S_1+1\). So when the next sailor comes along, there will be \(C_1=4S_1\) coconuts in the pile, the previous sailor having taken his share, plus one for the monkey. But if he also divides that pile into 5 equal parts and throws the remaining coconut to the monkey, then \(C_1=5S_2+1\) also holds, and so on. Writing down all six of these equations describing the division of coconuts at each stage:

\[\begin{align*} C_0&=1+5S_1\\ C_1=4S_1&=5S_2+1\\ C_2=4S_2&=5S_3+1\qquad\qquad(1)\\ C_3=4S_3&=5S_4+1\\ C_4=4S_4&=5S_5+1\\ C_5=4S_5&=5S_6+1 \end{align*} \]
Oh great, six equations and twelve unknowns?! Please bear with me a little longer. With the equations in (1), we can try to express five of the \(S_i\)'s as functions of their successors \(S_{i+1}\):

\[\begin{align*} S_1&=\frac{1+5S_2}{4}\\ S_2&=\frac{1+5S_3}{4}\\ S_3&=\frac{1+5S_4}{4}\qquad\qquad(2)\\ S_4&=\frac{1+5S_5}{4}\\ S_5&=\frac{1+5S_6}{4} \end{align*}\]
Some backward substitution, starting with the last two equations in (2) and working up to the top equation, allows us to express \(S_1\) in terms of \(S_6\). Substituting this into the first equation of (1), we get:

\[ C_0=\frac{5}{4}\left(\frac{5}{4}\left(\frac{5}{4}\left(\frac{5}{4}\left(\frac{5}{4}\left(5S_6+1\right)+1\right)+1\right)+1\right)+1\right)+1\qquad(3) \]
Working with this last equation:

\[ C_0=\frac{5^6}{4^5}S_6+\sum_{i=0}^5 \left(\frac{5}{4}\right)^i\qquad(4) \]
In the right hand side of (4) we have a geometric series so we apply the pertinent formula:

\[ C_0=\frac{5^6}{4^5}S_6+\frac{\left(\tfrac{5}{4}\right)^6-1}{\tfrac{5}{4}-1}\qquad(5) \]
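The quotient on the right of (5) simplifies as follows:

\[ \frac{\left(\tfrac{5}{4}\right)^6-1}{\tfrac{5}{4}-1}=4\left(\left(\tfrac{5}{4}\right)^6-1\right)=\frac{5^6}{4^5}-4 \]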
This means that:

\[\begin{align*} C_0&=\frac{5^6}{4^5}-4+\frac{5^6}{4^5}S_6\quad\rightarrow\\ 4^5C_0&=5^6-4^6+5^6S_6\quad\rightarrow\quad&(5a)\\ 1024C_0&-15625S_6=11529\qquad\quad&(5b) \end{align*}\]
Equation (5b), which looks just like an innocent linear equation in two variables, is called a linear diophantine equation. If \(C_0\) and \(S_6\) were allowed to assume real values, we would simply have infinitely many solutions, but there's a catch here: we're looking for integer values of \(C_0\) and \(S_6\) as solutions to (5b). This is typical of diophantine problems, which usually involve fewer equations than unknowns but in which integer solutions are required.

What comes next is a little sleight of hand. Due to the symmetry in equation (5a), we can see that if \(C_0\) and \(S_6\) are one solution to the equation, so are \(C_0'=C_0+5^6\) and \(S_6'=S_6+4^5\). If we substitute \(C_0'\) for \(C_0\) and \(S_6'\) for \(S_6\) in equation (5a) and notice that the \(4^5 5^6\) terms cancel out on both sides of the equality, the resulting equation is essentially the same:

\[ 4^5(C_0+5^6)=5^6(S_6+4^5)+5^6-4^6 \]
What this all amounts to is that we can find other solutions by adding some constants to each of the variables in a known solution pair. But, how do we find one solution pair? The trick is to add \(4^6\) to both sides of (5a) and to factor out both sides:

\[ 4^5(C_0+4)=5^6(S_6+1)\qquad(6) \]
Now this last equation is easy to solve. We just need to make both sides of (6) equal to zero and obtain \(C_0=-4\) and \(S_6=-1\). But the problem is, how can we have negative coconuts?⁴ This is not really a problem, because if we add \(5^6\) to \(C_0\) and \(4^5\) to \(S_6\), we get another solution pair that is just as mathematically valid, but more sensible given the problem. We finally get \(C_0=15625-4=15621\) and \(S_6=1024-1=1023\), which agrees with the solution we obtained by brute force.
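A quick sanity check in R confirms that this pair does satisfy equation (5b):

1024 * 15621 - 15625 * 1023
## [1] 11529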

Taking a crack at diophantine equations with R

We're not about to give up on trying to find a more efficient and possibly vectorized way to solve this problem using R. I looked for an R package to solve diophantine equations and found that the "elliptic" package has a function called "congruence" which deals with linear diophantine equations of the type \(mx+ny=l\).

To be honest, I could never get this function to give me the solution to equation (5b). Poring through the package's vignette, I encountered some frightening number-theoretic stuff, so given my ignorance of and aversion to number theory⁵, I gave up on using this package, but before doing so, I inspected the R code for the congruence function:

function (a, l = 1) 
{
    l <- as.integer(l)
    m <- as.integer(a[1])
    n <- as.integer(a[2])
    zero <- as.integer(0)
    one <- as.integer(1)
    if (m == zero & n == one) {
        return(NULL)
    }
    if (m == one & n == zero) {
        return(c(NA, 1))
    }
    if (m == 1) {
        return(rbind(c(a), c(1, n + l), c(0, l)))
    }
    if (n == 1) {
        return(rbind(c(a), c(m - l, 1), c(l, 0)))
    }
    q1 <- which((+l + (1:m) * n)%%m == zero)
    if (!any(q1)) {
        q1 <- NA
    }
    q2 <- which((-l + (1:n) * m)%%n == zero)
    if (!any(q2)) {
        q2 <- NA
    }
    out <- rbind(a, cbind(q1, q2))
    rownames(out) <- NULL
    colnames(out) <- NULL
    return(out)
}
Hmmm, the code above doesn't look so inscrutable. I mean, it uses the expected integer modulo operator in R (the %%'s in the code). Maybe if I knew a little bit more about diophantine equations I could reimplement this function from scratch and get it to work. So I looked up the Wikipedia article on diophantine equations and found the following theorem:

The linear diophantine equation \(ax+by=c\), where \(a\),\(b\),\(c\),\(x\) and \(y\) are integers, has a solution if and only if \(c\) is a multiple of the greatest common divisor of \(a\) and \(b\). Moreover, if \((x, y)\) is a solution, then the other solutions have the form \((x + kv,y − ku)\), where \(k\) is an arbitrary integer, and \(u\) and \(v\) are the quotients of \(a\) and \(b\) (respectively) by the greatest common divisor of \(a\) and \(b\).

This theorem is really handy for two reasons: (1) it gives us the conditions for the existence of solutions to the diophantine equation and (2) it gives us a mechanism for generating infinitely more solutions if we are able to find one solution. This last point, finding one solution, is what I would have to tackle in my implementation of the code.
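As an aside, there is a classical way to obtain that first solution which I do not use below: the extended Euclidean algorithm, which returns Bézout coefficients \(x_0\), \(y_0\) with \(ax_0+by_0=\gcd(a,b)\) that can then be scaled by \(c/\gcd(a,b)\). A minimal sketch follows (the function name ext_gcd and the sign handling are my own; the coefficients are passed in with positive magnitudes):

ext_gcd <- function(a, b) {
  #returns c(g, x, y) such that a*x + b*y == g == gcd(a, b)
  if (b == 0) return(c(a, 1, 0))
  r <- ext_gcd(b, a %% b)
  c(r[1], r[3], r[2] - (a %/% b) * r[3])
}

r <- ext_gcd(1024, 15625)          #r[1] is 1, so 1024*r[2] + 15625*r[3] = 1
C0 <- (11529 * r[2]) %% 15625      #smallest non-negative C0 in the solution family
S6 <- (1024 * C0 - 11529) / 15625  #the matching S6 from equation (5b)
c(C0, S6)
## [1] 15621  1023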

I decided that I would look for a first solution by separating each variable to one side of the \(ax+by=c\) equation and getting \(ax=c-by\) or \(by=c-ax\). Then, taking \(ax=c-by\) for example, I would have my code try out a range of values for \(y\) to verify when the expression \(c-by\) is divisible by \(a\). Intuitively, I realized that the range of values to try for \(y\) would be all integers between 0 and \(a\) (when using \(by=c-ax\), I would have to try all integers between 0 and \(b\) for \(x\)), so it's not like I would have to search through an infinitely large solution space.

Having found a value of \(x\) or \(y\), which is guaranteed to exist if the solution existence conditions in the Wikipedia theorem are met, I would substitute it into the equation to obtain the other value. Finally, I would have to implement a function for finding the greatest common divisor. All in all, I came up with the following code:

gcd <- function(m,n) {
  #greatest common divisor function
  m <- as.integer(abs(m))
  n <- as.integer(abs(n))
  divm <- (1:m)[which(m%%(1:m)==0)]
  divn <- (1:n)[which(n%%(1:n)==0)]
  return(max(divm[which(divm%in%divn)]))
} #gcd

diophantine <- function (m,n,l) {
  #solves the linear diophantine equation: mx+ny=l
  l <- as.integer(l)
  m <- as.integer(m)
  n <- as.integer(n)
  gcd <- gcd(m,n)
  if (l%%gcd!=0) return(NA) #no solution exists
  #otherwise obtain one solution (solx,soly)
  if (abs(m)>abs(n)) {
    #use multiples of the lower magnitude n
    mult <- (0:n)[which((l-m*(0:n))%%n==0)]
    solx <- mult[1]; soly <- (l-m*solx)%/%n
  } else {
    #use multiples of the lower magnitude m
    mult <- (0:m)[which((l-n*(0:m))%%m==0)]
    soly <- mult[1]; solx <- (l-n*soly)%/%m
  }
  return(rbind(c(solx,soly),c(n%/%gcd,-m%/%gcd)))
} #diophantine

diophantine(1024,-15625,11529)
##        [,1]  [,2]
## [1,]  15621  1023
## [2,] -15625 -1024
As you can see, it works like a charm. This is code I can be proud of (look ma, no for/while loops!). My function not only gives me the solution in the first row of the results matrix, but also the coefficients whose multiples I would have to add to \(x\) and \(y\) to get other solution pairs. I haven't really proven that it works for all possible values of m, n and l, but I tried a few example problems which I solved by hand and it seems to be OK. Needless to say, I invite my readers to put this code under close scrutiny (you know what they say about many pairs of eyes seeing more than one). Perhaps some of you will find errors in the code, or a test case which "breaks" it. Also, I invite my readers to come up with an R function for solving general linear diophantine equations in 3 or more variables. For further discussion or questions about this, I invite you to participate in the comments on this post.

Notes

  1. I guess I got carried away with that pitchfork- and torch-bearing angry mob bit, but I do wish we Venezuelans would organize into such a mob to evict Maduro from power. Ironically, it seems like in the days when there were no social networks or smartphones, people somehow organized into action more effectively - for example during the 1958 ousting of the Pérez Jiménez dictatorship. Now, we somehow expect the Internet to "do things for us", while we stay at home.
  2. When teaching math, it is never a good idea to just go ahead and show students the "clean" mathematical solution. Teaching math is all about modeling the problem-solving process, which in real life, is usually a messy, trial-and-error roundabout affair. When we have become experts in making mistakes and following dead-end leads, we will have gained sufficient familiarity with the problem and only then can we hope to come up with a clean solution.
  3. R is not particularly fast when running for/while loops. What R programmers always strive to do is to vectorize their code. Vectorization consists of avoiding for/while loops altogether and working with vectors and functions of vectors. So you could say that in R, using for or while loops in a brute-force search is the most brutish way to go about it.
  4. I wonder what negative coconuts taste like...
  5. My apologies to number theorists (they would cringe at a blog like this anyway), but number theory seems to me like the most useless branch of mathematics. I mean, of what practical use is Fermat's Last Theorem or Goldbach's Conjecture? Sure, probably the most difficult and even unresolved or unproven math problems are buried deep within number theory. Sure, solving these problems would be an instant ticket to mathematical fame and fortune - the stuff of legends. As for me, I prefer to steer clear of things for which I just don't have the talent, so that I won't suffer the same fate as Uncle Petros. Uncle Petros, by the way, is the main character in an extremely entertaining novel about mathematics by Apostolos Doxiadis, called Uncle Petros and Goldbach's Conjecture.

Bibliography

  • Diophantine equation. (2016, July 26). In Wikipedia, The Free Encyclopedia. Retrieved September 4, 2016, from https://en.wikipedia.org/wiki/Diophantine_equation
  • HANKIN, R. K. S. (February 2006). Introducing Elliptic, an R package for Elliptic and Modular Functions. Journal of Statistical Software, Volume 15, Issue 7.
  • PADILLA, T. and McPARTLAN, P. [Numberphile] (2015, April 29). Monkeys and Coconuts - Numberphile [Video file]. Retrieved September 4, 2016, from https://www.youtube.com/watch?v=U9qU20VmvaU
  • R DEVELOPMENT CORE TEAM (2009). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org.


If you found this post interesting or useful, please share it on Google+, Facebook or Twitter so that others may find it too.

Wednesday, August 31, 2016

Simulation models of epidemics using R and simecol

In this post we'll dip our toes into the waters of epidemiological dynamics models, using R and simecol, as we have done in the previous two posts of this series. These models of epidemics are interesting in that they introduce us to a more general class of models called compartment models, commonly used in the study of biological systems. This "compartment point of view" will prove to be a useful tool in modeling, as we shall see in future posts. As usual, after a brief theoretical and mathematical rundown of the various types of epidemic models, we shall fit one of those models to some data using R's simecol package.



Infectious or Epidemic Processes

When studying the spread of infectious diseases, we must take into account the possible states of a host with respect to the disease:
  • Susceptible (S) These are the individuals susceptible to contracting the disease if they're exposed to it.
  • Exposed (E) These individuals have contracted the disease but are not yet infectious, and are thus still not able to pass the disease on to the susceptible (S) group.
  • Infectious (I) After having been exposed, these individuals are now able to pass the disease on to other individuals in the S group. This group should not be confused with the infected group of individuals, which are those that are either exposed (E) or infectious (I).
  • Recovered (R) Having gone through the previous phases of the disease, individuals recover because they have developed an immunity to the disease (permanent or temporary). This group is neither susceptible to the disease nor infectious¹.
Not all epidemiological models will include all four of the above classes or groups, and some models with a greater degree of sophistication may include more (Berec, 2010). For example, there may be diseases for which it is useful to consider a group of chronic carriers. Moreover, mothers may have been previously infected with a disease and, having developed antibodies, pass these on to their newborn infants through the placenta. These newborn infants are temporarily immune to the disease, but later move into the susceptible (S) group after the maternal antibodies disappear. These newborn infants would then constitute a group of passively immune (M) individuals. On the other hand, it is possible that all these groups may not be as homogeneous as one may think. Different social, cultural, demographic or geographical factors may affect or be related to the mode of transmission, rate of contact, infection control, and overall genetic susceptibility and resistance.

Based on the selection or omission of these classes or compartments, which in itself implies some assumptions about the characteristics of the specific disease we're trying to model, there are several acronyms used to name these models. For example, in an SI model, also known as a simple epidemic model, the host never recovers. An SIS model is one in which the susceptible population becomes infected and then, after recovering from the disease and its symptoms, becomes susceptible again. The SIS model implies that the host population never becomes immune to the disease. An SEIR model is one in which there is an incubation period: susceptible individuals first become exposed (but not yet infectious), later enter the infectious group when the disease has incubated, and finally enter the R group when they cease to be infectious and develop immunity. An SIR model is basically the same as the SEIR model, but without an incubation period, and so on.

Compartment models

We can see from the last paragraph on different epidemic models that these attempt to describe how the individuals in a population leave one group and enter another. This is characteristic of compartment, or "box", models, in which individuals in a population are classified at any given time into different compartments or boxes according to the state in which they find themselves. Compartment models are commonly used to study biological or chemical systems in which the main interest is the transport or change of state of materials or substances over time².

In the process of dynamic model building, it is often useful to think of the state variables as boxes or compartments that "hold" a certain amount of substance at any given time, even though "substance" or "material" may refer to discrete entities such as individuals. In fact, because system dynamics models are governed by differential equations, modeling discrete entities through quantities that vary continuously is inevitable. Notwithstanding, thinking in terms of compartments is useful because it forces us to reflect on the mechanisms that affect how individuals enter or leave a compartment. For example, in the Lanchester model of the Battle of Iwo Jima there were two compartments: American troops and Japanese troops. In that case, individuals would not leave one compartment to enter another, because we were not contemplating soldiers defecting from their original army to fight for the other side. In the epidemic models, however, we do contemplate individuals leaving one compartment and entering another as they progress through the various stages of the disease. The utility of the "compartment point of view" in modeling lies in the principle that if there is a group of entities whose dynamic behavior is different from that of other groups, even though they are apparently the same type of entity, each group should have its own compartment. We shall explore this idea in more detail when we cover population growth models in a future post in this blog.

SIS models

SIS models of epidemics comprise two compartments: S (susceptible) and I (infected). In what follows we will assume that:

  1. There is a finite population \(N\) that remains constant at all times.
  2. The epidemic cycle is sufficiently short to assume that births, deaths, immigration/emigration or any other event that modifies the population do not occur.
  3. Individuals transition from being susceptible to the disease, then to being infected, and back again to being susceptible as they recover. The disease is such that no permanent or temporary immunity is developed, and there are no mortalities.
  4. The number of infections of susceptible individuals that occur is proportional to the number of contacts between infected and susceptible individuals.
  5. At each time unit, a certain percentage \(\kappa\) of infected individuals recover from the disease and become susceptible again. This amounts to stating that the mean duration of the infected/infectious period is \(1/\kappa\) time units.
This is a very simple model and some of the above assumptions may be unrealistic or inapplicable to all SIS epidemic processes, but nevertheless it is a useful model for illustrating the emergent behavior of some systems. These assumptions are usually expressed in the literature in differential equation form as follows:

\[\frac{dS}{dt}=\kappa I - \alpha\gamma S\cdot I\] \[\frac{dI}{dt}=-\kappa I + \alpha\gamma S\cdot I\]
As already mentioned, the parameter \(\kappa\) represents the percentage of infected individuals that recover at each time unit and become susceptible again. Parameter \(\alpha\) represents the percentage of contacts that result in an infection and \(\gamma\) is the rate of contact between infected and susceptible people. Having these two last parameters in a model may render the model impossible to fit using simecol, since there are infinitely many combinations of \(\alpha\) and \(\gamma\) that result in the same value for their product \(\alpha\gamma\)³. Besides, in principle one should strive to have as few parameters as possible in a model - all the more so in this case since the two parameters can be considered as one: \(\beta=\alpha\gamma\), which could be taken to represent the percentage of cases from the overall susceptible and infected populations that effectively result in an infection. So the resulting model is:

\[\frac{dS}{dt}=\kappa I - \beta S\cdot I\] \[\frac{dI}{dt}=-\kappa I + \beta S\cdot I\]
At a glance, one thing becomes apparent in this model: the rate of flow into compartment S is the same as the rate of flow out of compartment I, so that at any two given time instants \(S_{t+a}+I_{t+a}=S_t+I_t\). The total number of individuals that are either in compartment S or in compartment I is invariable, in keeping with one of the assumptions laid down for this model.
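Indeed, adding the two differential equations makes this conservation explicit:

\[ \frac{dS}{dt}+\frac{dI}{dt}=(\kappa I-\beta SI)+(-\kappa I+\beta SI)=0\quad\Rightarrow\quad S(t)+I(t)=N \text{ for all } t \]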

Let's try to understand the implications of this assumption intuitively. At one instant, susceptible individuals are becoming infected, but at some later instant, these infected individuals are becoming susceptible again. There is a negative feedback mechanism in place here, because each compartment relies on a greater quantity of individuals in the other compartment to grow in numbers. In other words, the lack of susceptible individuals ensures that the population of infected individuals will start to decrease, which can only mean that there will be more susceptible individuals in the future, as the overall population remains constant. Indeed, the SIS model represents a system that works its way towards equilibrium.

And precisely what is this equilibrium? Can we expect the disease to eventually extinguish itself or, on the contrary, will everyone become infected? Or will the system perhaps work its way towards a certain fixed amount of infected and susceptible individuals? Answering questions of this type is one of the aims of dynamic system modeling. We usually want to determine the emergent behavior of the system that results from the mathematical properties hidden deep within the differential equations.

In the case of this simple SIS model of epidemics, we don't have to dig too deep. A simple mathematical procedure will suffice to determine this equilibrium state analytically. In a state of equilibrium, the state variables do not change, so either of the above derivatives can be set to 0:

\[\kappa I - \beta S\cdot I=0\qquad\rightarrow\] \[\kappa (N-S) - \beta S(N-S)=0\qquad\rightarrow\] \[\beta S^2 - (\kappa+\beta N)S+\kappa N=0\]
The last equation is a simple quadratic equation whose roots are \(S=\kappa/\beta\) and \(S=N\); these are the equilibrium states for the number of susceptible individuals. If the equilibrium state for the susceptible population S is greater than or equal to N, everyone will eventually become susceptible and no one will be infected \(I=N-N=0\): the disease is extinguished. If on the contrary no one recovers from the infection (\(\kappa=0\)), then everyone will eventually be infected and there will be no susceptibles. Otherwise, \(0\leq\kappa/\beta\leq N\) and the number of steady-state susceptibles will be closer to N for large values of \(\kappa\) (high recovery rates) and low values of \(\beta\) (low rate of effective contagious contacts).
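For the record, the factorization behind those two roots is:

\[ \beta S^2-(\kappa+\beta N)S+\kappa N=\beta\left(S-\frac{\kappa}{\beta}\right)\left(S-N\right) \]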

The following R/simecol simulation model, run for three different combinations of the \(\kappa\) and \(\beta\) parameters, serves to validate our simple mathematical analysis. Each simulation run starts with \(I=5\) infected individuals and \(S=95\) susceptibles. But regardless of the initial values, we can see that the steady-state susceptible population reaches \(\kappa/\beta\), or \(N\) if \(\kappa/\beta\gt N\), in all simulation runs.

library(simecol)
N <- 100  #total population size
sis <- odeModel(
  main=function(t,y,parms){
    #SIS equations: dS/dt = k*I - b*S*I ; dI/dt = -k*I + b*S*I
    p <- parms
    dS <- p["k"]*y["I"]-p["b"]*y["S"]*y["I"]
    dI <- -p["k"]*y["I"]+p["b"]*y["S"]*y["I"]
    list(c(dS,dI))
  },
  times=c(from=0,to=20,by=0.01),
  init=c(S=N-5,I=5),      #start with 5 infected and 95 susceptibles
  parms=c(k=1,b=0.003),   #recovery rate k and effective contagion rate b
  solver="lsoda"
)
simsis <- sim(sis)
plot(simsis)

We run the SIS model script using different values for the parameters \(\beta\) (b in the code) and \(\kappa\) (k in the code), to illustrate the steady-state behavior of the system. Notice that the plot method in the last line, when called with a simulated odeModel object, produces an individual plot for each state variable.
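One way to produce the other two runs (a sketch, reusing the sis object through simecol's parms accessor): by the equilibrium analysis above, the steady-state susceptible counts should be min(N, k/b), i.e. 100, 80 and 20 for b = 0.003, 0.0125 and 0.05 with k = 1.

#re-run the model for the other two parameter combinations
for (b in c(0.0125, 0.05)) {
  parms(sis)["b"] <- b   #keep k = 1, change only the contagion parameter
  plot(sim(sis))
}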

SIS model simulation run with k=1, b=0.003

SIS model simulation run with k=1, b=0.0125

SIS model simulation run with k=1, b=0.05

SIR models

The SIR model of disease was first proposed in 1927 by Kermack and McKendrick, hence the alternative denomination of Kermack-McKendrick epidemic model. With this model, researchers sought to answer questions as to why infectious diseases suddenly erupt and expire without leaving everyone infected. In this regard, the Kermack-McKendrick model succeeded and was considered one of the early triumphs of mathematical epidemiology⁴.

I should mention in passing that the basic SIR or Kermack-McKendrick model assumes a disease transmission rate given by the \(\beta SI\) term, a form called mass-action incidence or density-dependent transmission. When the disease transmission rate is \(\beta SI/N\) (\(N\) being the overall population), the disease transmission mechanism implicit in the model is known as standard incidence or frequency-dependent transmission.

Which of these two transmission models to use, and the circumstances under which either one is more appropriate for modeling the onslaught of a disease, is a somewhat controversial issue. It has been generally suggested, however, that the mass-action incidence form is more appropriate for air-borne diseases, for example, where doubling the population also doubles the rate at which the disease is transmitted. On the other hand, sexually transmitted diseases are best modeled by the frequency-dependent transmission form, because transmission depends on the mean frequency of sexual contacts per person, which is basically unrelated to the size of the susceptible population.

In view of these considerations, we could appropriately model short-lived and mass-action incidence diseases by the SIR model given below:

\[\frac{dS}{dt}=-\beta SI\] \[\frac{dI}{dt}=\beta SI - \kappa I\] \[\frac{dR}{dt}=\kappa I\]
We will apply this (mass-action incidence) SIR model to an epidemic event involving a case of influenza in an English boarding school for boys, as reported by anonymous authors to the British Medical Journal in 1978⁵. This influenza outbreak apparently began with a single infected student in a population of 763 resident boys (including that student), spanning a period of 15 days. The data furnished refers only to the number of bedridden patients each day; these will be taken as the infected population numbers⁶.

Day  0  1  2   3   4    5    6    7    8    9   10  11  12  13  14
I    1  3  8  28  75  221  291  255  235  190  125  70  28  12   5

With this data, we're ready to code the R/simecol script for fitting the SIR model:

library(simecol)
#observed data: daily counts of bedridden (infected) boys;
#S and R are only known on day 0
flu_data <- data.frame(
  time=0:14,
  S=c(762,rep(NA,14)),
  I=c(1,3,8,28,75,221,291,255,235,190,125,70,28,12,5),
  R=c(0,rep(NA,14))
)
#weights: only the I series (plus the day-0 values of S and R) enter the fit
flu_weights <- data.frame(
  S=c(1,rep(0,14)),
  I=rep(1,15),
  R=c(1,rep(0,14))
)
#mass-action SIR model
sir <- odeModel(
  main=function(t,y,parms) {
    p <- parms
    dS <- -p["b"]*y["S"]*y["I"]
    dI <- p["b"]*y["S"]*y["I"]-p["k"]*y["I"]
    dR <- p["k"]*y["I"]
    list(c(dS,dI,dR))
  },
  times = c(from=0,to=14,by=0.1),
  init=c(S=762,I=1,R=0),
  parms=c(k=0.5,b=0.002),   #initial guesses for the fit
  solver="lsoda"
)
#fit k and b to the observed data by weighted least squares
result_fit <- fitOdeModel(
  sir,
  obstime=flu_data$time,
  yobs=flu_data[2:4],
  fn=ssqOdeModel,
  weights=flu_weights,
  method="PORT",
  lower=c(k=0,b=0))
result_fit
#simulate the fitted model over a slightly longer horizon and plot it
parms(sir) <- result_fit$par
times(sir) <- c(from=0,to=20,by=0.1)
sir <- sim(sir)
matplot(
  x=sir@out,
  main="",
  xlab = "Day",
  ylab = "Population",
  type = "l", lty = c("solid", "solid","solid"), 
  col = c("blue","red","darkgreen"))
points(flu_data$time,flu_data$I,col="red",pch=19)

Influenza outbreak in an English boarding school (1978)
The results of the simulation of the influenza outbreak indicate that over 20 students remained susceptible and everyone else was infected. Epidemic diseases do not always affect the entire population; whether they do or not depends on the transmission and recovery rates. In fact, looking at the differential equation for the I compartment, we can easily see that \(dI/dt\) is 0 only when \(I=0\), which is the trivial case when there are no longer any infected individuals to infect anyone else and the epidemic has died out, or when \(S=\kappa/\beta\).

This special value \(\kappa/\beta\) is known as the threshold value of the disease because when \(S\) reaches this threshold value, the disease begins to recede (the infected population is decreasing). If at the outset of the epidemic the susceptible population number is less than the threshold value, the disease will not invade. One can think of this intuitively as the point when the infectious disease "runs out of fuel", because the infected patients are recovering faster than the rate at which the susceptibles are contracting the disease.
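For this outbreak, the threshold can be read straight off the fitted parameters (a quick sketch; the numeric value depends on the fit):

#threshold number of susceptibles k/b, to be compared with the initial S = 762
result_fit$par["k"] / result_fit$par["b"]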

If you play around with the value of \(\kappa\), you will see that if you increase this value, fewer individuals will be infected, and the contrary if you decrease \(\kappa\). This makes sense, as the larger the percentage of the infected population that recovers at each time period, the shorter the duration of the disease. What this means is that a fatal disease is much more dangerous if it kills slowly, and those fatal diseases that kill (remove) their hosts quickly are likely to have a shorter life themselves⁷.
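One quick way to experiment with this using the already-fitted model (a sketch; the factor of 1.5 is arbitrary):

parms(sir)["k"] <- 1.5 * parms(sir)["k"]   #raise the recovery rate by 50%
plot(sim(sir))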

Notes

  1. There are indeed mortal diseases from which individuals do not recover. Still, recovered and deceased individuals should be dealt with differently in the model (and in real life), since the purpose of the model is to study the occurrence of these two very different events.
  2. See BLOMHØJ et al.
  3. We hit upon the topic of parameter identifiability, which will be dealt with in a future post.
  4. See BEREC, p. 12.
  5. The data can be found in SULSKY.
  6. See CLARK.
  7. See "SIR Model of Epidemics" (TASSIER, pp. 5-6).

Bibliography



If you found this post interesting or useful, please share it on Google+, Facebook or Twitter so that others may find it too.