Wednesday, August 31, 2016

Simulation models of epidemics using R and simecol

In this post we'll dip our toes into the waters of epidemiological dynamics models, using R and simecol, as we have done in the previous two posts of this series. These models of epidemics are interesting in that they introduce us to a more general class of models called compartment models, commonly used in the study of biological systems. This "compartment point of view" will prove to be a useful tool in modeling, as we shall see in future posts. As usual, after a brief theoretical and mathematical rundown of the various types of epidemic models, we shall fit one of those models to some data using R's simecol package.



Infectious or Epidemic Processes

When studying the spread of infectious diseases, we must take into account the possible states of a host with respect to the disease:
  • Susceptible (S) These are the individuals susceptible to contracting the disease if they're exposed to it.
  • Exposed (E) These individuals have contracted the disease but are not yet infectious, thus still not able to pass the disease unto the susceptible (S) group.
  • Infectious (I) After having been exposed, these individuals are now able to pass the disease unto other individuals in the S group. This group should not be confused with the infected group of individuals, which comprises those that are either exposed (E) or infectious (I).
  • Recovered (R) Having gone through the previous phases of the disease, individuals recover because they have developed an immunity to the disease (permanent or temporary). This group is neither susceptible to the disease nor infectious1.
Not all epidemiological models will include all four of the above classes or groups, and some models of greater sophistication may include more (Berec, 2010). For example, there may be diseases for which it is useful to consider a group of chronic carriers. Moreover, mothers who were previously infected with a disease and developed antibodies pass these to their newborn infants through the placenta. These newborn infants are temporarily immune to the disease, but later move into the susceptible (S) group after the maternal antibodies disappear. They would then constitute a group of passively immune (M) individuals. On the other hand, it is possible that these groups may not be as homogeneous as one may think. Different social, cultural, demographic or geographical factors may affect or be related to the mode of transmission, rate of contact, infection control, and overall genetic susceptibility and resistance.

Based on the selection or omission of these classes or compartments, which in itself implies some assumptions about the characteristics of the specific disease we're trying to model, there are several acronyms used to name these models. For example, in an SI model, also known as a simple epidemic model, the host never recovers. An SIS model is one in which the susceptible population becomes infected and then, after recovering from the disease and its symptoms, becomes susceptible again. The SIS model implies that the host population never becomes immune to the disease. An SEIR model is one in which there is an incubation period: susceptible individuals first become exposed (but not yet infectious), later enter the infectious group once the disease has incubated, and finally enter the R group when they cease to be infectious and develop immunity. An SIR model is basically the same as the SEIR model, but without an incubation period, and so on.

Compartment models

We can see from the last paragraph on different epidemic models that these attempt to describe how the individuals in a population leave one group and enter another. This is characteristic of compartment, or "box" models, in which individuals in a population are classified at any given time into different compartments or boxes according to the state in which they find themselves. Compartment models are commonly used to study biological or chemical systems in which the main interest is the transport or change of state of materials or substances over time2.

In the process of dynamic model building, it is often useful to think of the state variables as boxes or compartments that "hold" a certain amount of substance at any given time, even though "substance" or "material" may refer to discrete entities such as individuals. In fact, because system dynamics models are governed by differential equations, modeling discrete entities through quantities that vary continuously is inevitable. Notwithstanding, thinking in terms of compartments is useful because it forces us to reflect on the mechanisms that affect how individuals enter or leave a compartment. For example, in the Lanchester model of the Battle of Iwo Jima there were two compartments: American troops and Japanese troops. In that case, individuals would not leave one compartment to enter another, because we were not contemplating soldiers defecting from their original army to fight for the other side. In the epidemic models, however, we do contemplate individuals leaving one compartment and entering another as they progress through the various stages of the disease. The utility of the "compartment point of view" in modeling lies in the principle that if there are groups of entities whose dynamic behavior differs from that of other groups, even though they are apparently the same type of entity, each should have its own compartment. We shall explore this idea in more detail when we cover population growth models in a future post in this blog.

SIS models

SIS models of epidemics comprise two compartments: S (susceptible) and I (infected). In what follows we will assume that:

  1. There is a finite population \(N\) that remains constant at all times.
  2. The epidemic cycle is sufficiently short to assume that births, deaths, immigration/emigration or any other events that modify the population do not occur.
  3. Individuals transition from being susceptible to the disease, then to being infected, and back again to being susceptible as they recover. The disease is such that no permanent or temporary immunity is developed, and there are no mortalities.
  4. The number of infections of susceptible individuals that occur is proportional to the number of contacts between infected and susceptible individuals.
  5. At each time unit, a certain percentage \(\kappa\) of infected individuals recover from the disease and become susceptible again. This amounts to stating that the mean duration of the infected/infectious period is \(1/\kappa\) time units.
This is a very simple model and some of the above assumptions may be unrealistic or inapplicable to all SIS epidemic processes, but it is nevertheless a useful model for illustrating the emergent behavior of some systems. These assumptions are usually expressed in the literature in differential equation form as follows:

\[\frac{dS}{dt}=\kappa I - \alpha\gamma S\cdot I\] \[\frac{dI}{dt}=-\kappa I + \alpha\gamma S\cdot I\]
As already mentioned, the parameter \(\kappa\) represents the percentage of infected individuals that recover at each time unit and become susceptible again. Parameter \(\alpha\) represents the percentage of contacts that result in an infection and \(\gamma\) is the rate of contact between infected and susceptible people. Having these two last parameters in a model may render the model impossible to fit using simecol, since there are infinitely many combinations of \(\alpha\) and \(\gamma\) that result in the same value for their product \(\alpha\gamma\)3. Besides, in principle one should strive to have as few parameters as possible in a model - all the more so in this case, since the two parameters can be combined into one: \(\beta=\alpha\gamma\), which can be taken to represent the percentage of pairings between the susceptible and infected populations that effectively result in an infection. So the resulting model is:

\[\frac{dS}{dt}=\kappa I - \beta S\cdot I\] \[\frac{dI}{dt}=-\kappa I + \beta S\cdot I\]
At a glance, one thing becomes apparent in this model: the rate of flow into compartment S is the same as the rate of flow out of compartment I, so that at any two given time instants \(S_{t+a}+I_{t+a}=S_t+I_t\). The total number of individuals that are either in compartment S or in compartment I is invariable, in keeping with one of the assumptions laid down for this model.

Let's try to understand the implications of this assumption intuitively. At one instant, susceptible individuals are becoming infected, but at some later instant, those infected individuals are becoming susceptible again. There is a negative feedback mechanism in place here, because each compartment relies on a greater quantity of individuals in the other compartment to grow in numbers. In other words, the lack of susceptible individuals ensures that the population of infected individuals will start to decrease, which can only mean that there will be more susceptible individuals in the future, as the overall population remains constant. Indeed, the SIS model represents a system that works its way towards equilibrium.

And precisely what is this equilibrium? Can we expect the disease to eventually extinguish itself or, on the contrary, will everyone become infected? Or will the system perhaps work its way towards a certain fixed amount of infected and susceptible individuals? Answering questions of this type is one of the aims of dynamic system modeling. We usually want to determine the emergent behavior of the system that results from the mathematical properties hidden deep within the differential equations.

In the case of this simple SIS model of epidemics, we don't have to dig too deep. A simple mathematical procedure will suffice to determine this equilibrium state analytically. In a state of equilibrium, the state variables do not change, so either of the above derivatives can be set to 0:

\[\kappa I - \beta S\cdot I=0\qquad\rightarrow\] \[\kappa (N-S) - \beta S(N-S)=0\qquad\rightarrow\] \[\beta S^2 - (\kappa+\beta N)S+\kappa N=0\]
The last equation is a simple quadratic equation whose roots are \(S=\kappa/\beta\) and \(S=N\); these are the equilibrium states for the number of susceptible individuals. If the equilibrium state \(\kappa/\beta\) for the susceptible population is greater than or equal to N, everyone will eventually become susceptible and no one will be infected (\(I=N-N=0\)): the disease is extinguished. If on the contrary no one recovers from the infection (\(\kappa=0\)), then everyone will eventually be infected and there will be no susceptibles. Otherwise, \(0\leq\kappa/\beta\leq N\) and the number of steady-state susceptibles will be closer to N for large values of \(\kappa\) (high recovery rates) and low values of \(\beta\) (low rate of effective contagious contacts).
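We can double-check this bit of algebra numerically with base R's polyroot function, using the parameter values of the first simulation run (\(N=100\), \(\kappa=1\), \(\beta=0.003\)):

```r
# Roots of beta*S^2 - (kappa + beta*N)*S + kappa*N = 0, found numerically.
# polyroot() takes the coefficients in increasing order of powers.
N <- 100; kappa <- 1; beta <- 0.003
roots <- sort(Re(polyroot(c(kappa * N, -(kappa + beta * N), beta))))
roots  # 100 and 333.33..., i.e. N and kappa/beta, as derived above
```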

The following R/simecol simulation model, run for three different combinations of the \(\kappa\) and \(\beta\) parameters, serves to validate our simple mathematical analysis. Each simulation run starts with \(I=5\) infected individuals and \(S=95\) susceptibles. But regardless of the initial values, we can see that the steady-state susceptible population reaches \(\kappa/\beta\) (or \(N\), if \(\kappa/\beta\gt N\)) in all simulation runs.

library(simecol)
N <- 100  #total population size
sis <- odeModel(
  main=function(t,y,parms){
    p <- parms
    dS <- p["k"]*y["I"]-p["b"]*y["S"]*y["I"]
    dI <- -p["k"]*y["I"]+p["b"]*y["S"]*y["I"]
    list(c(dS,dI))
  },
  times=c(from=0,to=20,by=0.01),
  init=c(S=N-5,I=5),
  parms=c(k=1,b=0.003),
  solver="lsoda"
)
simsis <- sim(sis)
plot(simsis)

We run the SIS model script using different values for the parameters \(\beta\) (b in the code) and \(\kappa\) (k in the code) to illustrate the steady-state behavior of the system. Notice that the plot method in the last line, when called with an odeModel object, produces an individual plot for each state variable.
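If you'd rather verify the steady states without simecol at all, a hand-rolled explicit-Euler loop in base R (a rough sketch, not the lsoda-based simulation above) arrives at the same values:

```r
# Explicit-Euler integration of the SIS equations in plain base R, used only
# to check that S settles near min(N, k/b) for each parameter combination.
sis_steady_S <- function(k, b, N = 100, S0 = 95, I0 = 5,
                         dt = 0.01, t_end = 50) {
  S <- S0; I <- I0
  for (step in seq_len(round(t_end / dt))) {
    dS <- k * I - b * S * I    # recoveries in, new infections out
    dI <- -k * I + b * S * I   # the mirror image: flows between compartments
    S <- S + dS * dt
    I <- I + dI * dt
  }
  S
}
sapply(c(0.003, 0.0125, 0.05), function(b) sis_steady_S(k = 1, b = b))
# approaches c(100, 80, 20): N in the first case, k/b in the other two
```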

SIS model with k=1, b=0.003

SIS model with k=1, b=0.0125

SIS model with k=1, b=0.05

SIR models

The SIR model of disease was first proposed in 1927 by Kermack and McKendrick, hence the alternative denomination of Kermack-McKendrick epidemic model. With this model, researchers sought to answer questions as to why infectious diseases suddenly erupt and expire without leaving everyone infected. In this regard, the Kermack-McKendrick model succeeded and was considered one of the early triumphs of mathematical epidemiology4.

I should mention in passing that the basic SIR or Kermack-McKendrick model assumes a disease transmission rate given by the \(\beta SI\) term, a form called mass-action incidence or density-dependent transmission. When the disease transmission rate is \(\beta SI/N\) (\(N\) being the overall population), the disease transmission mechanism implicit in the model is known as standard incidence or frequency-dependent transmission.

The question of which disease transmission model to use, and the circumstances under which either one is more appropriate for modeling the onslaught of a disease, is a somewhat controversial issue. It has been generally suggested, however, that the mass-action incidence form is more appropriate for airborne diseases, for example, where doubling the population also doubles the rate at which the disease is transmitted. On the other hand, sexually transmitted diseases are best modeled by the frequency-dependent transmission form, because transmission depends on the mean frequency of sexual contacts per person, which is basically unrelated to the size of the susceptible population.

In view of these considerations, we could appropriately model short-lived and mass-action incidence diseases by the SIR model given below:

\[\frac{dS}{dt}=-\beta SI\] \[\frac{dI}{dt}=\beta SI - \kappa I\] \[\frac{dR}{dt}=\kappa I\]
We will apply this (mass-action incidence) SIR model to an epidemic event involving a case of influenza in an English boarding school for boys, as reported by anonymous authors to the British Medical Journal in 19785. This influenza outbreak apparently began with a single infected student in a population of 763 resident boys (including that student), spanning a period of 15 days. The data furnished refers only to the number of bedridden patients each day; these will be taken as the infected population numbers6.

Day 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
I 1 3 8 28 75 221 291 255 235 190 125 70 28 12 5

With this data, we're ready to code the R/simecol script for fitting the SIR model:

library(simecol)
flu_data <- data.frame(
  time=0:14,
  S=c(762,rep(NA,14)),
  I=c(1,3,8,28,75,221,291,255,235,190,125,70,28,12,5),
  R=c(0,rep(NA,14))
)
flu_weights <- data.frame(
  S=c(1,rep(0,14)),
  I=rep(1,15),
  R=c(1,rep(0,14))
)
sir <- odeModel(
  main=function(t,y,parms) {
    p <- parms
    dS <- -p["b"]*y["S"]*y["I"]
    dI <- p["b"]*y["S"]*y["I"]-p["k"]*y["I"]
    dR <- p["k"]*y["I"]
    list(c(dS,dI,dR))
  },
  times = c(from=0,to=14,by=0.1),
  init=c(S=762,I=1,R=0),
  parms=c(k=0.5,b=0.002),
  solver="lsoda"
)
result_fit <- fitOdeModel(
  sir,
  obstime=flu_data$time,
  yobs=flu_data[2:4],
  fn=ssqOdeModel,
  weights=flu_weights,
  method="PORT",
  lower=c(k=0,b=0))
result_fit
parms(sir) <- result_fit$par
times(sir) <- c(from=0,to=20,by=0.1)
sir <- sim(sir)
o <- out(sir)
matplot(
  x=o$time,
  y=o[2:4],
  main="",
  xlab = "Day",
  ylab = "Population",
  type = "l", lty = c("solid", "solid","solid"),
  col = c("blue","red","darkgreen"))
points(flu_data$time,flu_data$I,col="red",pch=19)

Influenza outbreak in an English boarding school (1978)

The results of the simulation of the influenza outbreak indicate that over 20 students remained susceptible and everyone else was infected. Epidemic diseases do not always affect the entire population; whether they do or not depends on the transmission and recovery rates. In fact, looking at the differential equation for the I compartment, we can factor it as \(dI/dt=I(\beta S-\kappa)\) and easily see that \(dI/dt\) is 0 only when \(I=0\), which is the trivial case in which there are no longer infected individuals to infect anyone else and the epidemic has died out, or when \(S=\kappa/\beta\).

This special value \(\kappa/\beta\) is known as the threshold value of the disease, because when \(S\) falls to this threshold value, the disease begins to recede (the infected population is decreasing). If at the outset of the epidemic the susceptible population number is less than the threshold value, the disease will not invade. One can think of this intuitively as the point when the infectious disease "runs out of fuel", because the infected patients are recovering faster than the rate at which the susceptibles are contracting the disease.
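The threshold behavior is easy to verify numerically. With illustrative round-number parameter values (chosen for the example, not the fitted ones), \(dI/dt\) is positive above the threshold and negative below it:

```r
# Sign of dI/dt = beta*S*I - kappa*I around the threshold S = kappa/beta
k <- 0.5; b <- 0.002; I <- 10
threshold <- k / b                  # 250 susceptibles
dI <- function(S) b * S * I - k * I
c(dI(threshold + 50), dI(threshold), dI(threshold - 50))
# positive above the threshold, (numerically) zero at it, negative below
```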

If you play around with the value of \(\kappa\), you will see that increasing it leaves fewer individuals infected, and the contrary happens if you decrease \(\kappa\). This makes sense: the larger the percentage of the infected population that recovers at each time period, the shorter the duration of the epidemic. What this means is that a fatal disease is much more dangerous if it kills slowly, and that fatal diseases that kill (remove) their hosts quickly are likely to have a shorter life themselves7.

Notes

  1. There are indeed fatal diseases from which individuals do not recover. Still, recovered and deceased individuals should be dealt with differently in the model (and in real life), since the purpose of the model is to study the occurrence of these two very different events.
  2. See BLOMHØJ et al.
  3. We hit upon the topic of parameter identifiability, which will be dealt with in a future post.
  4. See BEREC, p. 12.
  5. The data can be found in SULSKY.
  6. See CLARK.
  7. See "SIR Model of Epidemics" (TASSIER, pp. 5-6).

Bibliography




Thursday, August 18, 2016

Solving mathematical problems (or, what do I look for in an answer as a tutor?)

In this post I will address a very common error that shows up in exam answers semester after semester. This error appears in several of the courses I tutor, whether Mathematics I, II or III or the statistics and probability courses, although the objectives on statistical inference in this last group of courses deserve a separate discussion that I will give in another post. Straight to the point: the error is the student's failure to ask whether they have actually answered the question in the problem statement before giving their answer as final. This is an error that can cost them the objective, as I will explain next.

domingo, 14 de agosto de 2016

How to simulate a bouncing ball in R - more on simecol events and external inputs


In this post I'll clarify the incorporation of external inputs in the dynamic model specification using simecol, again reviewing the relevant parts of the simecol simulation model for the Battle of Iwo Jima that I presented in the last post. I will also give an alternate specification for this model and compare both ways of specifying external inputs in view of execution times.

For the second part of this post, I'll be using another example - a really simple one of a bouncing ball - to illustrate how to use events in your dynamic system models using simecol.



Iwo Jima revisited - how to specify external inputs for dynamic simecol models

You may recall the last post in which we implemented a model for the battle of Iwo Jima between the American (A) and Japanese (J) armies based on the following Lanchester equation:

\[\begin{align*}\dfrac{dA}{dt} &= f(t) - j \cdot J \\ \dfrac{dJ}{dt} &= -a \cdot A \end{align*} \]
The external inputs function, representing the number of American reinforcements and the days when they were received, was the following:

\[f(t) = \left\{ \begin{array}{cl} 54000 & 0 \leq t \lt 1 \\ 0 & 1 \leq t \lt 2 \\ 6000 & 2 \leq t \lt 3 \\ 0 & 3 \leq t \lt 5 \\ 13000 & 5 \leq t \lt 6 \\ 0 & 6 \leq t \leq 36 \end{array} \right.\]
One aspect that would often confuse my students when I gave this class is the following: given the external inputs function above and the differential equation describing the change of American troops over time, does this mean, for example, that in each time instant from \(t=0\) to \(t \lt 1\) (there could be a great many of those), the Americans were receiving 54000 troops? Obviously not, but that question is answered if you look at the left-hand side of the differential equation describing the change in American troops over time. For example, during the first day (\(t\in[0,1)\)), the change in the number of American troops at each infinitesimally small time step would be \(dA=\left(54000 - j\cdot J\right)\cdot dt\). Notice that while the 54000 increment in troops within the parentheses in the right-hand expression looks like an awful lot, it becomes a small increment when multiplied by an infinitesimally small time step \(dt\).
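To make this concrete, here is the arithmetic for a single Euler-style step of size \(dt=0.01\) during the first day, using the initial Japanese troop count and the \(j\) parameter from the model below:

```r
# One small step of the American-troops equation during day one:
# dA = (f(t) - j*J) * dt, with f(t) = 54000, j = 0.0577, J = 21500
j <- 0.0577; J <- 21500; dt <- 0.01
dA <- (54000 - j * J) * dt
dA  # about 527.6 troops per step, not 54000 per step
```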

We specified this external inputs function in the "inputs" slot of the iwojima odeModel, which I transcribe here once more for review:

iwojima1 <- new("odeModel",
    main = function (time, init, parms, ...) {
        x <- init
        p <- parms
        f <- approxTime1(inputs, time, rule = 2)["reinforcements"]
        dame <- f - p["j"] * x["jap"]
        djap <- - p["a"] * x["ame"]
        list(c(dame, djap))},
    parms = c(j=0.0577, a=0.0106),
    times = c(from=0, to=36, by=0.01),
    init = c(ame=0, jap=21500),
    inputs = as.matrix(data.frame(
                day = c(0, 0.999, 1, 1.999, 2, 2.999, 3, 4.999, 
                        5, 5.999, 6, 36),
                reinforcements = c(54000, 54000, 0, 0, 6000, 6000, 0, 0, 
                            13000, 13000, 0, 0))),
    solver = "lsoda") 

In the code above, the inputs slot contains a matrix built from a data frame with two columns: "day" and "reinforcements". We can see that this somehow maps to the reinforcements function of the iwojima model, where, for example, we had 54000 reinforcements coming in from \(t=0\) to \(t=1\), but why do we include the function value at time instants like 0.999, 1.999, 2.999, 4.999 and 5.999?

The reason for doing this is that we shall linearly interpolate this "function" (which is not a function, but a data frame really). So anywhere from \(t=0\) to \(t=0.999\), the function's value will be \(f(t)=54000\), but at \(t=0.9995\), for example, the value would be halfway between 54000 and 0, that is, \(f(0.9995)=27000\). Thereafter, it quickly drops to 0, since there were no reinforcements for the next day.
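This interpolation can be reproduced with base R's approx() function, which is essentially what approxTime1 does for a single time point:

```r
# Linear interpolation of the reinforcements "function", as simecol sees it
day <- c(0, 0.999, 1, 1.999, 2, 2.999, 3, 4.999, 5, 5.999, 6, 36)
reinforcements <- c(54000, 54000, 0, 0, 6000, 6000, 0, 0, 13000, 13000, 0, 0)
approx(day, reinforcements, xout = c(0.5, 0.999, 0.9995, 1), rule = 2)$y
# 54000, 54000, 27000 (halfway down the ramp), 0
```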

The data frame is linearly interpolated by the approxTime1 call in the model's main function. This simecol function takes as arguments the inputs data frame and the time (the time instant as per the simulation clock). The rule argument is 2 because if time falls outside the two extreme time instants considered, we want the value at the closest time extreme (see the help file for approxTime1), although this is not strictly necessary, since we won't be running the simulation past day 36. By the way, if we look at the help file for approxTime1, we read that although this function is less flexible than approxTime, as it takes only a single value and not a vector for the xout argument, it is more efficient in terms of execution time if the x argument is a data frame.

We could alternatively specify the iwojima model as follows:

iwojima2 <- new("odeModel",
    main = function (time, init, parms, ...) {
        x <- init
        p <- parms
        dame <- f(time) - p["j"] * x["jap"]
        djap <- - p["a"] * x["ame"]
        list(c(dame, djap)) },
    equations = list(
        f = function(t) {
            if (t<1) return(54000)
            else
              if (t<2) return(0)
              else
                if (t<3) return(6000)
                else
                  if (t<5) return(0)
                  else
                    if (t<6) return(13000)
                    else return(0)}), 
    parms = c(j=0.0577, a=0.0106),
    times = c(from=0, to=36, by=0.01),
    init = c(ame=0, jap=21500),
    solver="lsoda")

In this second version of the iwojima odeModel object, we include the external inputs function in the equations slot and not in the inputs slot as we had previously done. This second version, although it has more lines of code, is easier to grasp intuitively, since the reinforcements function is defined as we would normally define a step-wise function in R, and not by some roundabout process of defining a data frame with weird time-instant specifications that we later interpolate linearly. However, the crucial point is not whether the code looks prettier, but which model specification is more time-efficient in execution. The reason we are concerned about this is that fitting the model's parameters involves repeated simulations of our model, carried out by the simecol fitOdeModel function. Using the following code we clock both fittings:

#we read the historical data...
obs <- read.csv2("iwo_jima.csv")
obsv <- which(!(is.na(obs$jap)))
#the weightdf dataframe assigns equal weight to all observations,
#except for the NA values in the japanese column (those get 0 weight)
weightdf <- data.frame(ame=rep(1, nrow(obs)), jap = rep(0, nrow(obs)))
weightdf[1,"jap"] <- 1
weightdf[nrow(obs),"jap"] <- 1
#and now we fit the model parameters with simecol's fitOdeModel
#including our weightdf as weights makes the PORT routine faster
system.time(
  result_fit1 <- fitOdeModel(iwojima1,
    obstime=obs$time,
    yobs=obs[2:3],
    fn=ssqOdeModel,
    weights=weightdf,
    method="PORT",
    scale=c(1/0.1, 1/0.01),
    lower=c(j=0,a=0))
)
system.time(
  result_fit2 <- fitOdeModel(iwojima2,
    obstime=obs$time,
    yobs=obs[2:3],
    fn=ssqOdeModel,
    weights=weightdf,
    method="PORT",
    scale=c(1/0.1, 1/0.01),
    lower=c(j=0,a=0))
)

The execution time for fitting the iwojima1 model (using the interpolated inputs data frame) is 173.328 seconds on my CPU, while that of fitting the iwojima2 model (using a stepwise function definition in the equations slot) is ... 24.312 seconds! So in this particular case, pretty and efficient coincide1. Another surprising result is that the fits were not quite the same. For the iwojima1 model, the sum of squares is 0.05121009, while for the iwojima2 model, it was slightly higher at 0.05363194. I don't know if this is always the case (using the interpolated inputs data frame results in a more accurate fit to the data). More testing would have to be done to see if the gain in goodness of fit is significant and if it justifies the much slower execution time. But then again, we have not taken into consideration the problem of overfitting to the observed data2, which would further complicate this discussion.

Using simecol to simulate a bouncing ball

The reason we're using a simecol simulation of a bouncing ball as an example is to illustrate another characteristic of odeModel objects in simecol - simulation events, which occur whenever the value of a state variable changes abruptly in a manner not stipulated by the differential equations dictating the internal dynamics of the system. In a certain sense, the external inputs function considered previously can be thought of as an event. However, in the case of the bouncing ball, we have another type of event that occurs when the ball hits the ground and bounces back up, thereby changing the sign of its velocity. Of course, for added realism, we not only change the sign of the velocity, but at each bounce, we multiply the velocity by 0.9 to account for the loss of kinetic energy due to friction3, so that eventually the ball will settle on the ground. We illustrate how this is done in the following code:

#------------------------------------------------------------------
# Bouncing ball simulation using simecol
# Thomas Petzoldt - 04/09/2010
#------------------------------------------------------------------

library(simecol)
ballode <- new("odeModel",
    main = function (t, y, parms) {
        dy1 <- y[2]
        dy2 <- -9.8
        list(c(dy1, dy2))   },
    parms = 0,
    times = seq(0, 37.5, 0.25),
    init = c(height = 20, v = 0),
    solver = function(t, y, func, parms) {
        root <- function(t, y, parms) y[1]
        event <- function(t, y, parms) {
            y[1] <- 0
            y[2] <- -0.9 * y[2]
            return(y)   }
      lsodar(t, y, func, parms = NULL,events = list(func = event, 
        root = TRUE), rootfun = root)}
)

ballode <- sim(ballode)

# Animation
o <- out(ballode)
png("ball%04d.png", width=300, height=300)
for (i in 1:length(times(ballode)))
    plot(1, o[i, 2], main="", ylab="", xlab="", xaxt="n",
       col="red", cex=2, pch=16, ylim=c(0,20))
graphics.off()
#Convert the *.png image sequence into an animate .gif file
#(Note: requires the ImageMagick command line programs)
system("convert -delay 5 *.png bouncing_ball.gif")

I would first like to comment that while we apparently have only one state variable (the height), with the rest being just the first and second derivatives of that variable (the velocity and acceleration of the ball, respectively), in the odeModel specification we work with two state variables: the height and the vertical velocity. You can see in the code that the height increment (dy1) is the velocity, while the velocity increment (dy2) is the acceleration, a constant equal to \(-9.8\,\tfrac{m}{s^2}\) (negative because gravity pulls the ball downward).

The new action occurs within the solver slot specification of the odeModel. Instead of using rk4, lsoda or some such predefined integrator function in that slot, we define our own function to include a root function that is triggered when an event occurs. You see, the event "ball hits the ground" can be characterised mathematically as the point at which the height state variable equals zero4, so the root function is simply the height state variable itself. In the event function, we set the height to zero and change the direction of the bouncing ball by multiplying the velocity by a negative number (deferring discussion of the 0.9 factor until later in this post). Finally, we use lsodar as our solver, which is essentially the same lsoda integration routine with added capabilities for dealing with events given by root functions. In the lsodar function call, we pass our root and event functions as a list. We could add more events (for example, if we had a ceiling and had to deal with the ball bouncing off it) by simply adding more root/event function pairs to that list.
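The root-and-event mechanism can also be mimicked by hand in a plain Euler loop (a rough base-R sketch, not the lsodar machinery): integrate height and velocity, and whenever the height crosses zero, fire the "event" manually:

```r
# Hand-rolled version of the root/event idea: detect the zero crossing of
# the height and apply the event (reset height, damp and reverse velocity).
h <- 20; v <- 0; dt <- 0.001; bounces <- 0
for (i in seq_len(10000)) {      # simulate 10 seconds
  h <- h + v * dt
  v <- v - 9.8 * dt
  if (h < 0) {                   # root condition: height crossed zero
    h <- 0                       # event: put the ball back on the ground...
    v <- -0.9 * v                # ...and reverse the (damped) velocity
    bounces <- bounces + 1
  }
}
bounces  # three bounces occur within the first 10 seconds
```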

And now, Let's Get Physical5

This bouncing ball example is a perfect pretext to use R/simecol simulations in the physics classroom. Imagine we had no knowledge of Newtonian physics but a fairly good grasp of the concepts of velocity and position as variables in a physical model. Instead of presenting the simulation model directly, we could first raise a discussion about the different state variables and the differential equations describing their behavior, after which we would arrive at something like the velocity/acceleration model above. Instead of using the predetermined 9.8 constant as the acceleration, we could infer it ourselves from observational data by using the fitOdeModel simecol function, but first, we'd have to get some data...


At this point, I'd invite a student who's good at video editing to make a movie of a bouncing ball side-by-side with a ruler to measure its height. I would ask that student to obtain a sample of png frames from that video - you don't need all the frames, just some. If the filenames of those png frames contain numbers identifying the frame number in the movie, then obtaining the time instants is as easy as dividing the frame number by 30 (if the movie runs at 30 frames per second, for example). To obtain the height measurements, I would ask another student to visually check the height of the ball against the ruler in each frame. Tabulating these values in a column for time (in seconds) and a column for height (converting the measurements into meters to keep consistent with the MKS measurement system) would give us our data frame of observations against which we would fit the model.
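To make that bookkeeping concrete, here is a hypothetical sketch of the tabulation step in base R; the filenames, the 30 fps rate and the height readings are all made up for illustration:

```r
# Hypothetical sketch: turn frame filenames into the observation data frame.
# Filenames, frame rate and heights are invented for illustration.
frames  <- c("ball_0001.png", "ball_0013.png", "ball_0027.png", "ball_0040.png")
fps     <- 30
frame_n <- as.numeric(gsub("\\D", "", frames))  # pull the frame number out of each name

obs <- data.frame(
  time   = frame_n / fps,                        # seconds
  height = c(1.00, 0.62, 0.05, 0.47)             # metres, read off the ruler by eye
)
```

The resulting `obs` data frame is exactly the shape that simecol's fitting functions expect: one column of times and one column of observed values.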

We could then adjust the gravitational acceleration constant and verify if it's really close to 9.8. While we're at it, we could also include a parameter for the 0.9 value corresponding to the bounce-back kinetic energy loss and set it initially to 1, to verify what its real value is. In physics, this constant would correspond to the elasticity of the ball and the amount of ball surface-area that touches the ground when the ball is deformed while bouncing. Most of the kinetic-energy loss, I would point out, is due to friction, and energy lost to friction dissipates as heat, but we could have some discussion in class about that.
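Without reproducing the full fitOdeModel machinery, the core of the fitting idea can be sketched in a few lines of base R: least squares over the first free-fall arc, here with synthetic data (generated with a true \(g = 9.8\) plus a little noise) standing in for the students' measurements:

```r
# Sketch of recovering g by least squares on the first free-fall arc.
# The "observations" are synthetic, standing in for student measurements.
set.seed(1)
t_obs <- seq(0, 0.4, by = 0.05)
h_obs <- 1 - 0.5 * 9.8 * t_obs^2 + rnorm(length(t_obs), sd = 0.005)

# Sum of squared errors between observed and predicted heights, as a function of g
sse <- function(g) sum((h_obs - (1 - 0.5 * g * t_obs^2))^2)
fit <- optimize(sse, interval = c(5, 15))
fit$minimum    # should land close to 9.8
```

fitOdeModel does essentially this, except the "prediction" comes from numerically integrating the full ODE model rather than from a closed-form parabola.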

If we wanted to get more sophisticated in our modeling, we could also include a component to model the air drag on the ball in its trajectory up and down. This would be a realistic addition to the model, as our bouncing ball was most likely not filmed in a vacuum. All of these additions to the model would entail re-fitting it to the data obtained from the film. At each new fit, we could observe the goodness-of-fit sum-of-squares values to see what accuracy we gained in our model.
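As a sketch of what that addition might look like, here is a derivative function with a quadratic drag term, \(\frac{dv}{dt} = -g - k\,v\,|v|\); the drag parameter \(k\) and its value below are illustrative and would in practice be fitted from the film data:

```r
# Sketch: velocity equation extended with quadratic air drag.
# The drag coefficient k is a made-up illustrative parameter to be fitted.
ball_drag <- function(t, y, parms) {
  g  <- 9.8
  k  <- parms[["k"]]
  dh <- y[2]
  dv <- -g - k * y[2] * abs(y[2])  # drag always opposes the direction of motion
  list(c(dh, dv))
}

# Example: a ball falling at 2 m/s with k = 0.05 decelerates slightly less than g
ball_drag(0, c(1, -2), c(k = 0.05))[[1]][2]   # → -9.6
```

Writing the drag as \(v\,|v|\) rather than \(v^2\) keeps the force opposing the motion on both the way up and the way down.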

If I were a physics teacher (which I'm not), this is the sort of thing I'd like to do in class. If system dynamics modeling principles were taught across the curriculum, students could apply them to the study of physics, biology, social studies and other topics to deduce and re-invent for themselves all these laws and principles that they would otherwise have to memorize. This, no doubt, would make the learning more significant and meaningful. The great Isaac Newton, whom we could probably consider the great-grandfather of System Dynamics, is quoted as saying that if he could see farther than others, it was because he stood on the shoulders of giants. What fun Newton would have had if he had a computer with R and simecol to play with! Then again, maybe it's better that way, because he wouldn't have invented Differential Calculus6.

Notes

  1. As a general rule, more elegant and nicer looking code is more efficient in terms of execution time.
  2. Overfitting is an important topic in the machine learning context. When we fit a model to observed data, we want the in-sample error to be as small as possible. One way to achieve this is by having more complex models incorporating a greater number of parameters. However, there's a penalty to pay for this, because as we incorporate more parameters into our model, we will almost surely fit the errors present in our sample of observations, accounting for the peculiarities of that particular sample. In other words, overfitting occurs when we fit the noise in our data sample due to having more parameters in our model and this, while it guarantees a lower in-sample error, will almost surely result in a larger out-of-sample error, which is the error we're really interested in reducing. The resulting preference for the simpler of two models that explain the data equally well is known as the principle of Occam's Razor.
    In our case, the two models have the same number of parameters, and the observed differences in goodness of fit may not constitute a case of overfitting at all. But if we think about it, the linear interpolation of the data frame mechanism is more faithful to what occurred on the battlefield. In the stepwise-function implementation (using the equations slot), the stream of incoming reinforcements is suddenly cut to 0 as the simulation transitions from \(t\lt1\) to \(t\geq 1\), whereas in the linear interpolation of the data frame implementation, the reinforcement stream tapers off more gradually. The latter, being closer to real-life conditions, is probably why it gave a better goodness of fit. I hope you followed me...
  3. We will later in this post go into the details of why we model the loss of kinetic energy by multiplying the velocity by a factor of 0.9.
  4. ... Or is almost equal to zero, depending on the error tolerance you define (or predefine in the lsodar function call).
  5. Sorry about the cheesy reference to that Olivia Newton-John 1980's song. Anyhow, when looking up the Wikipedia article on her while writing this note, I discovered two interesting facts about her ancestry. On the maternal side, she is descended from Max Born, the famous Nobel laureate in physics. Olivia Newton-John's father was involved in the Enigma project at Bletchley Park, in which the allied mathematicians cracked the Nazi ciphers in World War II. Both facts are tangential to the sort of "physical" that I want to talk about in this section, so let's get physical...
  6. The quote, which Newton (not to be confused with Newton-John of "Physical" fame) phrased in English, traces back to the Latin nanos gigantum humeris insidentes, actually ascribed to Bernard of Chartres in the 12th century. As to the invention of Differential Calculus, there's also a view according to which Leibniz invented it independently, but this is too great a dispute to go into in this footnote.

Bibliography



If you found this post interesting or useful, please share it on Google+, Facebook or Twitter so that others may find it too.