Blog of Andrés Aravena
CMB2:

# The Biologist Toolbox: Simulating Systems on the computer

04 April 2019

Once we have a description of a system, and a nice drawing to represent it, we can answer some interesting questions. The first question is usually “what will happen?” In other words, we usually want to know what is the behavior of the system.

The second question we usually ask is one of the two following: “how does the behavior change when the initial values change?”, or “how does the behavior change when the processes rates change?”

To answer these questions, we can use the computer. It is not hard to write a small program to simulate the system. The simulation will be an approximate answer, good enough to see the important characteristics of the system’s behavior.

In this document we will consider systems represented by graphs like in the last post “Drawing Systems”, and we will write our computer code using the R language.

## Discrete time

We will represent time in the computer by an integer that increases step by stepIn other words, we use discrete time.

. The units are arbitrary, so we can think of anything between microseconds and millennia, including years, days, hours, seconds, weeks, etc. We only assume that each time-step has the same duration.

For example in our case we can think that each time-step is one hour. Time 1 corresponds to the initial condition, that is, the beginning of the experiment. Time 2 is one hour later. Time 25 is one day later.

We can use any symbol to represent time. Most people uses letters i, j or k, since these are the letters traditionally used as index of vectors and matrices. In this example we are going to represent time with i.

## The nodes have values

Figure 1. A model of the cell growth system. For this example we will use the system represented by the graph of Figure 1.

In this graph each circle represents an item of the system. The label of the circle is the name of the item. In our example we have two circles, named cells and food.

Each circle has also a value that can change through time, usually the quantity or concentration of the item in the system. The quantity of each item at different times is represented by a vector. The name of the vector must be the same name written in the circle. In our example, the vectors are cells and food.

Each item has also a second value, representing how much does the item’s quantity changes in each time-step. The name of these value starts with d_, followed by the item’s name. In our example these values are called d_cells and d_food. We read them delta cells and delta food.

Thus, the items of the system have two vectors. Each vector has one element for each time-step. So the number of cells at time i is cells[i] and the value of its change at the same time is d_cells[i].

The processes are represented by boxes and are much simpler. They only have one constant value, which is a rate. The value name is made with the label of the box and the suffix _rate. So the box called replication has a constant named replication_rate.

In summary, each circle has two vectors, and each box has only one fixed number.

## Finding the equations

To simulate the system we need to find the formulas. We start with the boxes, because they are the easy ones.

For each box in the graph we get a single term: the multiplication of the rate constant and each of the quantity variables of the circles connected by incoming arrows. If there are several incoming arrows from the same circle, then the variable is multiplied several times. The outgoing arrows are not important in this part.

In our example the formula for replication box is replication_rate * cells[i-i] * food[i-1]. Here we use the index i-1 because we do not know yet the value of cells[i] or food[i]. We will calculate them later. Metaphorically, at the begin of “today” we only know “yesterday”.

The formula for the circles is made with the sum of all the terms of boxes connected with incoming arrows, minus all the terms of boxes connected with outgoing arrows. This value is assigned to the change variable of the circle.

In our example the variable d_food[i] gets assigned the value -replication_rate * cells[i-i] * food[i-1], since the food circle has only one outgoing arrow. The cell circle has two incoming arrows from replication and one outgoing arrow to replication. Therefore the variable d_cells[i] gets the value

replication_rate * cells[i-i] * food[i-1]
+ replication_rate * cells[i-i] * food[i-1]
- replication_rate * cells[i-i] * food[i-1]

which, after simplification, is replication_rate * cells[i-i] * food[i-1] resulting on a final result of one positive incoming arrow. In summary, the formulas for the change variables are:

d_food[i]  <- -replication_rate * cells[i-i] * food[i-1]
d_cells[i] <-  replication_rate * cells[i-i] * food[i-1]

## Cummulative values

Now the quantity variables have to be updated. Each quantity variable is the cumulative sum of the change variables. In our example we have

food[i]  <- food[i-1]  + d_food[i]
cells[i] <- cells[i-1] + d_cells[i]

## Initial conditions

The last missing piece necessary for the simulation are the initial values of the circles’ variables. The value of cells[today] depends on the value of cells[yesterday], and we can only calculate that for time i greater than or equal to 2.

In our case we will use the variables cells_ini and food_ini to store the values corresponding to cells[1] and food[1]. For the change variables, we can assume that they are initially zero.

## Input parameters

Our simulation will be a function. The inputs are:

• N, the number of simulation steps,
• replication_rate, the rate of change for the given time
• cells_ini, the initial quantity of cells
• food_ini, the initial quantity of food

and the output will be a data frame with the vectors cells and food.

## Complete code

We know that the R code should be something like this:

cell_culture <- function(N, replication_rate, cells_ini, food_ini) {
# create empty vectors
# initialize values
for(i in 2:N) {
# update d_cells and d_food
# update cells and food as a cumulative sum
}
return(data.frame(cells, food))
}

To create an empty vector in R we only need to know the vector sizeStrictly speaking, we do not need the exact vector size, but the program is faster if we use the correct number.

, which in this case is N. Therefore we write

# create empty vectors
cells <- d_cells <- rep(NA, N)
food  <- d_food  <- rep(NA, N)

Here we use a trick in R that allows us to assign the same value to more than one variable at the same time. This line:

cells <- d_cells <- rep(NA, N)

is equivalent to these two lines:

d_cells <- rep(NA, N)
cells <- rep(NA, N)

The first component of each vector needs an initial value. Our function receives cells_ini and food_ini as inputs. To make thing simple we assume that the change values start at zero, so we write

cells[1]   <- cells_ini
food[1]    <- food_ini
d_cells[1] <- d_food[1] <- 0

## All together

Putting all together, we get

cell_culture <- function(N, replication_rate, cells_ini, food_ini) {
# create empty vectors
cells <- d_cells <- rep(NA, N) # same name as the item
food  <- d_food  <- rep(NA, N)
# initialize values
cells[1]   <- cells_ini
food[1]    <- food_ini
d_cells[1] <- d_food[1] <- 0
for(i in 2:N) { # always the same for loop
# Only this part depends on the arrows
d_cells[i] <- +replication_rate * cells[i-1] * food[i-1]
d_food[i]  <- -replication_rate * cells[i-1] * food[i-1]
# this is always the same
cells[i] <- cells[i-1] + d_cells[i]
food[i]  <- food[i-1]  + d_food[i]
}
return(data.frame(cells, food))
}

# Other examples

This modeling and simulation technique is not limited to a particular field of science. The same rules apply to may other systems, small and big. That is why we can use examples from very different areas and still learn something useful for out own interest. Let’s see other examples

## Water formation

Figure 2. A model of the water formation reaction.

A chemical reaction combines hydrogen and oxygen to produce water. To simplify we assume that the reaction is this: $2H+O\leftrightarrow H_2O$

This system has 3 items: Hydrogen, Oxygen and Water, represented with the letters H, O and W. This reaction is reversible, so we represent it with two irreversible reactions at the same time: water formation (r_1) and water decomposition (r_2). We can represent this system with the graph in Figure 2. The R code to simulate this system is:

water_formation <- function(N, r1_rate=0.1,  r2_rate=0.1,
H_ini=1, O_ini=1, W_ini=0) {
# first, create empty vectors to fill later
W <- d_W <- rep(NA, N) # Water, quantity and change on each time
H <- d_H <- rep(NA, N) # Hydrogen
O <- d_O <- rep(NA, N) # Oxygen
# fill the initial quantities of water, hydrogen and oxygen
W[1] <- W_ini
H[1] <- H_ini
O[1] <- O_ini
d_W[1] <- d_H[1] <- d_O[1] <- 0 # the initial change is zero

for(i in 2:N) {
d_W[i] <-    r1_rate*H[i-1]*H[i-1]*O[i-1] - r2_rate*W[i-1]
d_O[i] <-   -r1_rate*H[i-1]*H[i-1]*O[i-1] + r2_rate*W[i-1]
d_H[i] <- -2*r1_rate*H[i-1]*H[i-1]*O[i-1] + 2*r2_rate*W[i-1]

W[i] <- W[i-1] + d_W[i]
O[i] <- O[i-1] + d_O[i]
H[i] <- H[i-1] + d_H[i]
}
return(data.frame(W, H, O))
}

## Extended model for water

Some sharp eyes noticed that the water creation reaction has an intermediate step. One hydrogen and the oxygen combine to produce an hydroxyl molecule. Then this compound combines with the second hydrogen to make water. All reactions are reversible. The system would look like this:

Can you make the simulation of this system?

## Predator-Prey population dynamics

Figure 3. A model of the Lotka-Volterra system.

This is a system with several applications, known as “Lotka-Volterra system”. It was discovered and first analyzed by Alfred J. Lotka, and Vito Volterra, and appears again and again in ecology and population dynamics.

Here we have two items: cats and rats, and three processes: birth (of rats), catch (of rats, also reproduction of cats) and death (of cats). The system is represented by the graph in Figure 3 and simulated by the following R code:

cats_and_rats <- function(N, birth_rate, catch_rate, death_rate) {
rats <- d_rats <- rep(NA, N)
cats <- d_cats <- rep(NA, N)
rats[1] <- 1
cats[1] <- 3
d_rats[1] <- d_cats[1] <- 0

for(i in 2:N){
d_rats[i] <-  birth_rate*rats[i-1] - cats[i-1]*rats[i-1]*catch_rate
d_cats[i] <- -death_rate*cats[i-1] + cats[i-1]*rats[i-1]*catch_rate
rats[i] <- rats[i-1] + d_rats[i]
cats[i] <- cats[i-1] + d_cats[i]
}
return(data.frame(rats, cats))
}

## Money in the bank

This one is an exercise. Consider the following system.

What will be the money in the bank after 12 month if the interest rate is 1/12? What if the interest is 1/100 and the total time is 100 months?

# References

Koch, Ina. “Petri Nets - A Mathematical Formalism to Analyze Chemical Reaction Networks.” Molecular Informatics 29, no. 12 (December 17, 2010): 838–43. doi:10.1002/minf.201000086.

Baez, John C., and Jacob Biamonte. “Quantum Techniques for Stochastic Mechanics,” September 17, 2012. http://arxiv.org/abs/1209.3632.