Blog of Andrés Aravena
Course Homepage:

# Computing in Molecular Biology 2

05 April 2021

This course is new and differs from the one given the previous year.

## Things to do:

This course is an introduction to Computational Thinking. We will use the tools we learned in the previous course and apply them to model and simulate scientific experiments as a way to understand them.

# Homework

Send all quizzes and homework to andres.aravena+cmb@istanbul.edu.tr before the deadline to get a grade. Please be careful: anything that does not arrive at this address before the deadline gets a grade of zero.

• Homework 1 (Deadline: Wednesday 31 of March at 23:59).
Write to learn.
• Homework 2 (Deadline: Friday 19 of March at 8:59).
Teach the computer how to write the flag of Turkey.
• Homework 3 (Deadline: Friday 26 of March at 8:59).
Practice writing functions and applying them to several elements of a list.
• Homework 4 (Deadline: Friday 2 of April at 8:59).
Practice writing functions, for loops and conditional blocks.
• Homework 5 (Deadline: Friday 9 of April at 9:00).
Count rabbits with recursive functions and with a loop.
• Homework 6 (Deadline: Friday 16 of April at 9:00).
Simulate a predator-prey system, using the Lotka-Volterra model.
• Homework 7 (Deadline: Monday 31 of May at 9:00).
Practice the Monte Carlo method. Simulate “complex” random systems by decomposing them into simpler ones.
• Homework 8 (Deadline: Friday 4 of June at 9:00).
What will be your score in the exam? Use simulation to see what will probably happen.
• Homework 9 (Deadline: Friday 11 of June at 9:00).
We cannot predict the future, but we can make educated guesses. Practice educating your guesses.
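
As an illustration of the kind of exercise in Homework 5, here is a minimal sketch in R that counts rabbits first with a recursive function and then with a loop. It assumes the classic Fibonacci rabbit model (one pair in each of the first two months, and every later month is the sum of the two previous ones); the actual homework statement may define the problem differently. The names `rabbits_recursive` and `rabbits_loop` are hypothetical and chosen only for this example.

```r
# Assumption: month 1 and month 2 each have 1 pair of rabbits, and every
# later month has as many pairs as the two previous months together.

# Recursive version: the function calls itself on smaller cases.
rabbits_recursive <- function(n) {
  if (n <= 2) {
    return(1)
  }
  rabbits_recursive(n - 1) + rabbits_recursive(n - 2)
}

# Loop version: keep only the last two values and move forward month by month.
rabbits_loop <- function(n) {
  if (n <= 2) {
    return(1)
  }
  previous <- 1
  current <- 1
  for (month in 3:n) {
    following <- previous + current
    previous <- current
    current <- following
  }
  current
}

# Both versions should give the same counts for the first ten months.
sapply(1:10, rabbits_recursive)
sapply(1:10, rabbits_loop)
```

The recursive version follows the definition directly but repeats a lot of work for large `n`, while the loop computes each month only once.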

# Classes

Here you will find the slides from the classes and other supplementary material. Notice that some things are said but not written, so you had better take good notes. We recommend taking notes with pen and paper using the Cornell Method.

• Class 1: Introduction to Computational Thinking. (Mar 12, 2021). Motivation of the course. Learn how to solve hard problems. Get a super-power. [Video],[Slides].
• Class 2: Handling DNA in the computer. (Mar 12, 2021). Proteins and DNA are sequences that can easily be handled by the computer. Learn how to find them on the web, by accession number or taxonomic id. We use the FASTA format and read the sequences in R. [Video],[Slides].
• Class 3: Lists with names, and GC content. (Mar 19, 2021). Review of Quiz 1. The final discussion of lists. Implementing our ideas in R code. Solving some issues. [Video],[Slides].
• Class 4: GC content of all genes. Patterns and Abstraction. (Mar 19, 2021). How to do the same thing again and again, without getting tired or bored. Writing functions and applying them to vectors and lists. [Video],[Slides].
• Class 5: Practice with sapply. The FOR loop. (Mar 19, 2021). Two ways to repeat code. GC content and GC skew. These are the complete slides we wrote during the practice session. [Video],[Slides].
• Class 6: Local DNA statistics. (Mar 26, 2021). Sliding Windows. [Video],[Slides].
• Class 7: Making decisions. (Mar 26, 2021). Practice using IF-THEN-ELSE. Finding the smallest value. [Video],[Slides].
• Class 8: Patterns in patterns in patterns…. (Mar 26, 2021). In order to understand recursion, one must first understand recursion. [Video],[Slides].
• Class 9: Finding the replication origin. (Apr 2, 2021). GC skew points us in the right direction, but it is not easy. Cumulative sums and which.max help a lot. [Video],[Slides].
• Class 10: Cumulative sums and Systems. (Apr 2, 2021). This is one of the important ideas of the course. To understand complex things, we decompose them into interconnected parts. Complex behaviors can emerge from combining simple parts. Dumb ants can make a smart ant colony. [Video],[Slides].
• Class 11: Systems in Biology and Beyond. (Apr 9, 2021). We can describe systems as parts and interactions, and simulate their emergent behavior. [Slides].
• Class 12: Exercises on Systems. (Apr 9, 2021). We can describe systems as parts and interactions, and simulate their emergent behavior. [Slides].
• Class 13: Long-term behavior and effect of initial conditions. (Apr 16, 2021). We can describe systems as parts and interactions, and simulate their emergent behavior. [Slides].
• Class 14: Can we predict the future? (Apr 30, 2021). Dynamic systems can be deterministic yet unpredictable. [Slides].
• Class 15: Probabilities. (Apr 30, 2021). People think that probabilities are about games. Instead, they are tools for thinking. Thinking about decisions when we have incomplete information. Thinking about the future. Thinking about the meaning of our experiment’s results. [Slides].
• Class 16: Easy and Hard problems. (Apr 30, 2021). Easy problems are “going downhill”, and hard ones are “uphill”. Why it is safe to use online banking. [Slides].
• Class 17: Virtual experiments. (May 7, 2021). What can happen? Simulating random systems. [Slides].
• Class 18: Complex random systems. (May 7, 2021). What can happen? Simulating random systems. [Slides].
• Class 19: Comments on Midterm Exam. (May 21, 2021). What can happen? Simulating random systems. [Slides].
• Class 20: Population and Samples. (May 21, 2021). We want to know a big population, but we can only observe a small sample. How are they related? [Slides].
• Class 21: Central Limit Theorem. (May 28, 2021). We want to know a big population, but we can only observe a small sample. How are they related? [Slides].
• Class 22: Confidence Intervals. (May 28, 2021). We want to know a big population, but we can only observe a small sample. How are they related? [Slides].
• Class 23: Things that you must know. (Jun 4, 2021). If you do not know this, you will probably fail the course. [Slides].
• Class 24: Practice. (Jun 4, 2021). Makes perfect. [Slides].
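
Several of the classes above (GC content, GC skew, sliding windows, cumulative sums and which.max) fit together in a few lines of R. The following is only a sketch under stated assumptions, not the code written in class: the toy sequence is invented, the helper names `gc_content` and `gc_skew` are hypothetical, and the skew is taken as (G - C) / (G + C). Real sequences would be read from a FASTA file instead.

```r
# Toy sequence invented for this example, split into single letters.
dna <- strsplit("GGGCGGATCCACCC", "")[[1]]

# GC content: fraction of positions that are G or C.
gc_content <- function(sequence) {
  mean(sequence %in% c("G", "C"))
}

# GC skew, assumed here to be (G - C) / (G + C).
# Returns 0 when the sequence has no G and no C, to avoid dividing by zero.
gc_skew <- function(sequence) {
  g_count <- sum(sequence == "G")
  c_count <- sum(sequence == "C")
  if (g_count + c_count == 0) {
    return(0)
  }
  (g_count - c_count) / (g_count + c_count)
}

# Sliding windows: apply gc_skew to every window of 4 letters.
window_size <- 4
starts <- seq(1, length(dna) - window_size + 1)
skew_by_window <- sapply(starts, function(i) {
  gc_skew(dna[i:(i + window_size - 1)])
})

# Cumulative sum: +1 for each G, -1 for each C, 0 for A and T.
steps <- (dna == "G") - (dna == "C")
cumulative <- cumsum(steps)

# Position where the cumulative sum is largest (6 for this toy sequence).
which.max(cumulative)
```

`which.min` finds the opposite extreme in the same way. Which of the two extremes points to the replication origin depends on how the skew is defined, so check the convention used in the slides before interpreting the result.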

## Other reading material for classes

Everybody must read and understand the following texts:

# Sequences for exercises

Most of the time you will use sequences from NCBI. For exercises, we can use these sequences:

• Candidatus Carsonella ruddii PV DNA
• Escherichia coli str. K-12 substr. MG1655

# Required software

For this course we will use the latest versions of R and RStudio. These two tools work together. Install R first, then install RStudio.