Blog of Andrés Aravena

18 October 2019

# Relationships

One important concept in bioinformatics, and in science in general, is the idea of a relationship: a rule ⧋ that takes two objects, x and y, and gives back a True or False answer, written x⧋y. For example, some relationships are:

• x is greater than y
• x is the double of y
• x is the father of y
• x is a classmate of y
• x is in the same taxonomic genus as y

If a relationship ⧋ is reflexive, symmetrical and transitive, then we say that ⧋ is an equivalence relationship. That is:

• Reflexivity: For all x, x⧋x is True.
• Symmetry: For all x and all y, x⧋y is the same as y⧋x.
• Transitivity: For all x, y and z, if x⧋y and y⧋z are True, then x⧋z is True.
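
The three properties above can be tested by brute force on a small, finite set of objects. The following is a minimal sketch in Python; the function names and the two example relationships (`same_parity`, `greater_than`) are illustrative choices, not part of the original exercise. Keep in mind that passing the test on a finite sample does not prove the property for every possible object; a failure, however, is a valid counterexample.

```python
from itertools import product

def is_reflexive(rel, items):
    # x⧋x must be True for every x
    return all(rel(x, x) for x in items)

def is_symmetric(rel, items):
    # x⧋y must give the same answer as y⧋x
    return all(rel(x, y) == rel(y, x) for x, y in product(items, repeat=2))

def is_transitive(rel, items):
    # whenever x⧋y and y⧋z are True, x⧋z must be True as well
    return all(
        rel(x, z)
        for x, y, z in product(items, repeat=3)
        if rel(x, y) and rel(y, z)
    )

def is_equivalence(rel, items):
    return (
        is_reflexive(rel, items)
        and is_symmetric(rel, items)
        and is_transitive(rel, items)
    )

# Illustrative relationships on a small sample of integers
numbers = range(-5, 6)
same_parity = lambda x, y: (x - y) % 2 == 0
greater_than = lambda x, y: x > y

print(is_equivalence(same_parity, numbers))   # True
print(is_equivalence(greater_than, numbers))  # False: not reflexive, not symmetric
```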

On the other hand, if the relationship is reflexive and transitive but, instead of being symmetrical, it is anti-symmetrical, then we say that ⧋ is an order relationship. That is:

• Anti-symmetry: For all x and all y, if x⧋y is True and y⧋x is True, then x=y, that is, x and y are the same.
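
The anti-symmetry condition can also be checked by brute force on a finite sample. This is a small sketch under the same caveat as any finite test (it can refute the property but not prove it in general); the relationships `divides` and `same_parity` are examples chosen here for illustration.

```python
from itertools import product

def is_antisymmetric(rel, items):
    # if x⧋y and y⧋x are both True, then x and y must be the same object
    return all(
        x == y
        for x, y in product(items, repeat=2)
        if rel(x, y) and rel(y, x)
    )

# Illustrative relationships on positive integers
numbers = range(1, 13)
divides = lambda x, y: y % x == 0
same_parity = lambda x, y: (x - y) % 2 == 0

print(is_antisymmetric(divides, numbers))      # True
print(is_antisymmetric(same_parity, numbers))  # False: 1⧋3 and 3⧋1, but 1 != 3
```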

Please give a list of relationships, indicating whether each one is an equivalence, an order, or neither. The goal of this question is to enhance your observation skills, so we aim for quantity and originality. The score of the answer will be the number of relationships that are reported by only one person.

# Distances

1. What is the difference between the Hamming distance and the Edit distance?
2. Show that the relationship “distance between x and y is zero” is an equivalence relationship.
3. Show that the relationship “distance between x and y is small” is not an equivalence relationship.
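
To experiment with these questions, it may help to have both distances available as functions. The sketch below assumes that "Edit distance" means the Levenshtein distance (unit cost for each substitution, insertion and deletion); the function names are chosen here for illustration.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(ca != cb for ca, cb in zip(a, b))

def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # match or substitute
            ))
        previous = current
    return previous[-1]

print(hamming_distance("ACGT", "CGTA"))  # 4
print(edit_distance("ACGT", "CGTA"))     # 2
```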

# Alignment

1. What is the difference between global and local alignment?
2. What is the difference between semi-global and local alignment?
3. What is the ideal use case for global alignment?
4. What is the ideal use case for semi-global alignment?
5. What is the ideal use case for local alignment?
6. Two bacterial strains of the same species have the same genes, but they may be in a different order. How would you test this hypothesis?
7. How can you calculate the percentage of nucleotides conserved between the two bacterial strains from the previous question?

# Scoring

1. What is the role of λ in the scoring of alignments?
2. In DNA, why is the match score positive and the mismatch score negative?
3. In proteins, why are some substitution scores positive and others negative? What is the biological interpretation of positive scores?
4. Why does the gap score have two parts: existence and extension?
5. Why do we need the gap score to be lower than the substitution score?
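
To see how the choice of scores changes the result, you can try different values in a small alignment scorer. The following is a simplified sketch, not a reference implementation: it assumes a single linear `gap` score per gap position instead of separate existence and extension parts, and the default values of `match`, `mismatch` and `gap` are arbitrary. With `local=False` it returns the global alignment score; with `local=True` it returns the best local alignment score.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # global alignment: leading gaps are penalized
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            value = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
            if local:
                value = max(value, 0)  # local alignment can restart anywhere
                best = max(best, value)
            score[i][j] = value
    return best if local else score[-1][-1]

print(alignment_score("ACGT", "AGT"))                       # 1
print(alignment_score("TTTACGTTTT", "GGACGGG", local=True)) # 3
```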

# Computational cost

1. What is computational cost?
2. What is the computational cost of the Smith-Waterman algorithm?
3. What is a heuristic?
4. What is the strategy that BLAST uses to speed up local alignment?
5. What is the trade-off of the word size parameter in BLAST?

Originally published at https://anaraven.bitbucket.io/blog/2019/bioinfo/homework-3.html