Problem 1: Sequence space and Hamming distance
Consider an alphabet A of size |A| = B. For a binary alphabet, one has A = {0,1} and B = 2, and for
DNA, one has A = {A,T,C,G} and B = 4. We are studying sequences S ∈ A^L of length L. Assume
sequences are random with a uniform distribution.
(a) How many unique binary and DNA sequences exist for L = 28?
(b) What is the average Hamming distance between two random binary sequences? What is the
expected Hamming distance for two random DNA sequences?
(c) Given a binary sequence of length L, how many sequences exist at a Hamming distance three from
it? How many at distance K with K ≤ L? Repeat the calculation for DNA sequences.
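For small L, the counting questions above can be checked by brute-force enumeration. The following is a minimal Python sketch, not a reference solution: the function names are hypothetical, and the closed form it compares against is the standard count C(L, K) (B - 1)^K (choose the K positions that differ, then one of the B - 1 other letters at each).

```python
from itertools import product
from math import comb


def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s, t))


def count_at_distance(reference, alphabet, k):
    """Brute-force count of sequences at Hamming distance k from reference."""
    return sum(
        1
        for candidate in product(alphabet, repeat=len(reference))
        if hamming(candidate, reference) == k
    )


def closed_form(length, alphabet_size, k):
    """Standard closed form: C(L, K) * (B - 1)^K."""
    return comb(length, k) * (alphabet_size - 1) ** k


def mean_distance_exact(alphabet, length):
    """Exact mean Hamming distance over all ordered pairs of sequences,
    i.e. two independent draws from the uniform distribution."""
    seqs = list(product(alphabet, repeat=length))
    total = sum(hamming(s, t) for s in seqs for t in seqs)
    return total / len(seqs) ** 2
```

Enumeration costs B^L candidates (B^(2L) pairs for the mean), so it is only practical for short sequences; it is meant as a sanity check on a hand-derived formula, not as a way to evaluate L = 28 directly.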
Consider the quasispecies equation with two genotypes 0,1 (i.e., binary sequences of length 1). Let
the fitness of genotype 0 be f0 > 1, and the fitness of genotype 1 be f1 = 1. Moreover, genotypes are
replicated error-free with probability q, so a replication produces the other genotype with probability 1 − q.
(a) Write down the mutation-selection matrix W and find its eigenvalues.
(b) Which eigenvalue corresponds to the non-trivial equilibrium point?
(1 point)
Hint: Perron-Frobenius theorem.
(c) Examine the dynamics of the quasispecies equation and confirm the results obtained
in (b). Assume that q = 0.6 and f0 = 1.5, with initial condition (0.65, 0.35).
(d) What is the equilibrium point for f0 = f1 = 1?
(e) Calculate the equilibrium point in the limit of low mutation rate (q ≈ 1).
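For part (c), the dynamics can be examined numerically. The sketch below is one possible setup, not a prescribed method. It assumes the common convention dx/dt = W x − φ x with W[i][j] = f_j Q[j][i] and φ = f0 x0 + f1 x1, where Q[j][i] is the probability that replicating genotype j yields genotype i. The function names, the forward-Euler scheme, the step size, and the number of steps are illustrative choices that do not come from the problem sheet.

```python
from math import sqrt


def mutation_selection_matrix(f0, f1, q):
    """W[i][j] = f_j * Q[j][i] for two genotypes (assumed convention)."""
    return [[f0 * q, f1 * (1 - q)],
            [f0 * (1 - q), f1 * q]]


def eigenvalues_2x2(w):
    """Eigenvalues of a 2x2 matrix from its trace and determinant,
    returned as (larger, smaller). Assumes the eigenvalues are real,
    which holds here because all entries of W are non-negative."""
    trace = w[0][0] + w[1][1]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    root = sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def simulate(w, x_init, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = W x - phi x."""
    x = list(x_init)
    for _ in range(steps):
        wx = [w[0][0] * x[0] + w[0][1] * x[1],
              w[1][0] * x[0] + w[1][1] * x[1]]
        phi = wx[0] + wx[1]  # equals f0*x0 + f1*x1, the mean fitness
        x = [x[i] + dt * (wx[i] - phi * x[i]) for i in range(2)]
    return x


if __name__ == "__main__":
    w = mutation_selection_matrix(f0=1.5, f1=1.0, q=0.6)
    print("eigenvalues:", eigenvalues_2x2(w))
    print("state after integration:", simulate(w, (0.65, 0.35)))
```

Under these assumptions the frequencies keep summing to one at every step, and the long-time state can be compared with the normalized eigenvector of the larger eigenvalue, in line with the Perron-Frobenius hint.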