Question 1: High-throughput methods
A. Briefly describe the in vitro experiments SELEX-seq and PBM and the in vivo experiment ChIP-seq.
B. Compare and discuss the advantages and disadvantages of these methods and the sorts of biological questions each is suited to.
Question 2: Protein-DNA recognition
A. Briefly explain the physical mechanisms relevant for (a) base readout and (b) shape readout of DNA sequence by DNA binding proteins.
B. Download and install the PyMol plugin PDIviz, which is designed for visualizing protein-DNA interfaces. PDIviz can be found here: http://melolab.org/mbl/pdiviz
General instructions for installing PyMol plugins can be found here: https://pymolwiki.org/index.php/Plugins#Installing_Plugins
Using the plugin, visualize the structures (a) 1T2K and (b) 1KX5. Determine which regions of the DNA are contacted by the protein and explain any differences that you observe. Include the structure visualizations in your report.
C. Access the Proteopedia pages (a) http://www.proteopedia.org/w/Hox-DNA_Recognition and (b) http://www.proteopedia.org/w/P53-DNA_Recognition. Briefly explain for (a) Hox proteins and (b) p53 which structural features of the respective DNA binding sites are important for achieving binding specificity.
Question 3: Molecular Dynamics
In this question we’re going to implement a teeny tiny molecular dynamics simulation for a single particle in the potential $V(x) = x^2$. Molecular dynamics is based on solving Newton’s equation of motion, which relates the force on a particle to its acceleration:
𝐹(𝑥) = 𝑚𝑎
This equation says that the force acting on a particle (which is a function of its position 𝑥) is equal to the mass of the particle times its acceleration. Remember that acceleration is defined as:
$$a = \frac{d^2 x}{dt^2}$$
Therefore, we can write:
$$F(x) = m \frac{d^2 x}{dt^2}$$
Our goal is, given a force, to find 𝑥 as a function of time: 𝑥 = 𝑥(𝑡). To solve this equation numerically (i.e. with the aid of a computer), we can use what is known as Verlet integration. A full discussion of this is beyond the scope of this course, but just know that Verlet integration is a way to accurately solve Newton’s equation numerically (I encourage you to learn more on your own).
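For readers who want the idea in brief, here is a sketch of the standard textbook argument (not a full discussion). Write $\Delta t$ for a small time step, expand the position one step forward and one step back as Taylor series, add the two expansions so the odd-order terms cancel, and then substitute Newton’s equation $\ddot{x} = F(x)/m$:

```latex
\begin{aligned}
x(t+\Delta t) &= x(t) + \dot{x}(t)\,\Delta t + \tfrac{1}{2}\ddot{x}(t)\,\Delta t^2 + \tfrac{1}{6}\dddot{x}(t)\,\Delta t^3 + O(\Delta t^4) \\
x(t-\Delta t) &= x(t) - \dot{x}(t)\,\Delta t + \tfrac{1}{2}\ddot{x}(t)\,\Delta t^2 - \tfrac{1}{6}\dddot{x}(t)\,\Delta t^3 + O(\Delta t^4) \\
x(t+\Delta t) + x(t-\Delta t) &= 2x(t) + \ddot{x}(t)\,\Delta t^2 + O(\Delta t^4) \\
x(t+\Delta t) &= 2x(t) - x(t-\Delta t) + \frac{F(x(t))}{m}\,\Delta t^2 + O(\Delta t^4)
\end{aligned}
```

Dropping the $O(\Delta t^4)$ term gives an update rule that needs only the two most recent positions and the force at the current position.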
First, we have to discretize our position and time variables. This means that rather than treating them as continuous variables, we only consider them at fixed intervals. For example, we can take 𝑡 = 0.0, 0.2, 0.4, 0.6, … (which corresponds to a time step Δ𝑡 = 0.2), and we would then want to figure out the corresponding value of 𝑥 at each time step, based on Newton’s equation.
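As a minimal sketch of what a discretized time variable looks like in code, the snippet below builds the example grid with Δ𝑡 = 0.2. The names `dt`, `n_steps` and `times` are illustrative choices, not part of the assignment.

```python
# Build a discrete time grid t_i = i * dt (dt and n_steps are illustrative values).
dt = 0.2
n_steps = 4

# Multiplying the index by dt avoids accumulating round-off from repeated addition.
times = [i * dt for i in range(n_steps)]

# Rounding is only for display; floating-point arithmetic gives 0.6000000000000001.
print([round(t, 10) for t in times])  # [0.0, 0.2, 0.4, 0.6]
```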
Verlet integration tells us that, given a time step Δ𝑡 (the spacing between successive time points) and a function that defines a force 𝐹, the position at each step is
$$x_i = 2x_{i-1} - x_{i-2} + \frac{F(x_{i-1})}{m}\,\Delta t^2 \qquad (1)$$
Here, 𝑥𝑖 simply means the position of the particle at the ith time step. Please answer the following questions and include your answers in your report.
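To make the update rule concrete, here is a minimal sketch of a single Verlet step in Python. The function name `verlet_step`, its argument names, and the constant `example_force` are assumptions made for illustration; the force and numbers are deliberately different from those used in the questions below.

```python
def verlet_step(x_prev1, x_prev2, force, m, dt):
    """Return x_i from x_{i-1} (x_prev1) and x_{i-2} (x_prev2) using equation (1)."""
    return 2.0 * x_prev1 - x_prev2 + force(x_prev1) / m * dt**2


# Illustrative constant force (not one of the forces used in the questions).
def example_force(x):
    return 2.0


x_new = verlet_step(x_prev1=1.0, x_prev2=0.0, force=example_force, m=1.0, dt=0.5)
print(x_new)  # 2.5
```

Passing the force as a function means the same step works unchanged for a constant or a position-dependent force.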
A. Using the potential $V(x) = x^2$, what is the corresponding force? Be sure you get the sign correct, or the later part of this question will not work!
B. Let’s do a quick practice run with Verlet integration. Using a constant force 𝐹 = −9.8, a mass 𝑚 = 1.0, and a time step Δ𝑡 = 0.1, use equation (1) to update the position (𝑥𝑖) at each time step and complete the table below. Include the completed table in your report.
i | x_i | x_{i-1} | x_{i-2} | F    | t_i
2 |     | 9.951   | 10      | -9.8 | 0
3 |     |         |         | -9.8 |
4 |     |         |         | -9.8 |
5 |     |         |         | -9.8 |
C. Once you’ve figured out how to do part B, it’s trivial to pop in a different force, time step and initial conditions and solve Newton’s equation for any system, including an entire protein structure! Of course, in the case of a protein, how well we can do depends on how accurate our force field is, but let’s do a very simple example using a single particle in one dimension.
Create a similar table as in part B (or arrays/vectors if you’re using something like Python or R), but this time use the force you determined in part A (which is NOT constant). I have provided initial conditions below. Compute 𝑥𝑖 and 𝑡𝑖 for about 400 time steps, and then plot 𝑥𝑖 vs 𝑡𝑖 (𝑡𝑖 on the horizontal axis). Use 𝑚 = 1 and Δ𝑡 = 0.01. What kind of motion is the particle undergoing? Include your plot with proper labels. Do not include a table in your report.
i   | x_i | x_{i-1} | x_{i-2} | F      | t_i
2   |     | 0.002   | 0       | -0.004 | 0.02
3   |     |         |         |        |
…
400 |     |         |         |        |
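If you are using Python, the bookkeeping for the table can be sketched as below. This is scaffolding under stated assumptions, not a solution: the names `verlet_trajectory`, `plot_trajectory` and `placeholder_force` are hypothetical, the force is a zero-force placeholder that you would replace with your answer to part A, and the indexing assumes 𝑥0 = 0 and 𝑥1 = 0.002, which is one reading of the initial conditions in the table. Plotting assumes matplotlib is installed.

```python
def verlet_trajectory(force, x0, x1, m, dt, n_steps):
    """Return lists (times, positions) for indices 0..n_steps.

    x0 and x1 are the positions at i = 0 and i = 1; every later position
    comes from equation (1).
    """
    positions = [x0, x1]
    for i in range(2, n_steps + 1):
        x_prev1, x_prev2 = positions[i - 1], positions[i - 2]
        positions.append(2.0 * x_prev1 - x_prev2 + force(x_prev1) / m * dt**2)
    times = [i * dt for i in range(n_steps + 1)]
    return times, positions


def plot_trajectory(times, positions):
    """Plot x_i against t_i with labelled axes (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.plot(times, positions)
    plt.xlabel("t_i")
    plt.ylabel("x_i")
    plt.show()


# Placeholder force for illustration only; replace it with your answer to part A.
def placeholder_force(x):
    return 0.0


times, positions = verlet_trajectory(
    placeholder_force, x0=0.0, x1=0.002, m=1.0, dt=0.01, n_steps=400
)
```

Calling `plot_trajectory(times, positions)` afterwards draws the curve. With the zero-force placeholder it is a straight line, which is a quick sanity check before swapping in the real force.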
Question 4: Structure Determination
Let’s look at some X-ray crystallography data. I have provided three solved protein structures, a6.pdb, 06.pdb and 08.pdb and corresponding *.dsn6 files that contain electron densities for these proteins.
A. For each PDB file, open it in PyMol and then also open the corresponding *.dsn6 file. You’ll be presented with a dialog window when you load the *.dsn6 file; make sure to check the isomesh box and leave everything else as-is.
Change the visualization type for the protein structure from “cartoon” to “lines”. Then, using a PyMol selection, select and zoom in on the residue ASN 132 for each structure, showing a clear view of the side chain and electron density. Include these visualizations in your report, making sure to label each one appropriately. Each of these structures was determined from a different electron density resolution. Based on what you observe of the electron density contours, order the structures from highest (best) resolution to lowest (worst) resolution.
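The same steps can also be scripted through PyMol’s Python API instead of the GUI. The sketch below is one possible way to do it, under several assumptions: the helper names `residue_selection` and `focus_on_residue`, the object names `model`, `density`, `mesh` and `target`, the contour level of 1.0 and the carve distance of 2.0 are all illustrative choices, and the map is created with `cmd.isomesh` rather than through the loading dialog.

```python
def residue_selection(resn, resi):
    """Build a PyMOL selection string for one residue, e.g. 'resn ASN and resi 132'."""
    return f"resn {resn} and resi {resi}"


def focus_on_residue(pdb_path, map_path, resn="ASN", resi=132, level=1.0):
    """Load a structure and its density map, then zoom in on one residue.

    Run inside PyMOL (or with the pymol module installed). Object names,
    contour level and carve distance are illustrative choices.
    """
    from pymol import cmd

    cmd.load(pdb_path, "model")
    cmd.load(map_path, "density")  # assumes PyMOL recognizes the .dsn6 extension
    cmd.hide("cartoon", "model")
    cmd.show("lines", "model")
    selection = f"model and ({residue_selection(resn, resi)})"
    cmd.select("target", selection)
    # Draw the mesh only near the residue so the side chain stays visible.
    cmd.isomesh("mesh", "density", level, "target", carve=2.0)
    cmd.zoom("target")


print(residue_selection("ASN", 132))  # resn ASN and resi 132
```

A call such as `focus_on_residue("a6.pdb", "a6.dsn6")` would then set up the view for one structure; the map file name here is a guess based on the *.dsn6 naming described above.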
B. You may notice that the density contour on the poorer resolution structures could leave some room for interpretation about the orientation of the ASN side-chain. How might a structural biologist resolve this issue?
C. For the a6 structure/electron density, have a look at the residue GLN 126. Zoom in on it and provide a visualization in your report. You may notice that there are two side chains for this residue! What is going on here, and what might be the cause of this?
Question 5: **BONUS**
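In order to do this question, please install the following Python packages with conda if they are not already installed (note that the conda package is named scikit-learn; sklearn is only the import name):
conda install scikit-learn
conda install jupyter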
This question will use the file hw3.ipynb, which is a Python notebook. To open it, run jupyter notebook hw3.ipynb from a terminal window. This is an interactive Python session; if you’re unfamiliar with how these work, take a quick look here: https://towardsdatascience.com/a-beginners-tutorial-to-jupyternotebooks-1b2f8705888a
Follow along with the notebook and write the code as you go! This question is based on this paper: https://rohslab.usc.edu/Papers/2015_Pnas_Zhou.pdf so you may want to have a look at that for some more background. Your goal is to produce a plot similar to the one below (you’ll understand what this means when you go through the notebook). Note that the values here have been intentionally changed, so you won’t get the same results.