Problem 0 - Replication: DNA polymerases are enzymes that synthesize DNA molecules from nucleotides. They are essential to DNA replication, for example during cell division: they read the existing DNA and create new copies of double-stranded DNA. In particular, a DNA polymerase reads a DNA strand in reverse (3' to 5') and synthesizes a complementary strand (in the 5' to 3' direction of the newly synthesized DNA). In short, it produces the reverse complement of the original.

(5 points) Write a simple function that accepts a DNA sequence and returns its reverse complement. Both the input and the output should be strings (this applies throughout this project unless mentioned otherwise). Write it in two lines for full credit.
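One possible two-line sketch in Python is shown here. It assumes the input contains only the uppercase letters A, C, G and T; the name RevComp is borrowed from the later part of this assignment, and the translation-table approach is just one of several reasonable choices.

```python
def RevComp(dna):
    # Swap each base for its complement (A<->T, C<->G), then reverse the string.
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For example, RevComp("ATGC") complements the bases to "TACG" and reverses that to "GCAT".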


Problem 1 - Transcription and Translation: As we discussed in class, the central dogma of biology is the transcription of DNA into RNA followed by the translation of RNA into proteins. During transcription, RNA polymerase traverses a DNA template strand in reverse and synthesizes a messenger RNA (mRNA) strand that is complementary to the template DNA. Then, the mRNA is translated by the ribosome into protein. We will write functions to emulate this process.

(5 points) Write a function that transcribes a complementary DNA strand of a gene into RNA. Write it in two lines for full credit.
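A minimal sketch of such a function, assuming the input is the template strand written 5' to 3' in uppercase A, C, G and T. It mirrors the reverse-complement idea, except that A pairs with U in RNA.

```python
def Transcribe(dna):
    # Pair each DNA base with its RNA partner (A->U, T->A, C<->G), then reverse.
    return dna.translate(str.maketrans("ACGT", "UGCA"))[::-1]
```

For example, Transcribe("TTTCAT") gives "AUGAAA".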

(5 points) Note that the Transcribe and RevComp functions do almost the same work. Write a single-line transcription function that uses RevComp.
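One way to sketch this is to take the reverse complement and then replace every T with U. RevComp is repeated here only so that the example runs on its own; the single line in question is the body of Transcribe.

```python
def RevComp(dna):
    # Reverse complement of a DNA string (assumes uppercase A, C, G, T only).
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def Transcribe(dna):
    # RNA uses U where DNA uses T, so swap T for U in the reverse complement.
    return RevComp(dna).replace("T", "U")
```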

(10 points) Going back to the central dogma model: we have transcribed the DNA into mRNA; what remains is to translate the mRNA into protein.

Just as a piece of code you write makes the computer produce some output, DNA serves as the code for a cell to produce mRNA, and mRNA in turn serves as the code for the cell to produce proteins. Proteins are made up of amino acids, so cells must interpret the code in mRNA appropriately to produce amino acids, which later combine to form proteins. To do this, a cell reads the nucleotides in mRNA in groups of three, called codons. The genetic code for the production of amino acids is provided below as a dictionary. Each key is a codon, which is interpreted to produce the amino acid indicated by the corresponding value. For example, the codon 'AAA' produces the amino acid denoted by 'K' (lysine). The cell stops translation when it encounters a stop codon; in the following dictionary, the stop codons are associated with the value 'X'. The start codon, AUG, indicates where translation should start (we won't code anything for this here, but it's good to know).

For this problem, fill in the missing line below to make a function that translates the mRNA to protein. Note: 1) your function should return an error if the length of the input is not a multiple of three; 2) your function should not produce any more amino acids once a stop codon is found.
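The assignment's dictionary and skeleton code are not reproduced here, so the following is only a sketch under stated assumptions: CODON_TABLE is rebuilt from the standard genetic code with stop codons mapped to 'X', the name Translate is hypothetical, the "error" is raised as a ValueError, and the stop codon itself is not added to the output.

```python
# Assumption: standard genetic code, rebuilt here because the assignment's own
# dictionary is not shown. Codons are enumerated in U, C, A, G order and stop
# codons map to 'X', as the text describes.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def Translate(mrna):
    # Reject inputs that cannot be split evenly into codons.
    if len(mrna) % 3 != 0:
        raise ValueError("mRNA length must be a multiple of three")
    protein = ""
    for start in range(0, len(mrna), 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "X":
            break  # stop codon: produce no further amino acids
        protein += amino_acid
    return protein
```

With this table, Translate("AUGAAAUAA") reads AUG (M), AAA (K) and then the stop codon UAA, returning "MK". If the grader expects the 'X' to appear in the output or an error string instead of an exception, those two lines are the ones to adjust.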

 (5 points) The mRNA strand is the reverse complement of the template strand of DNA, which is itself the reverse complement of the coding strand. The function above translates from mRNA to protein; below, write a function that takes a DNA coding strand as input and returns the protein. The same notes as above apply here as well. The strands below demonstrate useful terminology for understanding the remaining problems:
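For the coding-strand task, a sketch that follows the chain described above: reverse-complement the coding strand to get the template, transcribe the template into mRNA, then translate. The helpers are repeated so the example runs on its own; the name CodingToProtein and the rebuilt codon table (standard genetic code, stops as 'X') are assumptions, not the assignment's own code.

```python
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Assumed stand-in for the assignment's dictionary (standard code, stops = 'X').
CODON_TABLE = {b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, b1 in enumerate(BASES)
               for j, b2 in enumerate(BASES)
               for k, b3 in enumerate(BASES)}

def RevComp(dna):
    # Reverse complement of a DNA string (uppercase A, C, G, T assumed).
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def Transcribe(dna):
    # Template DNA -> mRNA: reverse complement with U in place of T.
    return RevComp(dna).replace("T", "U")

def Translate(mrna):
    # mRNA -> protein, stopping (without output) at the first stop codon.
    if len(mrna) % 3 != 0:
        raise ValueError("mRNA length must be a multiple of three")
    protein = ""
    for start in range(0, len(mrna), 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "X":
            break
        protein += amino_acid
    return protein

def CodingToProtein(coding_dna):
    # coding strand -> template strand -> mRNA -> protein
    return Translate(Transcribe(RevComp(coding_dna)))
```

Because two reverse complements cancel out, the mRNA has the same sequence as the coding strand with T replaced by U, so Translate(coding_dna.replace("T", "U")) would give the same result in this sketch.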
