DNA structure and replication

Hank from Crash Course introduces that wondrous molecule deoxyribonucleic acid—also known as DNA—and explains how it replicates itself in our cells.

Video source: Crash Course / YouTube.


HANK GREEN, Narrator: It’s just beautiful, isn’t it? It’s just mesmerising. It’s double hel-exciting! You really can tell, just by looking at it, how important and amazing it is. It’s pretty much the most complicated molecule that exists, and potentially the most important one. It’s so complex that we didn’t even know for sure what it looked like until about 60 years ago. So multifariously awesome that if you took all of it from just one of our cells and untangled it, it would be taller than me. 

Now consider that there are probably 50 trillion cells in my body right now. Laid end to end, the DNA in those cells would stretch to the sun. Not once … but 600 times! 

Mind blown yet?
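For anyone who wants to check that figure, here is a rough sketch of the arithmetic in Python. The 50 trillion cells comes from the transcript. The 2 metres of DNA per cell and the average Earth-sun distance of roughly 149.6 million kilometres are assumptions added here, since the transcript only says the DNA from one cell would be "taller than me".

```python
# Back-of-the-envelope check of the "600 times to the sun" claim.
CELLS_IN_BODY = 50e12             # "probably 50 trillion cells" (from the transcript)
DNA_PER_CELL_M = 2.0              # assumption: about 2 m of DNA per cell
EARTH_SUN_DISTANCE_M = 1.496e11   # assumption: mean Earth-sun distance in metres


def trips_to_sun(cells=CELLS_IN_BODY, dna_per_cell_m=DNA_PER_CELL_M):
    """Return how many Earth-sun distances the body's DNA would span."""
    total_length_m = cells * dna_per_cell_m
    return total_length_m / EARTH_SUN_DISTANCE_M


print(round(trips_to_sun()))
```

Under these assumptions the result lands a little above 600, which is in line with the figure quoted in the video.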

Hey, you wanna make one?

Of course you know I’m talking about deoxyribonucleic acid, known to its friends as DNA. DNA is what stores our genetic instructions: the information that programs all of our cells’ activities. It’s a 6 billion letter code that provides the assembly instructions for everything that you are. And it does the same thing for pretty much every other living thing. I’m going to go out on a limb and assume you’re human. In which case every body cell, or somatic cell, in you right now has 46 chromosomes, each containing one big DNA molecule.

These chromosomes are packed together tightly with proteins in the nucleus of the cells. 

DNA is a nucleic acid. And so is its cousin, which we’ll also be talking about, ribonucleic acid, or RNA.

Now if you can make your mind do this, remember all the way back to Episode 3, where we talked about all of the important biological molecules: carbohydrates, lipids and proteins. That ring a bell?

Well, nucleic acids are the fourth major group of biological molecules and, for my money, they have the most complicated job of all. 

Structurally they’re polymers, which means that each one is made up of many small, repeating molecular units. In DNA, these small units are called nucleotides. Link them together and you have yourself a polynucleotide. 

Now, before we actually put these tiny parts together to build a DNA molecule like some microscopic piece of IKEA furniture, let’s first take a look at what makes up each nucleotide. 

We’re gonna need three things:

  1. a five-carbon sugar molecule
  2. a phosphate group and
  3. one of four nitrogenous bases.

DNA gets the first part of its name from our first ingredient, the sugar molecule, which is called deoxyribose. But all the really significant stuff, the genetic coding that makes you YOU, is found among the four nitrogenous bases: adenine (A), thymine (T), cytosine (C) and guanine (G).

It’s important to note that in living organisms, DNA doesn’t exist as a single polynucleotide molecule, but rather a pair of molecules that are held tightly together. They’re like an intertwined, microscopic, double spiral staircase. Basically, just a ladder, but twisted. The famous double helix.

And like any good structure, we have to have a main support. In DNA, the sugars and phosphates bond together to form twin backbones. These sugar–phosphate bonds run down each side of the helix but, chemically, in opposite directions. In other words, if you look at each of the sugar–phosphate backbones, you’ll see that one appears upside down in relation to the other.

One strand begins at the top with the first phosphate connected to the sugar molecule’s 5th carbon and ends where the next phosphate would go, with a free end at the sugar’s 3rd carbon. This creates a pattern called 5 prime to 3 prime (5’ to 3’). I’ve always thought of the deoxyribose with an arrow, with the oxygen as the point. It always ‘points’ from 3 prime to 5 prime.

Now on the other strand, it’s exactly the opposite. It begins up top with a free end at the sugar’s 3rd carbon and the phosphates connect to the sugars’ 5th carbons all the way down. And it ends at the bottom with a phosphate. And you’ve probably figured this out already, but this is called the 3’ (3 prime) to 5’ (5 prime) direction.

Now, it is time to make ourselves one of these famous double helices. 

These two long chains are linked together by the nitrogenous bases via relatively weak hydrogen bonds. But they can’t be just any pair of nitrogenous bases. Thankfully, when it comes to figuring out what part goes where, all you have to do is remember that if one nucleotide has an adenine base (A), only thymine (T) can be its counterpart (A-T). Likewise, guanine (G) can only bond with cytosine (C) (G-C). These bonded nitrogenous bases are called base pairs. The G-C pair has three hydrogen bonds, making it slightly stronger than the A-T base pair, which only has two. It’s the order of these four nucleobases or the base sequence that allows your DNA to create you. 
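The hydrogen-bond counts above lend themselves to a small sketch. The function name and the example sequence below are hypothetical, chosen only for illustration; the bond counts (two for A-T, three for G-C) are the ones given in the transcript.

```python
# Hydrogen bonds per base pair, as described in the transcript:
# an A-T pair has two, a G-C pair has three.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Total hydrogen bonds holding `strand` to its complementary strand."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


print(count_hydrogen_bonds("ATGC"))  # 2 + 2 + 3 + 3 = 10
```

A stretch of DNA that is rich in G and C is held together by more hydrogen bonds than an A-T-rich stretch of the same length, which is one way to picture the "slightly stronger" G-C pair.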

So, AGGTCCATG means something completely different as a base sequence than, say, TTCAGTCG. Human chromosome 1, the largest of all our chromosomes, contains a single molecule of DNA with 247 million base pairs. If you printed all of the letters of chromosome 1 into a book, it would be about 200,000 pages long. And each of your somatic cells has 46 DNA molecules tightly packed into its nucleus, one for each of your chromosomes. Put all 46 molecules together and we’re talking about 6 billion base pairs … in every cell!

This is the longest book that I’ve ever read. It’s about 100 pages long. If we were to fill it with our DNA sequence, we’d need about 10,000 of them to fit our entire genome.

POP QUIZ!!!

Let’s test your skills using a very short strand of DNA. I’ll give you one base sequence—you give me the base sequence that appears on the other strand. 

Okay, here goes. 

So, we have a 5’-AGGTCCG-3’ … and …. Time’s up. The answer is: 3’-TCCAGGC-5’. 

See how that works? It’s not super complicated. Since each nitrogenous base only has one counterpart, you can use one base sequence to predict what its matching sequence is going to look like. So, could I make the same base sequence with a strand of that ‘other’ nucleic acid, RNA? No, you could not.
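The pop quiz can be sketched as a short Python function. This is only an illustration of the A-T and G-C pairing rule from the transcript; the function names are made up for this example.

```python
# Pairing rules from the transcript: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Return the partner strand, read 3' to 5' so it lines up base by base."""
    return "".join(DNA_PAIRS[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3):
    """Return the same partner strand rewritten in 5' to 3' order."""
    return complementary_strand(strand_5_to_3)[::-1]


template = "AGGTCCG"
print("5'-" + template + "-3'")
print("3'-" + complementary_strand(template) + "-5'")  # 3'-TCCAGGC-5'
print("5'-" + reverse_complement(template) + "-3'")    # 5'-CGGACCT-3'
```

The second printed line matches the quiz answer. The third is the same strand flipped end to end, because the two strands run in opposite directions.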

RNA is certainly similar to its cousin DNA—it has a sugar–phosphate backbone with nucleotide bases attached to it. But here are THREE major differences:  

  1. RNA is a single-stranded molecule—no double helix.
  2. The sugar in RNA is ribose, which has one more oxygen atom than deoxyribose, hence the whole starting with an R instead of a D thing.
  3. Also, RNA does not contain thymine. Its fourth base is uracil, which bonds with adenine instead.
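The third difference can be sketched in code too. Assuming the same quiz strand as the DNA template, an RNA strand pairing with it would carry uracil (U) wherever a DNA partner would carry thymine (T). The function name is hypothetical.

```python
# RNA pairing: the same as DNA, except uracil (U) pairs with adenine (A)
# in the place where DNA would use thymine (T).
DNA_TO_RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna_strand_5_to_3):
    """Return the RNA strand that pairs with a DNA strand, read 3' to 5'."""
    return "".join(DNA_TO_RNA_PAIRS[base] for base in dna_strand_5_to_3.upper())


print("3'-" + complementary_rna("AGGTCCG") + "-5'")  # 3'-UCCAGGC-5'
```

Compare this with the DNA answer TCCAGGC: the only change is the U at the start, which is why an RNA strand cannot reproduce exactly the same base sequence.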

RNA is super important in the production of our proteins, and you’ll see later that it has a crucial role in the replication of DNA.

But first … Biolo-graphies!

Yes, plural this week. Because when you start talking about something as multitudinously awesome and elegant as DNA, you have to wonder: just who figures all this stuff out? And how big was their brain? 

Well, unsurprisingly, it actually took a lot of different brains, in a lot of different countries and nearly a hundred years of thinking to do it. The names you usually hear when someone asks who discovered DNA are James Watson and Francis Crick. But that’s BUNK. They did not discover DNA, nor did they discover that DNA contained genetic information.

DNA itself was discovered in 1869 by a Swiss biologist named Friedrich Miescher. His deal was studying white blood cells and he got those white blood cells in the most horrible way you could possibly imagine, by collecting used bandages from a nearby hospital. It’s for science he did it! He bathed the cells in warm alcohol to remove the lipids, then he set enzymes loose on them to digest the proteins. What was left, after all that, was snotty grey stuff that he knew must be some new kind of biological substance. He called it nuclein, but it was later to become known as nucleic acid. But Miescher didn’t know what its role was or what it looked like.

One of those scientists who helped figure that out was Rosalind Franklin, a young biophysicist in London nearly a hundred years later. Using a technique called X-ray diffraction, Franklin may have been the first to confirm the helical structure of DNA. She also figured out that the sugar–phosphate backbone existed on the outside of this structure. So why is Rosalind Franklin not exactly a household name?

Two reasons: 

  1. Unlike Watson and Crick, Franklin was happy to share data with her rivals. It was Franklin who informed Watson and Crick that an earlier theory of a triple helix structure was not possible, and in doing so she indicated that DNA may indeed be a double helix. Later, her images confirming the helical structure of DNA were shown to Watson without her knowledge. Her work was eventually published in Nature, but not until after two papers by Watson and Crick had already appeared in which the duo only hinted at her contribution. 
  2. Even worse than that, the Nobel Prize Committee couldn’t even consider her for the prize that they awarded in 1962 because of how dead she was. The really tragic thing is that it’s totally possible that her scientific work may have led to her early death from ovarian cancer at the age of 37. At the time, the X-ray diffraction technology that she was using to photograph DNA required dangerous amounts of radiation exposure, and Franklin rarely took precautions to protect herself. Nobel prizes cannot be awarded posthumously. Many believe she would have shared Watson and Crick’s medal if she had been alive to receive it.

Now that we know the basics of DNA’s structure, we need to understand how it copies itself, because cells are constantly dividing, and that requires a complete copy of all that DNA information. 

It turns out that our cells are extremely good at this—our cells can create the equivalent of 10,000 copies of this book in just a few hours. That, my friends, is called replication. 

Every cell in your body has a copy of the same DNA. It started from an original copy and it will copy itself trillions of times over the course of a lifetime, each time using half of the original DNA strand as a template to build a new molecule. 

So, how is a teenage boy like the enzyme helicase?

They both want to unzip your genes. 

Helicase is marvellous, unwinding the double helix at breakneck speeds, slicing open those loose hydrogen bonds between the base pairs. The point where the splitting starts is known as the replication fork. It has a top strand called the leading strand, or the good guy strand as I call it, and a bottom strand called the lagging strand, which I like to call the scumbag strand, because it is a pain in the butt to deal with.

These unwound sections can now be used as templates to create two complementary DNA strands. But remember the two strands go in opposite directions, in terms of their chemical structure, which means that making a new DNA strand for the leading strand is going to be much easier than for the lagging strand.

For the leading, good guy, strand, an enzyme called DNA polymerase just adds matching nucleotides onto the main stem all the way down the molecule. But before it can do that, it needs a section of nucleotides that fills in the section that’s just been unzipped. Starting at the very beginning of the DNA molecule, DNA polymerase needs a bit of a primer, just a little thing for it to hook on to so that it can start building the new DNA chain. And for that little primer, we can thank the enzyme RNA primase. The leading strand only needs this RNA primer once at the very beginning. Then DNA polymerase is all, ‘I got this’, and it just follows the unzipping, adding new nucleotides to the new chain continuously, all the way down the molecule.

Copying the lagging, or scumbag strand, is, well, he’s a freaking scumbag. 

This is because DNA polymerase can only build a new strand in the 5’-3’ direction, adding new nucleotides to the free 3’ end of a primer, and the lagging strand runs the other way. So maybe the real scumbag here is the DNA polymerase.

Since the lagging strand runs in the opposite direction, it has to be copied as a series of segments. Here that awesome little enzyme RNA primase does its thing again, laying down an occasional short little RNA primer that gives the DNA polymerase a starting point to then work backwards along the strand. This is done in a ton of individual segments, each 1,000 to 2,000 base pairs long and each starting with an RNA primer. They are called Okazaki fragments after the married couple of scientists who discovered this step of the process in the 1960s. And thank God they were married so we can just call them Okazaki fragments instead of Okazaki-someone’s-someone fragments.

These allow the strands to be synthesised in short bursts. Then another kind of DNA polymerase has to go back over and replace all those RNA primers and THEN all of the little fragments get joined up by a final enzyme called DNA ligase. And that is why I say the lagging strand is such a scumbag!
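The lagging-strand steps just described (prime, extend in short segments, replace the primers, join the pieces) can be sketched as a toy model. This is a deliberately simplified illustration and not how any real tool models replication. The function names, the tiny fragment length, the one-base primer and the lowercase-letters-for-RNA convention are all assumptions made for the example, and the order in which the fragments are made is not modelled. The template is written 3' to 5', so each new fragment reads 5' to 3' with its primer first.

```python
# Toy model of lagging-strand synthesis (illustrative assumptions throughout).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "u", "T": "a", "G": "c", "C": "g"}  # lowercase marks RNA primer
RNA_TO_DNA = {"u": "T", "a": "A", "c": "C", "g": "G"}


def okazaki_fragments(template, fragment_length=4, primer_length=1):
    """Copy `template` in separate chunks, each starting with an RNA primer."""
    fragments = []
    for start in range(0, len(template), fragment_length):
        chunk = template[start:start + fragment_length]
        primer = "".join(RNA_PAIRS[b] for b in chunk[:primer_length])
        dna = "".join(DNA_PAIRS[b] for b in chunk[primer_length:])
        fragments.append(primer + dna)
    return fragments


def replace_primers(fragments):
    """Swap each RNA primer base for DNA (the second polymerase's job)."""
    return ["".join(RNA_TO_DNA.get(b, b) for b in frag) for frag in fragments]


def ligate(fragments):
    """Join the fragments into one continuous strand (DNA ligase's job)."""
    return "".join(fragments)


fragments = okazaki_fragments("TACGGATC")
print(fragments)                           # ['aTGC', 'cTAG']
print(ligate(replace_primers(fragments)))  # ATGCCTAG
```

The final strand is the same one a single continuous copy would produce. The lagging strand just needs more steps and more enzymes to get there.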

DNA replication gets it wrong about one in every 10 billion nucleotides. But don’t think your body doesn’t have an app for that. It turns out that DNA polymerase can also proofread, in a sense, removing nucleotides from the end of a strand when it discovers a mismatched base, because the last thing we want is an A where there should have been a G! Considering how tightly packed DNA is into each one of our cells, it’s honestly amazing that more mistakes don’t happen. Remember, we’re talking about millions of miles worth of this stuff inside us. And this, my friends, is why scientists are not exaggerating when they call DNA the most celebrated molecule of all time.
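To put that error rate in perspective, here is a rough calculation using only two figures from the transcript: 6 billion base pairs per cell and one mistake per 10 billion nucleotides. Treating every nucleotide as an independent chance of error is a simplifying assumption added here.

```python
# Expected copying mistakes each time one cell's DNA is replicated.
GENOME_BASE_PAIRS = 6e9          # "6 billion base pairs ... in every cell"
ERRORS_PER_NUCLEOTIDE = 1 / 10e9 # "about one in every 10 billion nucleotides"


def expected_errors(base_pairs=GENOME_BASE_PAIRS, rate=ERRORS_PER_NUCLEOTIDE):
    """Average number of mistakes per full copy, assuming independent errors."""
    return base_pairs * rate


print(round(expected_errors(), 2))  # about 0.6
```

On these numbers, a full copy of the genome picks up less than one mistake on average.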

So you might as well look this episode over a couple of times and appreciate it for yourself. And in the meantime, gear up for next week, when we’re going to talk about how those six feet of kick-ass actually make you, you.

Thank you to all the people here at Crash Course who helped make this episode awesome. You can click on any of these things to go back to that section of the video.

If you have any questions, please, of course, ask them in the comments or on Facebook or Twitter. 


