AUSTRALIAN FRONTIERS OF SCIENCE, 2003
Canberra, 31 July to 1 August 2003
Zinc-binding proteins as molecular scaffolds for drug design
by Dr Joel Mackay
Joel Mackay is currently a senior lecturer in the School of Molecular and Microbial Biosciences at the University of Sydney. He finished a BSc and MSc at the University of Auckland before completing a PhD at the University of Cambridge. Joel has a strong background in biomolecular recognition and nuclear magnetic resonance spectroscopy, which form the basis for his current research, understanding the mechanisms of gene regulation at a molecular and atomic level. He has published over 50 journal articles and book chapters and has received a number of awards for his research, including the Roche Molecular Biochemicals Medal from the Australian Society for Biochemistry and Molecular Biology in 2001, and the ANZMAG Young Investigator Medal and Minister's Prize for Achievement in the Life Sciences in 2002.
We have been working for the last six or seven years in our lab on trying to understand proteins called transcription factors, which are involved in regulating gene expression. Some of the insights that we have got from that work have led us to start thinking not about proteins that are found in Nature but about trying to design new proteins: proteins with functions that we choose and that might therefore be used as scaffolds in the process of designing drugs (you will see what I mean by that soon). So we are talking about protein-based drugs here.
Just a little bit of a refresher course, since this is a general audience, to remind you a little bit about proteins: proteins are quite amazing creatures, actually. They are polymers, as I am sure everyone realises, but they are polymers with quite unique properties.
Normally, when we think about polymers we think about things like a plastic bucket, or about a tin of paint, perhaps. Polymers like this are incredibly heterogeneous. So if you take a polythene recycling bin, you have got a huge range of different lengths of polythene in that plastic bin. You have got things that might be 1000 polymer units long, something that might be 10,000 or 100,000; there is a huge range of different lengths of polymers in there. And they don't have any fixed shape. They are all just jumbled up and lying there in different orientations, different conformations, to make up that plastic bucket. In contrast, proteins are incredibly well defined.
Another thing that is unusual about them is that they are made up of 20 different sub-units, 20 different amino acids, as opposed to a substance like polythene, which has just got one sub-unit, ethylene. These 20 different sub-units allow proteins to take on different shapes and to have a variety of different functions.
One of the other things is that every single copy of a protein is identical. That is, all of the trillions and trillions of copies of haemoglobin that are running round your bloodstream at the moment, for example, look absolutely identical, not only in their covalent molecular structure but also in their three-dimensional shape. (I will show you that in just a moment.) They all take up these unique three-dimensional conformations, and it is these unique conformations and the particular amino acid sequence that are essential for determining what the function of a protein will be.
What is the function of a protein? Well, proteins basically carry out almost all of the functions in any organism. There are obviously other molecules that do particular jobs, but by far the majority of jobs that are done in a living cell are carried out by proteins. We are used to thinking of lots of proteins, things like hair or fur and things like muscle. But it is not only things like muscle and fur: signalling molecules such as insulin, growth hormone and EPO, all things you have heard about, are all proteins. Things like haemoglobin and ferritin are transport proteins; enzymes are all catalysts; and the antibodies that run your immune system, the DNA-binding proteins that are involved in regulation, and the toxins that spiders or snakes or scorpions carry are very often proteins as well. So proteins carry out this vast array of functions.
[Figure 1]
How do they do it? Figure 1 shows how your average protein is made up. You have this linear sequence of amino acids, and coming off this linear sequence you have these things called side chains. Each amino acid has its own individual side chain that has a different shape, as you can see, and therefore different functional properties. These linear chains fold up into particular shapes, things called α-helices and β-sheets. These α-helices and β-sheets then fold up into what we call tertiary structures, or domains (you can see the little helix there and a bit of β-sheet). A protein may contain many individually folded domains like this, that all have their own unique shape and can be strung out along the protein like this. They also can have quaternary structure, that is, more than one protein chain. There are four protein chains in this protein here, and they come together to give the protein its function.
[Figure 2]
Just to take a slightly closer look at protein structures, because you are going to see them later on in my talk and in Bostjan's talk as well: figure 2 is a picture of haemoglobin. Now, it is not a very useful picture of haemoglobin. You can't really get much information out of looking at that, because that is showing you all of the covalent bonds in the protein, and that is far too complicated.
[Figure 3]
Remember, as I said, haemoglobin has got four different chains. In figure 3 you can see those four different chains, coloured differently, there. But it is still very hard to see what is going on.
[Figure 4]
So normally we don't look at the whole protein with every covalent bond; we just look at the backbone of the protein. This is the backbone you can see spiralling down here in an α-helix, and another α-helix here and so on (figure 4). We are not showing any of the side chains that are coming off the main chain, the backbone here. It makes it much easier just to get a feel for the three-dimensional shape of the protein and what is going on.
[Figure 5]
Generally we actually draw them more like this (figure 5), where we show the individual α-helices more clearly. You can see they are connected by parts that are not α-helical. So there is a helix, a non-helical bit, another helix, and so on.
[Figure 6]
These are the business ends of the haemoglobin molecule (figure 6). These are iron atoms here. There are four iron atoms, two there and another two that you can't see so well. These four iron atoms each bind to a single molecule of oxygen, and this is what transports the oxygen around your body.
[Figures 7 and 8]
This is really more what a protein looks like (figure 7). This is the surface of a protein. It does not look like the pretty chain that we saw; it is more like a big amorphous blob, really, when you look at the surface. And it is this surface that in many cases has a function in contacting other molecules, maybe contacting DNA or contacting another protein (figure 8).
[Figure 9]
If we look at a protein like DNA polymerase, for example (figure 9), here there are two protein chains. (Again we have just drawn the backbone.) This protein is involved in binding to DNA, so the DNA sits right in the middle there. It is quite an amazing structure, really: these two protein chains come together to make this doughnut shape that lies around the DNA, and it can run up and down the DNA, making relatively weak, non-specific interactions with the DNA but still being able to interact with the DNA enough to stay on there.
[Figure 10]
So what are we working on? We are working on these protein domains called zinc fingers. And what is a zinc finger? Well, now that you are used to looking at protein structures, you can see that this is a very simple protein structure here (figure 10). The chain just comes up here, comes down here and then twiddles around here in a little α-helix. But the important thing about a zinc finger is that it has a zinc atom here. There are different types of zinc fingers that have, in some cases, two or three zinc atoms, and the purpose of these zinc atoms is to stabilise this three-dimensional shape here. You can see that this is a very, very small structure, and if you didn't have the zinc atom there it would not be able to maintain that stable, three-dimensional shape.
Because these shapes are very small and very stable, it has made us wonder whether these shapes might actually be useful for doing something that we choose, and that is what I want to tell you about today.
[Figure 11]
These zinc-binding domains, these zinc fingers, are very, very common. That is the other thing I wanted to mention before I go on any more. In fact, they are the second most common type of protein structural motif that is found in humans. It is the most common in flies, and as you can see (figure 11) it is relatively common in other organisms as well. Of the 30,000 or so human genes, about 1000 code for proteins that contain these zinc finger or zinc-binding domains, which means there are over 15,000 of these zinc finger domains banging round in your body at any one time. And then there are proteins like this one here, where each one of these little green blobs indicates a zinc finger. So this guy has got 37 individual zinc fingers all lined up along the protein sequence.
[Figure 12]
So they are very, very common motifs.
[Figure 13]
As well as having a variety of structures (you can see a whole bunch of different types of zinc finger structures shown in figure 12), we think they have a variety of different functions (figure 13). Some zinc-binding domains are involved in binding to DNA. Here is some DNA in green and here is a protein with three zinc fingers. Those zinc fingers contact the DNA and bind to it, and are involved in regulating the expression of certain genes. These are two other zinc fingers here, and their function is not to interact with DNA but to interact with each other. The red amino acids on this protein and the gold ones on this one contact each other and form a protein-protein complex. In fact, these are two other transcription factors, proteins that are involved in regulating gene expression by contacting each other and making these specific contacts.
Seeing this range of structures and this range of functions that these zinc-binding domains can carry out has made us wonder whether we can make zinc-binding domains of our own choosing, ones that do not contact something that Nature has designed them to contact, but contact something that we choose: a specific protein target, a specific DNA target or a specific RNA target, a target that might be medically useful, for example, or maybe just useful for probing the function of proteins found within an organism.
[Figure 14]
We can think of some sort of process shown in figure 14, where we take some kind of zinc finger template structure, which is small, stable, compact, and go through some sort of design process. We might design some sort of new surface onto this protein, as shown here in yellow, where this new surface, these new amino acids, are capable of contacting some target protein, some protein that has a particular function that we would like to interfere with or enhance, or something like that. Can we do something like that?
[Figure 15]
This is not an altogether new concept, because antibodies have been doing this for a very, very long time now. When you think of your typical antibody (figure 15), most of this antibody is basically a scaffold. It doesn't really do very much except sit there and create a scaffold on which these little bits on the end here (just these bits here and these bits here) are able to change. You have got hundreds of thousands of different antibodies running around in you. The only thing that differs between one of your antibodies and the next is this little bit here [at far left of diagram], basically, and this little bit here [at far right of diagram]. It is changing these little bits that allows different antibodies to recognise different antigens.
[Figure 16]
This is why people have started using antibodies as drugs. You can make an antibody that is targeted against some specific cell surface protein that is perhaps dysfunctional, and by making an antibody that can interact with this target you can block the function of this character shown in figure 16. That is being used in many cases as therapies against specific diseases, and these are very, very common types of drugs that are coming through clinical trials at the moment.
[Figure 17]
The problem with antibodies as drugs is that they are good against cell surface proteins, against cell surface targets, but they are not so good at going inside the cell (figure 17). If you have got a target that is inside the cell, it is hard to get an antibody in, (a) because it has these particular covalent bonds here called disulfide bonds, which very often are not stable inside cells, and (b) because they are very, very big, and it is not easy to get something as big as an antibody through a cell membrane.
[Figure 18]
In contrast, here is your average zinc finger, compared with your average antibody (figure 18). You can see that they are much, much smaller. We also know that zinc finger domains are stable inside cells, because the 15,000 or so zinc fingers that I mentioned before are all intracellular proteins. They are all things that are found inside the cell. These sorts of properties that zinc fingers have make us think that maybe this design process, of making drugs out of zinc finger proteins instead of out of antibodies, might be a go.
[Figure 19]
So what have we done? I will tell you a little bit about some stuff that we have been doing. We were interested initially in a protein called CBP (figure 19 is just a schematic representation of the protein). This is work that was done by Belinda Sharpe, a student who has just finished her PhD in my lab. What we know about this protein is that it has these two regions here, and the thing about these regions is that we think that they are zinc-binding domains and that what they are able to do is interact with all of these different protein partners. So this little part here is able to interact with eight or 10 and probably more different partner proteins. That made us wonder how it can do this: how can these zinc finger domains interact with so many different partners? And that is something that gives you a hint that that might be a useful property for designing a zinc finger as a drug. So we wanted to look at the molecular shape, the three-dimensional conformation of these particular domains from this protein.
[Figure 20]
To do that we used a technique called NMR spectroscopy. Just to give you a very brief idea of how it works: basically it is a form of spectroscopy, nuclear magnetic resonance spectroscopy. As in any other form of spectroscopy, you have your sample (your protein sample) at equilibrium and you zap it with some sort of radiation. What you generate is an excited state, and you can observe that excited state and get what we call a spectrum. This is a proton or a hydrogen NMR spectrum of a protein (figure 20).
The thing about NMR is that it has incredibly high resolution. Every one of these individual lines that you can see in this spectrum represents a single hydrogen atom in the protein. That is, each of the hundreds or perhaps thousands of hydrogen atoms in the protein gives rise to a single line. So we can get a lot of information.
[Figure 21]
What we can do, for example, is to figure out which hydrogen atom in the protein corresponds to which of these signals. So we can say that this signal is from a particular hydrogen atom, this is from another one and so on. Then what we can do is run another NMR experiment which tells us which pairs of hydrogen atoms are close to each other in space (figure 21).
[Figure 22]
You can imagine that if you know the sequence of a protein, you know its covalent structure, and you know which hydrogen atoms are close to which other hydrogen atoms, you could feed that information into a computer program and it will tell you what three-dimensional shape is consistent with all of those so-called distance constraints and is consistent with the protein sequence (figure 22).
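The idea of testing a candidate shape against distance constraints can be sketched in a few lines of Python. This is only an illustration, not the structure-calculation program used in the lab: the atom names, the coordinates and the 6 Å upper bound are all made up, and real programs search for shapes that satisfy thousands of constraints rather than checking two hand-written ones.

```python
import math

def violated_constraints(coords, constraints):
    """Return the constraints that a candidate shape fails to satisfy.

    coords maps an atom name to (x, y, z) in angstroms.
    constraints is a list of (atom_i, atom_j, upper_bound) tuples, meaning
    "these two hydrogens must be no farther apart than upper_bound".
    """
    violations = []
    for atom_i, atom_j, upper_bound in constraints:
        d = math.dist(coords[atom_i], coords[atom_j])
        if d > upper_bound:
            violations.append((atom_i, atom_j, round(d, 2)))
    return violations

# Hypothetical constraints: both pairs were seen to be close in space.
constraints = [("H1", "H2", 6.0), ("H1", "H3", 6.0)]

# Two hypothetical candidate shapes for the same three atoms.
compact = {"H1": (0.0, 0.0, 0.0), "H2": (3.0, 0.0, 0.0), "H3": (3.0, 4.0, 0.0)}
extended = {"H1": (0.0, 0.0, 0.0), "H2": (3.0, 0.0, 0.0), "H3": (3.0, 9.0, 0.0)}

print(violated_constraints(compact, constraints))
print(violated_constraints(extended, constraints))
```

The compact shape satisfies both constraints, while the extended one places H1 and H3 too far apart, so only the compact shape would be accepted as consistent with the data.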
[Figure 23]
When we did that, we found that this was the three-dimensional shape of this particular domain from the CBP protein that we were interested in (figure 23). And it was nice to see that it was a new shape, a shape that no-one had seen before. The other interesting thing about it is that it is very, very small. When we discovered this, it was one of the smallest stable proteins that had ever been seen.
[Figure 24]
This made us wonder at this point: just how stable is this thing? It is a very, very small structure. Surely it can't be that stable. So to try and probe how stable this structure was, we made a whole range of mutations. That is, we took the sequence of the protein (each letter in figure 24 represents an amino acid; there are only 25 amino acids, a very small number) and we changed 10 or 15 of those 25 amino acids to alanine, which is a fairly nondescript amino acid. So what we have done is to mutate 50 or 60 per cent of the protein sequence here. We know that the sequence determines the structure, so the question was whether something that has been so drastically mutated would still form the same structure.
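As a rough illustration of what such an alanine substitution looks like at the sequence level, here is a minimal Python sketch. The 25-letter sequence and the choice of positions are invented for the example; they are not the real CBP domain sequence or the mutants made in the lab.

```python
def mutate_to_alanine(sequence, positions):
    """Return a copy of the sequence with the given 0-based positions set to alanine (A)."""
    residues = list(sequence)
    for position in positions:
        residues[position] = "A"
    return "".join(residues)

def fraction_changed(original, mutant):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(original) != len(mutant):
        raise ValueError("sequences must have the same length")
    changed = sum(1 for a, b in zip(original, mutant) if a != b)
    return changed / len(original)

# Hypothetical 25-residue sequence in one-letter code (contains no alanine).
wild_type = "KCSHCGKTFRDPSNLQRHMKTHCGE"

# Hypothetical choice of 15 positions to replace with alanine.
positions = [0, 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18]

mutant = mutate_to_alanine(wild_type, positions)
print(mutant)
print(fraction_changed(wild_type, mutant))
```

Changing 15 of 25 residues in this way corresponds to the 60 per cent figure mentioned above.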
[Figure 25]
And the answer was yes, somewhat to our surprise. This is the structure I showed you before, and these are the structures of two of those mutants that we made, where there is at least 50 or 60 per cent of the amino acid sequence altered (figure 25). But you can see they were still able to take up pretty much exactly the same three-dimensional shape, which shows us that this shape is a very, very robust one.
[Figure 26]
If it is a very robust one, it makes us think that maybe it could be a useful scaffold for protein design, for tacking some function on to the shape, because we know it is very stable to mutations (figure 26).
[Figure 27]
So this was the idea that we used. We said, ‘All right, to test this out, let's say, “Here's a different protein, and its function is to bind to a piece of DNA.” This protein does this by using the red amino acids that are shown in figure 27. If we take those red amino acids, preserve their orientation in space and just try and glue them onto our protein, can we make a new protein that has these amino acids transplanted, grafted, onto it, that could then mimic the function of this other protein? Would it then be able to bind to DNA or to whatever target we chose?’
[Figure 28]
The target we chose was the protein shown in figure 28, a protein we had worked on in the lab already, called FOG. The thing about this protein is that it uses these six amino acids shown coloured here to contact another protein, called GATA. So it forms a protein-protein complex, FOG and GATA. So what we did was to say, ‘Can we take these six amino acids and transplant them onto our little structure here, and form a new structure, a mimic of FOG, something that could potentially interfere with the interaction between FOG and GATA by binding to GATA and competing with the natural FOG?’
[Figure 29]
So we did this. We designed and we synthesised and determined the structure of a mimic of FOG, and it looked like this (figure 29). You can see it looks the same as the original wild-type structure, which again was very pleasing to see.
[Figure 30]
And then we carried out an assay to try and see whether it was able to interact with the GATA protein (figure 30). We recorded a type of NMR spectrum of the GATA protein where each of these signals represents an individual amino acid in the protein. The thing about this experiment is that the position of these signals tells you very precisely about the physical environment of that particular amino acid. So if we run this experiment on GATA alone, and then we add some FOG or our FOG mimic into the solution, some of the signals will shift and some of them might disappear. Those will be the amino acids that are involved in mediating the interaction, because when you form a protein complex it is those amino acids at the interface that are obviously involved in forming that interaction and therefore change their environment when the complex is formed.
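The comparison of signal positions before and after adding the binding partner can be sketched as follows. This is an illustrative sketch only: it assumes a two-dimensional spectrum in which each amino acid gives one signal with a hydrogen and a nitrogen position (in ppm), and the 0.2 weighting of the nitrogen axis, the 0.1 ppm threshold, the residue numbers and all peak positions are assumptions made up for the example, not values from this work.

```python
import math

def shift_changes(free, bound, n_weight=0.2):
    """Combined shift change per residue; None if the signal disappeared.

    free and bound map a residue number to (hydrogen_ppm, nitrogen_ppm).
    n_weight scales the nitrogen axis so both axes contribute comparably.
    """
    changes = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            changes[residue] = None  # signal vanished on adding the partner
            continue
        h_bound, n_bound = bound[residue]
        changes[residue] = math.hypot(h_bound - h_free, (n_bound - n_free) * n_weight)
    return changes

def interface_candidates(changes, threshold):
    """Residues whose signal disappeared or moved by at least the threshold."""
    return sorted(r for r, c in changes.items() if c is None or c >= threshold)

# Hypothetical peak positions for the target alone and with the mimic added.
free = {10: (8.20, 120.0), 11: (7.90, 118.5), 12: (8.50, 122.0), 13: (8.10, 115.0)}
bound = {10: (8.20, 120.0), 11: (8.10, 119.5), 13: (8.11, 115.0)}

print(interface_candidates(shift_changes(free, bound), threshold=0.1))
```

In this made-up data set, residue 11 shifts and residue 12 disappears, so those two would be flagged as likely to sit at the interface, while residues 10 and 13 are essentially unchanged.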
[Figure 31]
When we ran this experiment, we ran our experiment on GATA alone (figure 31), and then we ran the experiment in the presence of the FOG mimic. You can see that there are certainly some changes.
[Figure 32]
If we look in detail at those changes, we can map them onto the structure (figure 32). This is the GATA protein here, and these are the amino acids on the GATA protein that underwent the greatest changes in the experiment I just showed you, and so this is the surface of the GATA-1 protein that our FOG mimic is binding to. So our mimic was actually able to bind to the target, which was very pleasing to see, and it bound to the specific surface here.
[Figure 33]
That was very exciting to see, except when we realised that the surface that we were trying to make it interact with was this surface down here, because this is the surface that the real FOG protein interacts with on GATA (figure 33).
So what this has shown us is that just very early on we were able to take this very small protein structure that we were looking at, which was this guy here, and we were able to manipulate its sequence quite extensively without changing its structure. Not only were we able to do that, but we were able to make a version of it that was able to interact with the target (which in this case wasn't DNA but this GATA protein), but it did not interact with it quite in the way that we had predicted.
[Figure 34]
Figure 34 provides a summary. What this tells us is that structures like these very small zinc fingers are very tolerant to mutation and are able, we think, to act as useful scaffolds for protein design, but that actually it might be a little bit harder than we might initially think. Even though this thing has only got 25 amino acids (its structure is very, very stable and it is very, very small), even with something that small it shows us the enormity of the task of manipulating proteins to create functions that we might choose or might be interested in creating.



