About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases.
To contact Derek email him directly: firstname.lastname@example.org
July 22, 2014
So, when you put some diverse small molecules into cellular assays, how many proteins are they really hitting? You may know a primary target or two that they're likely to interact with, or (if you're doing phenotypic screening), you may not have any idea at all. But how many proteins (or other targets) are there that bind small molecules at all?
This is a question that many people are interested in, but hard data to answer it are not easily obtained. There have been theoretical estimates via several techniques, but (understandably) not too much experimental evidence. Now comes this paper from Ben Cravatt's group, and it's one of the best attempts yet.
What they've done is to produce a library of compounds, via Ugi chemistry, containing both a photoaffinity handle and an alkyne (for later "click" tagging). They'd done something similar before, but the photoaffinity group in that case was a benzophenone, which is rather hefty. This time they used a diazirine, which is both small and the precursor to a very reactive carbene once it's irradiated. (My impression is that the diazirine is the first thing to try if you're doing photoaffinity work, for just those reasons). They made a small set of fairly diverse compounds (about 60), with no particular structural biases in mind, and set out to see what these things would label.
They treated PC-3 cells (human prostate-cancer derived) with each member of the library at 10 µM, then hit them with UV to do the photoaffinity reaction, labeled with a fluorescent tag via the alkyne, and fished for proteins. What they found was a pretty wide variety, all right, but not in the nonselective shotgun style. Most compounds showed distinct patterns of protein labeling, and most proteins picked out distinct SAR from the compound set. They picked out six members of the library for close study, and found that these labeled about 24 proteins (one compound only picked up one target, while the most promiscuous compound labeled nine). What's really interesting is that only about half of these were known to have any small-molecule ligands at all. There were proteins from a number of different classes, and some (9 out of 24) weren't even enzymes, but rather scaffolding and signaling proteins (which wouldn't be expected to have many small-molecule binding possibilities).
A closer look at non-labeled versions of the probe compounds versus more highly purified proteins confirmed that the compounds really are binding as expected (in some cases, a bit better than the non-photoaffinity versions, in some cases worse). So even as small a probe as a diazirine is not silent, which is just what medicinal chemists would have anticipated. (Heck, even a single methyl or fluoro isn't always silent, and a good thing, too). But overall, what this study suggests is that most small molecules are going to hit a number of proteins (1 up to a dozen?) in any given cell with pretty good affinity. It also (encouragingly) suggests that there are more small-molecule binding sites than you'd think, with proteins that have not evolved for ligand responses still showing the ability to pick things up.
There was another interesting thing that turned up: while none of the Ugi compounds was a nonselective grab-everything compound, some of the proteins were. A subset of proteins tended to pick up a wide variety of the non-clickable probe compounds, and appear to be strong, promiscuous binders. Medicinal chemists already know a few of these things - CYP metabolizing enzymes, serum albumin, and so on. This post has some other suggestions. But there are plenty more of them out there, unguessable ones that we don't know about yet (in this case, PTGR and VDAC subtypes, along with NAMPT). There's a lot to find out.
+ TrackBacks (0) | Category: Chemical Biology | Drug Assays
July 16, 2014
If you ever find yourself needing to make large cyclic peptides, you now have a new option. This paper in Organic Letters describes a particularly clean way to do it: let glutathione-S-transferase (GST) do the work for you. Bradley Pentelute's group at MIT reports that if your protein has a glutathione attached at one end, and a pentafluoroaryl Cys at the other, GST will step in and promote the nucleophilic aromatic substitution reaction to close the two ends together.
This is an application of their earlier work on the uncatalyzed reaction and on the use of GST for ligation. Remarkably, the GST method seems to produce very high yields of cyclic peptides up to at least 40 residues, and at reasonable concentration (10 mM) of the starting material, under aqueous conditions. Cyclic peptides themselves are interesting beasts, often showing unusual properties compared to the regular variety, and this method looks as if it will provide plenty more of them for study.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News
July 14, 2014
What's the best carrier to take some sort of therapeutic agent into the bloodstream? That's often a tricky question to work out in animal models or in the clinic - there are a lot of possibilities. But what about using red blood cells themselves?
That idea has been in the works for a few years now, but there's a recent paper in PNAS reporting on more progress (here's a press release). Many drug discovery scientists will have encountered the occasional compound that partitions into erythrocytes all by itself (those are usually spotted by their oddly long half-lives after in vivo dosing, mimicking the effect of plasma protein binding). One of the early ways that people have attempted to try this deliberately was forcing a compound into the cells, but this tends to damage them and make them quite a bit less useful. A potentially more controllable method would be to modify the surfaces of the RBCs themselves to serve as drug carriers, but that's quite a bit more complex, too. Antibodies have been tried for this, but with mixed success.
That's what this latest paper addresses. The authors (the Lodish and Ploegh groups at Whitehead/MIT) introduce modified surface proteins (such as glycophorin A) that are substrates for Ploegh's sortase technology (two recent overview papers), which allows for a wide variety of labeling.
Experiments using modified fetal cells in irradiated mice gave animals that had up to 50% of their RBCs modified in this way. Sortase modification of these was about 85% effective, so plenty of label can be introduced. The labeling process doesn't appear to affect the viability of the cells very much as compared to wild-type - the cells were shown to circulate for weeks, which certainly breaks the records held by the other modified-RBC methods.
The team attached either biotin tags or specific antibodies to both mouse and human RBCs, which would appear to clear the way for a variety of very interesting experiments. (They also showed that simultaneous C- and N-terminal labeling is feasible, to put on two different tags at once.) Here's the "coming attractions" section of the paper:
The approach presented here has many other possible applications; the wide variety of possible payloads, ranging from proteins and peptides to synthetic compounds and fluorescent probes, may serve as a guide. We have conjugated a single-domain antibody to the RBC surface with full retention of binding specificity, thus enabling the modified RBCs to be targeted to a specific cell type. We envision that sortase-engineered cells could be combined with established protocols of small-molecule encapsulation. In this scenario, engineered RBCs loaded with a therapeutic agent in the cytosol and modified on the surface with a cell type-specific recognition module could be used to deliver payloads to a precise tissue or location in the body. We also have demonstrated the attachment of two different functional probes to the surface of RBCs, exploiting the subtly different recognition specificities of two distinct sortases. Therefore it should be possible to attach both a therapeutic moiety and a targeting module to the RBC surface and thus direct the engineered RBCs to tumors or other diseased cells. Conjugation of an imaging probe (i.e., a radioisotope), together with such a targeting moiety also could be used for diagnostic purposes.
This will be worth keeping an eye on, for sure, both as a new delivery method for small (and not-so-small) molecules, for biologics, and for its application to all the immunological work going on now in oncology. This should keep everyone involved busy for some time to come!
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Pharmacokinetics
June 2, 2014
Last year I mentioned an interesting paper that managed to do single-cell pharmacokinetics on olaparib, a poly(ADP) ribose polymerase 1 (PARP1) inhibitor. A fluorescently-tagged version of the drug could be spotted moving into cells and even accumulating in the nucleus. The usual warnings apply: adding a fluorescent tag can disturb the various molecular properties that you're trying to study in the first place. But the paper did a good set of control experiments to try to get around that problem, and this is still the only way known (for now) to get such data.
The authors are back with a follow-up paper that provides even more detail. They're using fluorescence polarization/fluorescence anisotropy microscopy. That can be a tricky technique, but done right, it provides a lot of information. The idea (as the assay-development people in the audience well know) is that when fluorescent molecules are excited by polarized light, their emission is affected by how fast they're rotating. If the rotation is slowed down to below the fluorescence lifetime of the molecules (as happens when they're bound to a protein), then you see more polarization in the emitted light, but if the molecules are tumbling around freely, that's mostly lost. There are numerous complications - you need to standardize each new system according to how much things change in increasingly viscous solutions, the fluorophores can't get too close together, you have to be careful with the field of view in your imaging system to avoid artifacts - but that's the short form.
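For anyone who wants the arithmetic behind that description, here's a minimal sketch in Python. It uses the standard textbook definition of steady-state anisotropy and the Perrin equation, and none of it is taken from the paper. The function names, the 4 ns lifetime, the rotational correlation times, and the limiting anisotropy of 0.4 (the usual one-photon ceiling; two-photon excitation allows a higher one) are all illustrative assumptions.

```python
def anisotropy(i_parallel, i_perpendicular):
    """Steady-state anisotropy: r = (I_par - I_perp) / (I_par + 2 * I_perp)."""
    total = i_parallel + 2.0 * i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - i_perpendicular) / total


def perrin_anisotropy(r0, lifetime_ns, rot_corr_time_ns):
    """Perrin equation: r = r0 / (1 + tau / theta).

    r0               limiting anisotropy with no rotation (assumed value)
    lifetime_ns      fluorescence lifetime tau (assumed value)
    rot_corr_time_ns rotational correlation time theta (assumed value)
    """
    if rot_corr_time_ns <= 0:
        raise ValueError("rotational correlation time must be positive")
    return r0 / (1.0 + lifetime_ns / rot_corr_time_ns)


# Hypothetical numbers: a small dye tumbling freely versus the same dye
# held on a protein, which rotates far more slowly.
R0 = 0.4
TAU_NS = 4.0
free_r = perrin_anisotropy(R0, TAU_NS, rot_corr_time_ns=0.2)
bound_r = perrin_anisotropy(R0, TAU_NS, rot_corr_time_ns=20.0)

print(f"free dye anisotropy:  {free_r:.3f}")
print(f"bound dye anisotropy: {bound_r:.3f}")
```

The point of the toy numbers is just the direction of the effect: when the rotational correlation time is long compared with the fluorescence lifetime, most of the polarization survives, and when it's short, most of it is lost.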
In this case, they're using near-IR light to do the excitation, because those wavelengths are well known to penetrate living cells well. Their system also needs two photons to excite each molecule, which improves signal-to-noise. The two-photon dye is a BODIPY compound. These things have been used in fluorescence studies with wild abandon for the past few years - at one point, I was beginning to think that the acronym was a requirement to get a paper published in Chem. Comm. They have a lot of qualities (cell penetration, fluorescence lifetime, etc.) that make them excellent candidates for this kind of work.
This is the same olaparib/BODIPY hybrid used in the paper last year, and you see the results. The green fluorescence is nonspecific binding, while the red is localized to the nuclei, and doesn't wash out. If you soak the cells with unlabeled olaparib beforehand, though, you don't see this effect at all, which also argues for the PARP1-bound interpretation of these results. This paper takes things even further, though - after validating this in cultured cells, they moved on to live mice, using an implanted window chamber over a xenograft.
And they saw the same pattern: quick cellular uptake of the labeled drug on infusion into the mice, followed by rapid binding to nuclear PARP1. The intracellular fluorescence then cleared out over a half-hour period, but the nuclear-bound compound remained, and could be observed with good signal/noise. This is the first time I've seen an experiment like this. Although it's admittedly a special case (which takes advantage of a well-behaved fluorescently labeled drug conjugate, to name one big hurdle), it's a well-realized proof of concept. Anything that increases the chances of understanding what's going on with small molecules in real living systems is worth paying attention to. It's interesting to note, by the way, that the olaparib/PARP1 system was also studied in that recent whole-cell thermal shift assay technique, which does not need modified compounds. Bring on the comparisons! These two techniques can be used to validate each other, and we'll all be better off.
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Pharmacokinetics
May 30, 2014
Many drug discovery researchers now have an idea of what to expect when a fragment library is screened against a new target. And some have had the experience of screening covalent, irreversible inhibitor structures against targets (a hot topic in recent years). But can you screen with a library of irreversibly-binding fragments?
This intersection has occurred to more than one group, but this paper marks the first published example that I know of. The authors, Alexander Statsyuk and co-workers at Northwestern, took what seems like a very sound approach. They were looking for compounds that would modify the active-site residues of cysteine proteases, which are the most likely targets in the proteome. But balancing the properties of a fragment collection with those of a covalent collection is tricky. Red-hot functional groups will certainly label your proteins, but they'll label the first things they see, which isn't too useful. If you go all the way in the other direction, epoxides are probably the least reactive covalent modifier, but they're so tame that unless they fit into a binding site perfectly, they might not do anything at all - and what are the chances that a fragment-sized molecule will bind that well? How much room is there in the middle?
That's what this paper is trying to find out. The team first surveyed a range of reactive functional groups against a test thiol, N-acetylcysteine. They attached an assortment of structures to each reactive end, and they were looking for two things: absolute reactivity of each covalent modifier, and how much it mattered as their structures varied. Acrylamides dropped out as a class because their more reactive examples were just too hot - their reactivity varied up to 2000x across a short range of examples. Vinylsulfonamides varied 8-fold, but acrylates and vinylsulfones were much less sensitive to structural variation. They picked acrylates as the less reactive of the two.
A small library of 100 diverse acrylates was then prepared (whose members still only varied about twofold in reactivity), and these were screened (100 micromolar) against papain as a prototype cysteine protease. They'd picked their fragments so that everything had a distinct molecular weight, so whole-protein mass spec could be used as a readout. Screening ten mixtures of ten compounds each showed that the enzyme picked out three distinct fragments from the entire set, a very encouraging result. Pretreatment of the enzyme with a known active-site labeling inhibitor shut down any reaction with the three hits, as it should have.
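The distinct-molecular-weight trick is easy to sketch. An acrylate adds to a cysteine thiol by conjugate addition with no leaving group, so the labeled protein should come out heavier by the full mass of the fragment. The Python below is only an illustration of that bookkeeping under stated assumptions: the protein mass, the fragment names and masses, and the tolerance are hypothetical placeholders, not values from the paper, and this is not the authors' analysis code.

```python
def pool_is_resolvable(fragment_masses, min_gap=2.0):
    """True if every pair of fragments in a pool differs by at least min_gap Da."""
    masses = sorted(fragment_masses.values())
    return all(b - a >= min_gap for a, b in zip(masses, masses[1:]))


def assign_adduct(observed_mass, protein_mass, fragment_masses, tol=1.0):
    """Return the fragments whose mass matches the observed mass shift.

    A conjugate addition adds the whole fragment, so the expected shift
    is simply the fragment's own mass.
    """
    shift = observed_mass - protein_mass
    return sorted(
        name for name, mass in fragment_masses.items()
        if abs(shift - mass) <= tol
    )


# Hypothetical pool and hypothetical intact-protein mass (placeholders only).
PROTEIN_MASS = 23400.0
pool = {"frag_A": 206.1, "frag_B": 220.1, "frag_C": 251.2}

print(pool_is_resolvable(pool))
print(assign_adduct(PROTEIN_MASS + 220.1, PROTEIN_MASS, pool))
```

If two members of a pool sat within the mass tolerance of each other, the readout couldn't tell them apart, which is presumably why the fragments were chosen to have distinct weights in the first place.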
Keep in mind that this also means that 97 reasonably-sized acrylates were unable to label the very reactive Cys in the active site of papain, and that they did not label any surface residues. This suggests that the compounds that did make it in did so because of some structure-driven binding selectivity, which is just the territory that you want to be in. Adding an excess of glutathione to the labeling experiments did not shut things down, which also suggests that these are not-very-reactive acrylates whose structures are giving them an edge. Screen another enzyme, and you should pick up a different set of hits.
And that's exactly what they did next. Screening a rhinovirus cysteine protease (HRV3C) gave three totally new hits - not as powerful against that target as the other three were against papain, but real hits. Two other screens, against USP08 and UbcH7, did not yield any hits at all (except a couple of very weak ones against the former when the concentration was pushed hard). A larger reactive fragment library would seem to be the answer here; 100 compounds really isn't very much, even for fragment space, when you get down to it.
So this paper demonstrates that you can, in fact, find an overlap between fragment space and covalent inhibition, if you proceed carefully. Now here's a question that I'm not sure has ever been answered: if you find such a covalent fragment, and optimize it to be a much more potent binder, can you then pull the bait-and-switch by removing the covalent warhead, and still retain enough potency? Or is that too much to ask?
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | Drug Assays
May 28, 2014
The Science paper on chemogenomic signatures that I went on about at great length has been revised. Figure 2, which drove me and every other chemist who saw it up the wall, has been completely reworked:
To improve clarity, the authors revised Fig. 2 by (i) illustrating the substitution sites of fragments; (ii) labeling fragments numerically for reference to supplementary materials containing details about their derivation; and (iii) representing the dominant tautomers of signature compounds. The authors also discovered an error in their fragment generation software that, when corrected, resulted in slightly fewer enriched fragments being identified. In the revised Fig. 2, they removed redundant substructures and, where applicable, illustrated larger substructures containing the enriched fragment common among signature compounds.
Looking it over in the revised version, it is indeed much improved. The chemical structures now look like chemical structures, and some of the more offensive "pharmacophores" (like tetrahydrofuran) have now disappeared. Several figures and tables have been added to the supplementary material to highlight where these fragments are in the active compounds (Figure S25, an especially large addition), and to cross-index things more thoroughly.
So the most teeth-gritting parts of the paper have been reworked, and that's a good thing. I definitely appreciate the work that the authors have put into making the work more accurate and interpretable, although these things really should have been caught earlier in the process.
Looking over the new Figure S25, though, you can still see what I think are the underlying problems with the entire study. That's the one where "Fragments that are significantly enriched in specific sets of signature compounds (FDR ≤ 0.1 and signature compounds fraction ≥ 0.2) are highlighted in blue within the relevant signature compounds. . .". It's a good idea to put something like that in there, but the annotations are a bit odd. For example, the compounds flagged as "6_cell wall" have their common pyridines highlighted, even though there's a common heterocyclic core that all but one of those pyridines are attached to (it only varies by alkyl substituents). That single outlier compound seems to be the reason that the whole heterocycle isn't colored in - but there are plenty of other monosubstituted pyridines on the list that have completely different signatures, so it's not like "monosubstituted pyridine" carries much weight. Meanwhile, the next set ("7_cell wall") has more of the exact same series of heterocycles, but in this case, it's just the core heterocycle that's shaded in. That seems to be because one of them is a 2-substituted isomer, while the others are all 3-substituted, so the software just ignores them in favor of coloring in the central ring.
The same thing happens with "8_ubiquinone biosynthesis and proteosome". What gets shaded in is an adamantane ring, even though every single one of the compounds is also a Schiff base imine (which is a lot more likely to be doing something than the adamantane). But that functional group gets no recognition from the software, because some of the aryl substitution patterns are different. One could just as easily have colored in the imine, though, which is what happens with the next category ("9_ubiquinone biosynthesis and proteosome"), where many of the same compounds show up again.
I won't go into more detail; the whole thing is like this. Just one more example: "12_iron homeostasis" features more monosubstituted pyridines being highlighted as the active fragment. But look at the list: there are 3-aminopyridine pieces, 4-aminomethylpyridines, 3-carboxylpyridines, all of them substituted with all kinds of stuff. The only common thread, according to the annotation software, is "pyridine", but those are, believe me, all sorts of different pyridines. (And as the above example shows, it's not like pyridines form some sort of unique category in this data set, anyway).
So although the most eye-rolling features of this work have been cleaned up, the underlying medicinal chemistry is still pretty bizarre, at least to anyone who knows any medicinal chemistry. I hate to be this way, but I still don't see anyone getting an awful lot of use out of this.
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Chemical News | The Scientific Literature
April 14, 2014
This will be a long one. I'm going to take another look at the Science paper that stirred up so much comment here on Friday. In that post, my first objection (but certainly not my only one) was the chemical structures shown in the paper's Figure 2. A number of them are basically impossible, and I just could not imagine how this got through any sort of refereeing process. There is, for example, a cyclohexadien-one structure, shown at left, and that one just doesn't exist as such - it's phenol, and those equilibrium arrows, though very imbalanced, are still not drawn to scale.
Well, that problem is solved by those structures being intended as fragments, substructures of other molecules. But I'm still positive that no organic chemist was involved in putting that figure together, or in reviewing it, because the reason that I was confused (and many other chemists were as well) is that no one who knows organic chemistry draws substructures like this. What you want to do is put dashed bonds in there, or R groups, as shown. That does two things: it shows that you're talking about a whole class of compounds, not just the structure shown, and it also shows where things are substituted. Now, on that cyclohexadienone, there's not much doubt where it's substituted, once you realize that someone actually intended it to be a fragment. It can't exist unless that carbon is tied up, either with two R groups (as shown), or with an exo-alkene, in which case you have a class of compounds called quinone methides. We'll return to those in a bit, but first, another word about substructures and R groups.
Figure 2 also has many structures in it where the fragment structure, as drawn, is a perfectly reasonable molecule (unlike the example above). Tetrahydrofuran and imidazole appear, and there's certainly nothing wrong with either of those. But if you're going to refer to those as common fragments, leading to common effects, you have to specify where they're substituted, because that can make a world of difference. If you still want to say that they can be substituted at different points, then you can draw a THF, for example, with a "floating" R group as shown at left. That's OK, and anyone who knows organic chemistry will understand what you mean by it. If you just draw THF, though, then an organic chemist will understand that to mean just plain old THF, and thus the misunderstanding.
If the problems with this paper ended at the level of structure drawing, which many people will no doubt see as just a minor aesthetic point, then I'd be apologizing right now. (Update: although it is irritating. On Twitter, I just saw that someone spotted "dihydrophyranone" on this figure, which someone figured was close enough to "dihydropyranone", I guess, and anyway, it's just chemistry.) But they don't. It struck me when I first saw this work that sloppiness in organic chemistry might be symptomatic of deeper trouble, and I think that's the case. The problems just keep on coming. Let's start with those THF and imidazole rings. They're in Figure 2 because they're supposed to be substructures that lead to some consistent pathway activity in the paper's huge (and impressive) yeast screening effort. But what we're talking about is a pharmacophore, to use a term from medicinal chemistry, and just "imidazole" by itself is too small a structure, from a library of 3200 compounds, to be a likely pharmacophore. Particularly when you're not even specifying where it's substituted and how. There are all kinds of imidazoles out there, and they do all kinds of things.
So just how many imidazoles are in the library, and how many caused this particular signature? I think I've found them all. Shown at left are the four imidazoles (and there are only four) that exhibit the activity shown in Figure 2 (ergosterol depletion / effects on membrane). Note that all four of them are known antifungals - which makes sense, given that the compounds were chosen for their ability to inhibit the growth of yeast, and topical antifungals will indeed do that for you. And that phenotype is exactly what you'd expect from miconazole, et al., because that's their known mechanism of action: they mess up the synthesis of ergosterol, which is an essential part of the fungal cell membrane. It would be quite worrisome if these compounds didn't show up under that heading. (Note that miconazole is on the list twice).
But note that there are nine other imidazoles that don't have that same response signature at all - and I didn't even count the benzimidazoles, and there are many, although from that structure in Figure 2, who's to say that they shouldn't be included? What I'm saying here is that imidazole by itself is not enough. A majority of the imidazoles in this screen actually don't get binned this way. You shouldn't look at a compound's structure, see that it has an imidazole, and then decide by looking at Figure 2 that it's therefore probably going to deplete ergosterol and lead to membrane effects. (Keep in mind that those membrane effects probably aren't going to show up in mammalian cells, anyway, since we don't use ergosterol that way).
There are other imidazole-containing antifungals on the list that are not marked down for "ergosterol depletion / effects on membrane". Ketoconazole is SGTC_217 and 1066, and one of those runs gets this designation, while the other one gets signature 118. Both bifonazole and sertaconazole also inhibit the production of ergosterol - although, to be fair, bifonazole does it by a different mechanism. It gets annotated as Response Signature 19, one of the minor ones, while sertaconazole gets marked down for "plasma membrane distress". That's OK, though, because it's known to have a direct effect on fungal membranes separate from its ergosterol-depleting one, so it's believable that it ends up in a different category. But there are plenty of other antifungals on this list, some containing imidazoles and some containing triazoles, whose mechanism of action is also known to be ergosterol depletion. Fluconazole, for example, is SGTC_227, 1787 and 1788, and that's how it works. But its signature is listed as "Iron homeostasis" once and "azole and statin" twice. Itraconazole is SGTC_1076, and it's also annotated as Response Signature 19. Voriconazole is SGTC_1084, and it's down as "azole and statin". Climbazole is SGTC_2777, and it's marked as "iron homeostasis" as well. This scattering of known drugs between different categories is possibly an indicator of this screen's ability to differentiate them, or possibly an indicator of its inherent limitations.
Now we get to another big problem, the imidazolium at the bottom of Figure 2. It is, as I said on Friday, completely nuts to assign a protonated imidazole to a different category than a nonprotonated one. Note that several of the imidazole-containing compounds mentioned above are already protonated salts - they, in fact, fit the imidazolium structure drawn, rather than the imidazole one that they're assigned to. This mistake alone makes Figure 2 very problematic indeed. If the paper was, in fact, talking about protonated imidazoles (which, again, is what the authors have drawn) it would be enough to immediately call into question the whole thing, because a protonated imidazole is the same as a regular imidazole when you put it into a buffered system. In fact, if you go through the list, you find that what they're actually talking about are N-alkylimidazoliums, so the structure at the bottom of Figure 2 is wrong, and misleading. There are two compounds on the list with this signature, in case you were wondering, but the annotation may well be accurate, because some long-chain alkylimidazolium compounds (such as ionic liquid components) are already known to cause mitochondrial depolarization.
But there are several other alkylimidazolium compounds in the set (which is a bit odd, since they're not exactly drug-like). And they're not assigned to the mitochondrial distress phenotype, as Figure 2 would have you think. SGTC_1247, 179, 193, 1991, 327, and 547 all have this moiety, and they scatter between several other categories. Once again, a majority of compounds with the Figure 2 substructure don't actually map to the phenotype shown (while plenty of other structural types do). What use, exactly, is Figure 2 supposed to be?
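To put rough numbers on "a majority", here's a back-of-the-envelope tally in Python. It takes the counts exactly as given in this post (four imidazoles with the ergosterol signature against nine without, and two alkylimidazoliums with the mitochondrial signature against the six listed above); the function name is my own, and the counts are from my reading of the compound list rather than from any table in the paper.

```python
def fraction_with_signature(n_with, n_without):
    """Fraction of substructure-containing compounds that carry the signature."""
    total = n_with + n_without
    if total == 0:
        raise ValueError("no compounds to count")
    return n_with / total


# Counts as stated in the text of this post.
imidazole_fraction = fraction_with_signature(n_with=4, n_without=9)
imidazolium_fraction = fraction_with_signature(n_with=2, n_without=6)

print(f"imidazoles with ergosterol signature:       {imidazole_fraction:.0%}")
print(f"alkylimidazoliums with mitochondrial signature: {imidazolium_fraction:.0%}")
```

Either way you slice it, well under half of the compounds carrying each Figure 2 substructure land in the phenotype that the figure assigns to it.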
Let's turn to some other structures in it. The impossible/implausible ones, as mentioned above, turn out to be that way because they're supposed to have substituents on them. But look around - adamantane is on there. To put it as kindly as possible, adamantane itself is not much of a pharmacophore, having nothing going for it but an odd size and shape for grease. Tetrahydrofuran (THF) is on there, too, and similar objections apply. When attempts have been made to rank the sorts of functional groups that are likely to interact with protein binding sites, ethers always come out poorly. THF by itself is not some sort of key structural unit; highlighting it as one here is, for a medicinal chemist, distinctly weird.
What's also weird is when I search for THF-containing compounds that show this activity signature, I can't find much. The only things with a THF ring in them seem to be SGTC_2563 (the complex natural product tomatine) and SGTC_3239, and neither one of them is marked with the signature shown. There are some imbedded THF rings as in the other structural fragments shown (the succinimide-derived Diels-Alder ones), but no other THFs - and as mentioned, it's truly unlikely that the ether is the key thing about these compounds, anyway. If anyone finds another THF compound annotated for tubulin folding, I'll correct this post immediately, but for now, I can't seem to track one down, even though Table S4 says that there are 65 of them. Again, what exactly is Figure 2 supposed to be telling anyone?
Now we come to some even larger concerns. The supplementary material for the paper says that 95% of the compounds on the list are "drug-like" and were filtered by the commercial suppliers to eliminate reactive compounds. They do caution that different people have different cutoffs for this sort of thing, and boy, do they ever. There are many, many compounds in this collection that I would not have bothered putting into a cell assay, for fear of hitting too many things and generating uninterpretable data. Quinone methides are a good example - as mentioned before, they're in this set. Rhodanines and similar scaffolds are well represented, and are well known to hit all over the place. Some of these things are tested at hundreds of micromolar.
I recognize that one aim of a study like this is to stress the cells by any means necessary and see what happens, but even with that in mind, I think fewer nasty compounds could have been used, and might have given cleaner data. The curves seen in the supplementary data are often, well, ugly. See the comments section from the Friday post on that, but I would be wary of interpreting many of them myself.
There's another problem with these compounds, which might very well have also led to the nastiness of the assay curves. As mentioned on Friday, how can anyone expect many of these compounds to actually be soluble at the levels shown? I've shown a selection of them here; I could go on. I just don't see any way that these compounds can be realistically assayed at these levels. Visual inspection of the wells would surely show cloudy gunk all over the place. Again, how are such assays to be interpreted?
And one final point, although it's a big one. Compound purity. Anyone who's ever ordered three thousand compounds from commercial and public collections will know, will be absolutely certain that they will not all be what they say on the label. There will be many colors and consistencies, and LC/MS checks will show many peaks for some of these. There's no way around it; that's how it is when you buy compounds. I can find no evidence in the paper or its supplementary files that any compound purity assays were undertaken at any point. This is not just bad procedure; this is something that would have caused me to reject the paper all by itself had I refereed it. This is yet another sign that no one who's used to dealing with medicinal chemistry worked on this project. No one with any experience would just bung in three thousand compounds like this and report the results as if they're all real. The hits in an assay like this, by the way, are likely to be enriched in crap, making this more of an issue than ever.
Damn it, I hate to be so hard on so many people who did so much work. But wasn't there a chemist anywhere in the room at any point?
Category: Biological News | Chemical Biology | Chemical News | The Scientific Literature
April 11, 2014
Note: critique of this paper continues here, in another post.
A reader sent along a puzzled note about this paper that's out in Science. It's from a large multicenter team (at least nine departments across the US, Canada, and Europe), and it's an ambitious effort to profile 3250 small molecules in a broad chemogenomics screen in yeast. This set was selected from an earlier 50,000 compounds, since these reliably inhibited the growth of wild-type yeast. They're looking for what they call "chemogenomic fitness signatures", which are derived from screening first against 1100 heterozygous yeast strains, one gene deletion per, representing the yeast essential genome. Then there's a second round of screening against 4800 homozygous deletion strains of non-essential genes, to look for related pathways, compensation, and so on.
All in all, they identified 317 compounds that appear to perturb 121 genes, and many of these annotations are new. Overall, the responses tended to cluster in related groups, and the paper goes into detail about these signatures (and about the outliers, which are naturally interesting for their own reasons). Broad pathway effects like mitochondrial stress show up pretty clearly, for example. And unfortunately, that's all I'm going to say for now about the biology, because we need to talk about the chemistry in this paper. It isn't good.
As my correspondent (a chemist himself) mentions, a close look at Figure 2 of the paper raises some real questions. Take a look at that cyclohexadiene enamine - can that really be drawn correctly, or isn't it just N-phenylbenzylamine? The problem is, that compound (drawn correctly) shows up elsewhere in Figure 2, hitting a completely different pathway. These two tautomers are not going to have different biological effects, partly because the first one would exist for about two molecular vibrations before it converted to the second. But how could both of them appear on the same figure?
And look at what they're calling "cyclohexa-2,4-dien-1-one". No such compound exists as such in the real world - we call it phenol, and we draw it as an aromatic ring with an OH coming from it. Thiazolidinedione is listed as "thiazolidine-2,4-quinone". Both of these would lead to red "X" marks on an undergraduate exam paper. It is clear that no chemist, not even someone who's been through second-year organic class, was involved in this work (or at the very least, involved in the preparation of Figure 2). Why not? Who reviewed this, anyway?
There are some unusual features from a med-chem standpoint as well. Is THF really targeting tubulin folding? Does adamantane really target ubiquinone biosynthesis? Fine, these are the cellular effects that they noted, I guess. But the weirdest thing on Figure 2's annotations is that imidazole is shown as having one profile, while protonated imidazole is shown as a completely different one. How is this possible? How could anyone who knows any chemistry look at that and not raise an eyebrow? Isn't this assay run in some sort of buffered medium? Don't yeast cells have any buffering capacity of their own? Salts of basic amine drugs are dosed all the time, and they are not considered - ever - as having totally different cellular effects. What a world it would be if that were true! Seeing this sort of thing makes a person wonder about the rest of the paper.
More subtle problems emerge when you go to the supplementary material and take a look at the list of compounds. It's a pretty mixed bag. The concentrations used for the assays vary widely - rapamycin gets run at 1 micromolar, while ketoconazole is nearly 1 millimolar. (Can you even run that compound at that concentration? Or that compound at left at 967 micromolar? Is it really soluble in the yeast wells at such levels? There are plenty more that you can wonder about in the same way.)
And I went searching for my old friends, the rhodanines, and there they were. Unfortunately, compound SGTC_2454 is 5-benzylidenerhodanine, whose activity is listed as "A dopamine receptor inhibitor" (!). But compound SGTC_1883 is also 5-benzylidenerhodanine, the same compound, run at similar concentration, but this time unannotated. The 5-thienylidenerhodanine is SGTC_30, but that one's listed as a phosphatase inhibitor. Neither of these attributions seems likely to me. There are other duplicates, but many of them are no doubt intentional (run by different parts of the team).
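For a rough sense of scale, here is a minimal sketch that converts an assay concentration into mass per volume. The function name is made up for illustration, and the molecular weight used for ketoconazole (531.43 g/mol) is the standard reference value, not something taken from the paper's tables.

```python
def micromolar_to_mg_per_ml(conc_um: float, mol_weight: float) -> float:
    """Convert a concentration in micromolar to mg/mL.

    conc_um    -- concentration in micromol per liter
    mol_weight -- molecular weight in g/mol
    """
    mol_per_l = conc_um * 1e-6      # micromolar -> mol/L
    g_per_l = mol_per_l * mol_weight
    return g_per_l                  # g/L is numerically equal to mg/mL


# Assumed molecular weight for ketoconazole (C26H28Cl2N4O4).
KETOCONAZOLE_MW = 531.43

if __name__ == "__main__":
    mass = micromolar_to_mg_per_ml(1000, KETOCONAZOLE_MW)
    print(f"1 mM ketoconazole is about {mass:.2f} mg/mL")
```

That works out to roughly half a milligram per milliliter of aqueous medium, which is the figure to hold up against whatever solubility you believe for the compound in question.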
I hate to say this, but just a morning's look at this paper leaves me with little doubt that there are still more strange things buried in the chemistry side of this paper. But since I work for a living (dang it), I'm going to leave it right here, because what I've already noted is more than troubling enough. These mistakes are serious, and call the conclusions of the paper into question: if you can annotate imidazole and its protonated form into two different categories, or annotate two different tautomers (one of which doesn't really exist) into two different categories, what else is wrong, and how much are these annotations worth? And this isn't even the first time that Science has let something like this through. Back in 2010, they published a paper on the "Reactome" that had chemists around the world groaning. How many times does this lesson need to be learned, anyway?
Update: this situation brings up a number of larger issues, such as the divide between chemists and biologists (especially in academia?) and the place of organic chemistry in such high-profile publications (and the place of organic chemists as reviewers of it). I'll defer these to another post, but believe me, they're on my mind.
Update 2: Jake Yeston, deputy editor at Science, tells me that they're looking into this situation. More as I hear it.
Update 3: OK, if Figure 2 is just fragments, structural pieces that were common to compounds that had these signatures, then (1) these are still not acceptable structures, even as fragments, and (2), many of these don't make sense from a medicinal chemistry standpoint. It's bizarre to claim a tetrahydrofuran ring (for example) as the key driver for a class of compounds; the chance that this group is making an actual, persistent interaction with some protein site (or family of sites) is remote indeed. The imidazole/protonated imidazole pair is a good example of this: why on Earth would you pick these two groups to illustrate some chemical tendency? Again, this looks like the work of people who don't really have much chemical knowledge.
A closer look at the compounds themselves does not inspire any more confidence. There's one of them from Table S3, which showed a very large difference in IC50 across different yeast strains. It was tested at 400 micromolar. That, folks, was sold to the authors of this paper by ChemDiv, as part of a "drug-like compound" library. Try pulling some SMILES strings from that table yourself and see what you think about their drug likeness.
Category: Chemical Biology | Chemical News | The Scientific Literature
April 10, 2014
So here's the GSK paper on applying the DNA-encoded library technology to a protein-protein target. I'm particularly interested in seeing the more exotic techniques applied to hard targets like these, because it looks like there are plenty of them where we're going to need all the help we can get. In this case, they're going after integrin LFA-1. That's a key signaling molecule in leukocyte migration during inflammation, and there was an antibody (Raptiva, efalizumab) on the market, until it was withdrawn for too many side effects. (It dialed down the immune system rather too well). But can you replace an antibody with a small molecule?
A lot of people have tried. This is a pretty well-precedented protein-protein interaction for drug discovery, although (as this paper mentions), most of the screens have been direct PPI ones, and most of the compounds found have been allosteric - they fit into another spot on LFA-1 and disrupt the equilibrium between a low-affinity form and the high-affinity one. In this case, though, the GSK folks used their encoded libraries to screen directly against the LFA-1 protein. As usual, the theoretical number of compounds in the collection was bizarre, about 4 billion compounds (it's the substituted triazine library that they've described before).
An indanyl amino acid in one position on the triazine seemed to be a key SAR point in the resulting screen, and there were at least four other substituents at the next triazine point that kept up its activity. Synthesizing these off the DNA tags gave double-digit nanomolar affinities (if they hadn't, we wouldn't be hearing about this work, I'm pretty sure). Developing the SAR from these seems to have gone in classic med-chem fashion, although a lot of classic med-chem programs would very much like to be able to start off with some 50 nM compounds. The compounds were also potent in cell adhesion assays, with an interesting twist - the team also used a mutated form of LFA-1 where a disulfide holds it fixed in the high-affinity state. The known small-molecule allosteric inhibitors work against wild-type in this cell assay, but wipe out against the locked mutant, as they should. These triazines showed the same behavior; they also target the allosteric site.
That probably shouldn't have come as a surprise. Most protein-protein interactions have limited opportunities for small molecules to affect them, and if there's a known friendly spot like the allosteric site here, you'd have to expect that most of your hits are going to be landing on it. You wonder what might happen if you ran the ELT screen against the high-affinity-locked mutant protein - if it's good enough to work in cells, it should be good enough to serve in a screen for non-allosteric compounds. The answer (most likely) is that you sure wouldn't find any 50 nM leads - I wonder what you'd find at all? Running four billion compounds across a protein surface and finding no real hits would be a sobering experience.
The paper finishes up by showing the synthesis of some fluorescently tagged derivatives, and showing that these also work in the cell assay. The last sentence is: "The latter phenomena provided an opportunity for ELT selections against a desired target in its natural state on cell surface. We are currently exploring this technology development opportunity." I wonder if they are? For the same reasons given above, you'd expect to find mostly allosteric binders, and those already seem to be findable. And it's my impression that this is the early-stage ELT stuff (the triazine library), plus, when you look at the list of authors, there are several "Present address" footnotes. So this work was presumably done a while back and is just now coming into the light.
So the question of using this technique against PPI targets remains open, as far as I can tell. This one had already been shown to yield small-molecule hits, and it did so again, in the same binding pocket. What happens when you set out into the unknown? Presumably, GlaxoSmithKline (and the other groups pursuing encoded libraries) know a lot more about this than the rest of us do. Surely some screens like this have been run. Either they came up empty - in which case we'll never hear about them - or they actually yielded something interesting, in which case we'll hear about them over the next few years. If you want to know the answer before then, you're going to have to run some yourself. Isn't that always the way?
Category: Chemical Biology | Drug Assays
April 2, 2014
Many readers will be familiar, at least in principle, with the "thermal shift assay". It goes by other names as well, but the principle is the same. The idea is that when a ligand binds to a protein, it stabilizes its structure to some degree. This gets measured by watching its behavior as samples of bound and unbound proteins are heated up, and the most common way to detect those changes in protein structure (and stability) is by using a fluorescent dye. Thus another common name for the assay, DSF, for Differential Scanning Fluorimetry. The dye has a better chance to bind to the newly denatured protein once the heat gets to that point, and that binding event can be detected by increasing fluorescence. The assay is popular, since it doesn't require much in specialized equipment and is pretty straightforward to set up, compared to something like SPR. Here's a nice slide presentation that's up on the web from UC Santa Cruz, and here's one of many articles on using the technique for screening.
I bring this up because of this paper last summer in Science, detailing what the authors (a mixed team from Sweden and Singapore) called CETSA, the cellular thermal shift assay. They're trying to do something that is very worthwhile indeed: measuring ligand binding inside living cells. Someone who's never done drug discovery might imagine that that's the sort of thing that we do all the time, but in reality, it's very tricky. You can measure ligand binding to an isolated protein in vitro any number of ways (although they may or may not give you the same answer!), and you can measure downstream effects that you can be more (or less) confident are the result of your compound binding to a cellular target. But direct binding measurements in a living cell are pretty uncommon.
I wish they weren't. Your protein of interest is going to be a different beast when it's on the job in its native environment, compared to sitting around in a well in some buffer solution. There are other proteins for it to interact with, a whole local environment that we don't know enough to replicate. There are modifications to its structure (phosphorylation and others) that you may or may not be aware of, which can change things around. And all of these have a temporal dimension, changing under different cellular states and stresses in ways that are usually flat-out impossible to replicate ex vivo.
Here's what this new paper proposes:
We have developed a process in which multiple aliquots of cell lysate were heated to different temperatures. After cooling, the samples were centrifuged to separate soluble fractions from precipitated proteins. We then quantified the presence of the target protein in the soluble fraction by Western blotting . . .
Surprisingly, when we evaluated the thermal melt curve of four different clinical drug targets in lysates from cultured mammalian cells, all target proteins showed distinct melting curves. When drugs known to bind to these proteins were added to the cell lysates, obvious shifts in the melting curves were detected. . .
That makes it sound like the experiments were all done after the cells were lysed, which wouldn't be that much of a difference from the existing thermal shift assays. But reading on, they then did this experiment with methotrexate and its enzyme target, dihydrofolate reductase (DHFR), along with raltitrexed and its target, thymidylate synthase:
DHFR and TS were used to determine whether CETSA could be used in intact cells as well as in lysates. Cells were exposed to either methotrexate or raltitrexed, washed, heated to different temperatures, cooled, and lysed. The cell lysates were cleared by centrifugation, and the levels of soluble target protein were measured, revealing large thermal shifts for DHFR and TS in treated cells as compared to controls. . .
So the thermal shift part of the experiment is being done inside the cells themselves, and the readout is the amount of non-denatured protein left after lysis and gel purification. That's ingenious, but it's also the sort of idea that (if it did occur to you) you might dismiss as "probably not going to work" and/or "has surely already been tried and didn't work". It's to this team's credit that they ran with it. This proves once again the soundness of Francis Crick's advice (in his memoir What Mad Pursuit and other places) to not pay too much attention to your own reasoning about how your ideas must be flawed. Run the experiment and see.
A number of interesting controls were run. Cell membranes seem to be intact during the heating process, to take care of one big worry. The effect of raltitrexed added to lysate was much greater than when it was added to intact cells, suggesting transport and cell penetration effects. A time course experiment showed that it took two to three hours to saturate the system with the drug. Running the same experiment on starved cells gave a lower effect, and all of these point towards the technique doing what it's supposed to be doing - measuring the effect of drug action in living cells under real-world conditions.
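To make the readout concrete, here is a minimal sketch of how a thermal shift could be pulled out of that kind of data. It is not the authors' analysis: it simply takes the apparent melting temperature as the point where the soluble fraction crosses 50%, found by linear interpolation between neighboring temperatures. The function name and every number in the example curves are invented for illustration.

```python
def melting_temperature(temps, soluble_fraction, threshold=0.5):
    """Estimate the apparent Tm as the temperature where the soluble
    fraction first falls below `threshold`, using linear interpolation
    between the two bracketing data points."""
    points = list(zip(temps, soluble_fraction))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= threshold > f_hi:
            # Fraction of the interval needed to drop to the threshold.
            return t_lo + (f_lo - threshold) * (t_hi - t_lo) / (f_lo - f_hi)
    raise ValueError("curve never crosses the threshold")


# Hypothetical band intensities, normalized to the lowest temperature.
temps = [40, 44, 48, 52, 56, 60, 64]                     # degrees C
vehicle = [1.00, 0.95, 0.70, 0.30, 0.10, 0.02, 0.00]     # no drug
treated = [1.00, 0.98, 0.92, 0.75, 0.40, 0.10, 0.02]     # drug added

if __name__ == "__main__":
    tm_vehicle = melting_temperature(temps, vehicle)
    tm_treated = melting_temperature(temps, treated)
    print(f"Tm vehicle: {tm_vehicle:.1f} C")
    print(f"Tm treated: {tm_treated:.1f} C")
    print(f"Thermal shift: {tm_treated - tm_vehicle:.1f} C")
```

With these made-up curves the treated sample melts several degrees higher than the vehicle control, which is the sort of shift the assay is looking for. A real analysis would more likely fit a sigmoid to the whole curve, but the interpolation keeps the idea visible.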
There's even an extension to whole animals, albeit with a covalent compound, the MetAP2 inhibitor TNP-470. It's a fumagillin derivative, so it's a diepoxide to start off, with an extra chloroacetamide for good measure. (You don't need that last reactive group, by the way, as Zafgen's MetAP2 compound demonstrates). The covalency gives you every chance to see the effect if it's going to be seen. Dosing mice with the compound, followed by organ harvesting, cell lysis, and heating after the lysis step showed that it was indeed detectable by thermal shift after isolation of the enzyme, in a dose-responsive manner, and that there was more of it in the kidneys than the liver.
Back in the regular assay, they show several examples of this working on other enzymes, but a particularly good one is PARP. Readers may recall the example of iniparib, which was taken into the clinic as a PARP-1 inhibitor, failed miserably, and was later shown not to really be hitting the target at all in actual cells and animals, as opposed to in vitro assays. CETSA experiments on it versus olaparib, which really does work via PARP-1, confirm this dramatically, and suggest that this assay could have told everyone a long time ago that there was something funny about iniparib in cells. (I should note that PARP has also been a testbed for other interesting cell assay techniques).
This leads to a few thoughts on larger questions. Sanofi went ahead with iniparib because it worked in their assays - turns out it just wasn't working through PARP inhibition, but probably by messing around with various cysteines. They were doing a phenotypic program without knowing it. This CETSA technique is, of course, completely target-directed, unless you feel like doing thermal shift measurements on a few hundred (or few thousand) proteins. But that makes me wonder if that's something that could be done. Is there some way to, say, impregnate the gel with the fluorescent shift dye and measure changes band by band? Probably not (the gel would melt, for one thing), but I (or someone) should listen to Francis Crick and try some variation on this.
I do have one worry. In my experience, thermal shift assays have not been all that useful. But I'm probably looking at a sampling bias, because (1) this technique is often used for screening fragments, where the potencies are not very impressive, and (2) it's often broken out to be used on tricky targets that no one can figure out how to assay any other way. Neither of those is conducive to seeing strong effects; if I'd been doing it on CDK4 or something, I might have a better opinion.
With that in mind, though, I find the whole CETSA idea very interesting, and well worth following up on. Time to look for a chance to try it out!
Category: Chemical Biology | Drug Assays
March 20, 2014
Over at LifeSciVC, guest blogger Jonathan Montagu talks about small molecules in drug discovery, and how we might move beyond them. Many of the themes he hits have come up around here, understandably - figuring out why (and how) some huge molecules manage to have good PK properties, exploiting "natural-product-like" chemical space (again, if we can figure out a good way to do that), working with unusual mechanisms (allosteric sites, covalent inhibitors and probes), and so on. Well worth a read, even if he's more sanguine about structure-based drug discovery than I am. Most people are, come to think of it.
His take is very similar to what I've been telling people in my "state of drug discovery" presentations (at Illinois, most recently) - that we medicinal chemists need to stretch our definitions and move into biomolecule/small molecule hybrids and the like. These things need the techniques of organic chemistry, and we should be the people supplying them. Montagu goes even further than I do, saying that ". . .I believe that small molecule chemistry, as traditionally defined and practiced, has limited utility in today’s world." That may or may not be correct at the moment, but I'm willing to bet that it's going to become more and more correct in the future. We should plan accordingly.
Category: Chemical Biology | Chemical News | Drug Development | Drug Industry History
This time last year I mentioned a particularly disturbing-looking compound, sold commercially as a so-called "selective inhibitor" of two deubiquitinase enzymes. Now, I have a fairly open mind about chemical structures, but that thing is horrible, and if it's really selective for just those two proteins, then I'm off to truck-driving school just like Mom always wanted.
Here's an enlightening look through the literature at this whole class of compound, which has appeared again and again. The trail seems to go back to this 2001 paper in Biochemistry. By 2003, you see similar motifs showing up as putative anticancer agents in cell assays, and in 2006 the scaffold above makes its appearance in all its terrible glory.
The problem is, as Jonathan Baell points out in that HTSpains.com post, that this series has apparently never really had a proper look at its SAR, or at its selectivity. It wanders through a series of publications full of on-again off-again cellular readouts, with a few tenuous conclusions drawn about its structure - and those are discarded or forgotten by the time the next paper comes around. As Baell puts it:
The dispiriting thing is that with or without critical analysis, this compound is almost certainly likely to end up with vendors as a “useful tool”, as they all do. Further, there will be dozens if not hundreds of papers out there where entirely analogous critical analyses of paper trails are possible.
The bottom line: people still don’t realize how easy it is to get a biological readout. The more subversive a compound, the more likely this is. True tools and most interesting compounds usually require a lot more medicinal chemistry and are often left behind or remain undiscovered.
Amen to that. There is way too much of this sort of thing in the med-chem literature already. I'm a big proponent of phenotypic screening, but setting up a good one is harder than setting up a good HTS, and working up the data from one is much harder than working up the data from an in vitro assay. The crazier or more reactive your "hit" seems to be, the more suspicious you should be.
The usual reply to that objection is "Tool compound!" But the standards for a tool compound, one used to investigate new biology and cellular pathways, are higher than usual. How are you going to unravel a biochemical puzzle if you're hitting nine different things, eight of which you're totally unaware of? Or skewing your assay readouts by some other effect entirely? This sort of thing happens all the time.
I can't help but think about such things when I read about a project like this one, where IBM's Watson software is going to be used to look at sequences from glioblastoma patients. That's going to be tough, but I think it's worth a look, and the Watson program seems to be just the correlation-searcher for the job. But the first thing they did was feed in piles of biochemical pathway data from the literature, and the problem is, a not insignificant proportion of that data is wrong. Statements like these are worrisome:
Over time, Watson will develop its own sense of what sources it looks at are consistently reliable. . .if the team decides to, it can start adding the full text of articles and branch out to other information sources. Between the known pathways and the scientific literature, however, IBM seems to think that Watson has a good grip on what typically goes on inside cells.
Maybe Watson can tell the rest of us, then. Because I don't know of anyone actually doing cell biology who feels that way, not if they're being honest with themselves. I wish the New York Genome Center and IBM luck in this, and I still think it's a worthwhile thing to at least try. But my guess is that it's going to be a humbling experience. Even if all the literature were correct in every detail, I think it would be one. And the literature is not correct in every detail. It has compounds like that one at the top of the entry in it, and people seem to think that they can draw conclusions from them.
Category: Biological News | Cancer | Chemical Biology | Drug Assays | The Scientific Literature
March 18, 2014
Two more papers have emerged from GSK using their DNA-encoded library platform. I'm always interested to see how this might be working out. One paper is on compounds for the tuberculosis target InhA, and the other is aimed at a lymphocyte protein-protein target, LFA-1. (I've written about this sort of thing previously here, here, and here).
Both of these have some interesting points - I'll cover the LFA-1 work in another post, though. InhA, for its part, is the target of the well-known tuberculosis drug isoniazid, and it has had (as you'd imagine) a good amount of attention over the years, especially since it's not the cleanest drug in the world (although it sure beats having tuberculosis). It's known to be a prodrug for the real active species, and there are also some nasty resistant strains out there, so there's certainly room for something better.
In this case, the GSK group apparently screened several of their DNA-encoded libraries against the target, but the paper only details what happened with one of them, the aminoproline scaffold shown. That would seem to be a pretty reasonable core, but it was one of 22 diamino acids in the library. R1 was 855 different reactants (amide formation, reductive amination, sulfonamides, ureas), and R2 was 857 of the same sorts of things, giving you, theoretically, a library of over 16 million compounds. (If you totaled up the number across the other DNA-encoded libraries, I wonder how many compounds this target saw in total?) Synthesizing a series of hits from this group off the DNA bar codes seems to have worked well, with one compound hitting in the tens of nanomolar range. (The success rate of this step is one of the things that those of us who haven't tried this technique are very interested in hearing about).
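The arithmetic behind that library size is just a product of the building-block counts at each position. A minimal sketch follows; the function name is invented here, and it assumes every core can in principle be combined with every R1 and every R2, which is what "theoretically" is doing in the sentence above.

```python
from math import prod


def theoretical_library_size(*building_block_counts: int) -> int:
    """Number of distinct products if every building block at each
    position can be combined with every block at the other positions."""
    return prod(building_block_counts)


if __name__ == "__main__":
    # 22 diamino acid cores, 855 R1 reactants, 857 R2 reactants
    size = theoretical_library_size(22, 855, 857)
    print(f"{size:,} theoretical compounds")
```

The real number of compounds present at useful levels will be lower, since not every coupling works on every substrate, but the product gives the upper bound quoted for the library.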
They even pulled out an InhA crystal structure with the compound shown, which really makes this one sound like a poster-child example of the whole technique (and might well be why we're reading about it in J. Med. Chem.) The main thing not to like about the structure is that it has three amides in it, but this is why one runs PK experiments, to see if having three amides is going to be a problem or not. A look at metabolic stability showed that it probably wasn't a bad starting point. Modifying those three regions gave them a glycine methyl ester at P1, which had better potency in both enzyme and cell assays. When you read through the paper, though, it appears that the team eventually had cause to regret having pursued it. A methyl ester is always under suspicion, and in this case it was justified: it wasn't stable under real-world conditions, and every attempt to modify it led to unacceptable losses in activity. It looks like they spent quite a bit of time trying to hang on to it, only to have to give up on it anyway.
In the end, the aminoproline in the middle was still intact (messing with it turned out to be a bad idea). The benzofuran was still there (nothing else was better). The pyrazole had extended from an N-methyl to an N-ethyl (nothing else was better there, either), and the P1 group was now a plain primary amide. A lot of med-chem programs work out like that - you go all around the barn and through the woods, emerging covered with mud and thorns only to find your best compound about fifteen feet away from where you started.
That compound, 65 in the paper, showed clean preliminary tox, along with good PK, potency, and selectivity. In vitro against the bacteria, it worked about as well as the fluoroquinolone moxifloxacin, which is a good level to hit. Unfortunately, when it was tried out in an actual mouse TB infection model, it did basically nothing at all. This, no doubt, is another reason that we're reading about this in J. Med. Chem. When you read a paper from an industrial group in that journal, you're either visiting a museum or a mausoleum.
That final assay must have been a nasty moment for everyone, and you get the impression that there's still not an explanation for this major disconnect. It's hard to say if they saw it coming - had other compounds been in before, or did the team just save this assay for last and cross their fingers? But either way, the result isn't the fault of the DNA-encoded assay that provided the starting series - that, in this case, seems to have worked exactly as it was supposed to, and up to the infectious animal model study, everything looked pretty good.
Category: Chemical Biology | Drug Assays | Infectious Diseases
February 24, 2014
When we last checked in on the Great Stapled Peptide Wars, researchers from Genentech, the Walter and Eliza Hall Institute and La Trobe University (the latter two in Australia) had questioned the usefulness and activity of several stapled Bim BH3 peptides. The original researchers (Walensky et al.) had then fired back strongly, pointing out that the criticisms seemed misdirected and directing the authors back to what they thought had been well-documented principles of working with such species.
Now the WEHI/Genentech/La Trobe group (Okamoto et al.) has responded, and it doesn't look like things are going to calm down any time soon. They'd made a lot of the 20-mer stapled peptide being inactive in cells, while the reply had been that yes, that's true, as you might have learned from reading the original papers again - it was the 21-mer that was active in cells. Okamoto and co-workers now say that they've confirmed this, but only in some cell lines - there are others for which the 21-mer is still inactive. What's more, they say that a modified but un-stapled 21-mer is just as active as the closed peptide, which suggests that the stapling might not be the key factor at all.
There's another glove thrown down (again). The earlier Genentech/WEHI/La Trobe paper had shown that the 20-mer had impaired binding to a range of Bcl target proteins. Walensky's team had replied that the 20-mer had been designed to have lower affinity, thus the poor binding results. But this new paper says that the 21-mer shows similarly poor binding behavior, so that can't be right, either.
This is a really short communication, and you get the impression that it was fired off as quickly as possible after the Walensky et al. rebuttal. There will, no doubt, be a reply. One aspect of it, I'm guessing, will be that contention about the unstapled peptide activity. I believe that the Walensky side of the argument have already shown that these substituted-but-unstapled peptides can show enhanced activity, probably due to cranking up their alpha-helical character (just not all the way to stapling them into that form). We shall see.
And this blowup reflects a lot of earlier dispute about Bcl, BAX/BAK peptides, and apoptosis in general. The WEHI group and others have been arguing out the details of these interactions in print for years, and this may be just another battlefield.
+ TrackBacks (0) | Category: Chemical Biology | Drug Assays
February 21, 2014
Update: the nomenclature of these enzymes is messy - see the comments.
Here's another activity-based proteomics result that I've been meaning to link to - in this one, the Cravatt group strengthens the case for carboxylesterase 3 as a potential target for metabolic disease. From what I can see, that enzyme was first identified back in about 2004, one of who-knows-how-many others that have similar mechanisms and can hydrolyze who-knows-how-many esters and ester-like substrates. Picking your way through all those things from first principles would be a nightmare - thus the activity-based approach, where you look for interesting phenotypes and work backwards.
In this case, they were measuring adipocyte behavior, specifically differentiation and lipid accumulation. A preliminary screen suggested that there were a lot of serine hydrolase enzymes active in these cells, and a screen with around 150 structurally diverse carbamates gave several showing phenotypic changes. The next step in the process is to figure out what particular enzymes are responsible, which can be done by fluorescence labeling (since the carbamates are making covalent bonds in the enzyme active sites). They found my old friend hormone-sensitive lipase, as well they should, but there was another enzyme that wasn't so easy to identify.
One particular carbamate, the unlovely but useful WWL113, was reasonably selective for the enzyme of interest, which turned out to be the abovementioned carboxylesterase 3 (Ces3). The urea analog (which should be inactive) did indeed show no cellular readouts, and the carbamate itself was checked for other activities (such as whether it was a PPAR ligand). These established a strong connection between the inhibitor, the enzyme, and the phenotypic effects.
With that in hand, they went on to find a nicer-looking compound with even better selectivity, WWL229. (I have to say, going back to my radio-geek days in the 1970s and early 1980s, that I can't see the letters "WWL" without hearing Dixieland jazz, but that's probably not the effect the authors are looking for). Using an alkyne derivative of this compound as a probe, it appeared to label only the esterase of interest across the entire adipocyte proteome. Interestingly, though, it appears that WWL113 was more active in vivo (perhaps due to pharmacokinetic reasons?).
And those in vivo studies in mice showed that Ces3 inhibition had a number of beneficial effects on tissue and blood markers of metabolic syndrome - glucose tolerance, lipid profiles, etc. Histologically, the most striking effect was the clearance of adipose deposits from the liver (a beneficial effect indeed, and one that a number of drug companies are interested in). This recapitulates genetic modification studies in rodents targeting this enzyme, and shows that pharmacological inhibition could do the job. And while I'm willing to bet that the authors would rather have discovered a completely new enzyme target, this is solid work all by itself.
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Diabetes and Obesity
January 23, 2014
I made reference to the (surprising) ability of artificially-linked "click" DNA analogs (an earlier example here) to show activity in cells when Ali Tavassoli (of U. Southampton) spoke about his lab's work back at the Challenges in Chemical Biology conference last summer.
Here's their 2012 paper on the subject, and now they have another one out in Angewandte Chemie, which extends the work to human cells. It sort of boggles my mind to think that these things are actually transcriptionally active, but they're lighting up human cells with the fluorescent dye mCherry, and doing (from what I can see) all the appropriate control experiments. (It's not being reworked by repair enzymes along the way, for example). Here's the wrap-up:
The practical limit on the length of error-free oligonucleotide synthesis has necessitated the use of enzymes for the assembly of polynucleotide chains into genes. However, these approaches have been constrained by the assumption that the phosphodiester backbone that links oligonucleotides is critical for the biocompatibility and cellular function of the resulting DNA. As demonstrated in this work, this is not the case. Our results strongly suggest that RNA polymerase II, the enzyme responsible for all mRNA synthesis in eukaryotes, correctly transcribes the genetic information contained on a click-linked strand of DNA. . .Our results indicate that a phosphodiester linker is not essential for joining oligonucleotides for gene synthesis and open up the possibility of replacing enzymatic ligation with highly efficient chemical reactions. This approach would not necessarily be limited to the linker reported here, and alternative chemical reactions and the resulting linkers may also be suitable for this purpose.
I look forward to seeing what use chemical biology will make of this sort of thing. Now, can you make functional mRNAs out of this as well?
+ TrackBacks (0) | Category: Chemical Biology
January 14, 2014
Here's a good paper on the design of stapled peptides, with an emphasis on what's been learned about making them cell-penetrant. It's also a specific rebuttal to a paper from Genentech (the Okamoto one referenced below) detailing problems with earlier reported stapled peptides:
In order to maximize the potential for success in designing stapled peptides for basic research and therapeutic development, a series of important considerations must be kept in mind to avoid potential pitfalls. For example, Okamoto et al. recently reported in ACS Chemical Biology that a hydrocarbon-stapled BIM BH3 peptide (BIM SAHB) manifests neither improved binding activity nor cellular penetrance compared to an unmodified BIM BH3 peptide and thereby caution that peptide stapling does not necessarily enhance affinity or biological activity. These negative results underscore an important point about peptide stapling: insertion of any one staple at any one position into any one peptide to address any one target provides no guarantee of stapling success. In this particular case, it is also noteworthy that the Walter and Eliza Hall Institute (WEHI) and Genentech co-authors based their conclusions on a construct that we previously reported was weakened by design to accomplish a specialized NMR study of a transient ligand−protein interaction and was not used in cellular studies because of its relatively low α-helicity, weak binding activity, overall negative charge, and diminished cellular penetrance. Thus, the Okamoto et al. report provides an opportunity to reinforce key learnings regarding the design and application of stapled peptides, and the biochemical and biological activities of discrete BIM SAHB peptides.
You may be able to detect the sound of teeth gritting together in that paragraph. The authors (Loren Walensky of Dana-Farber, and colleagues from Dana-Farber, Albert Einstein, Chicago, and Yale) point out that the Genentech paper took a peptide that's about 21% helical, and used a staple modification that took it up to about 39% helical, which they say is not enough to guarantee anything. They also note that when you apply this technique, you're necessarily altering two amino acids at a minimum (to make them "stapleable"), as well as adding a new piece across the surface of the peptide helix, so these changes have to be taken into account when you compare binding profiles. Some binding partners may be unaffected, some may be enhanced, and some may be wiped out.
It's the Genentech team's report of poor cellular uptake that you can tell is the most irritating feature of their paper to these authors, and from the way they make their points, you can see why:
The authors then applied this BIM SAHBA (aa 145−164) construct in cellular studies and observed no biological activity, leading to the conclusion that “BimSAHB is not inherently cell-permeable”. However, before applying stapled peptides in cellular studies, it is very important to directly measure cellular uptake of fluorophore-labeled SAHBs by a series of approaches, including FACS analysis, confocal microscopy, and fluorescence scan of electrophoresed lysates from treated cells, as we previously reported. Indeed, we did not use the BIM SAHBA (aa 145−164) peptide in cellular studies, specifically because it has relatively low α-helicity, weakened binding activity, and overall negative charge (−2), all of which combine to make this particular BIM SAHB construct a poor candidate for probing cellular activity. As indicated in our 2008 Methods in Enzymology review, “anionic species may require sequence modification (e.g., point mutagenesis, sequence shift) to dispense with negative charge”, a strategy that emerged from our earliest studies in 2004 and 2007 to optimize the cellular penetrance of stapled BID BH3 and p53 peptides for cellular and in vivo analyses and also was applied in our 2010 study involving stapled peptides modeled after the MCL-1 BH3 domain. In our 2011 Current Protocols in Chemical Biology article, we emphasized that “based on our evaluation of many series of stapled peptides, we have observed that their propensity to be taken up by cells derives from a combination of factors, including charge, hydrophobicity, and α-helical structure, with negatively charged and less structured constructs typically requiring modification to achieve cell penetrance. . .
They go on to agree with the Genentech group that the peptide they studied has poor uptake into cells, but the tell-us-something-we-don't-know tone comes through pretty clearly, I'd say. The paper goes on to detail several other publications where these authors worked out the behavior of BIM BH3 stapled peptides, saying that "By assembling our published documentation of the explicit sequence compositions of BIM SAHBs and their distinct properties and scientific applications, as also summarized in Figure 1, we hope to resolve any confusion generated by the Okamoto et al. study".
They do note that the Genentech (Okamoto) paper did use one of their optimized peptides in a supplementary experiment, which shows that they were aware of the different possibilities. That one apparently showed no effects on the viability of mouse fibroblasts, but this new paper says that a closer look (at either their own studies or at the published literature) would have shown them that the cells were actually taking up the peptide, but were relatively resistant to its effects, which actually helps establish something of a therapeutic window.
This is a pretty sharp response, and it'll be interesting to see if the Genentech group has anything to add in their defense. Overall, the impression is that stapled peptides can indeed work, and do have potential as therapeutic agents (and are in the clinic being tested as such), but that they need careful study along the way to make sure of their properties, their pharmacokinetics, and their selectivity. Just as small molecules do, when you get down to it.
+ TrackBacks (0) | Category: Biological News | Cancer | Chemical Biology
January 8, 2014
Here's a paper that may require some recalibration in the existing literature. It reports that a widely used tool compound, LY294002, known as an inhibitor of the PI3 kinases, is also a bromodomain ligand. There seems little doubt that some of its cellular effects, depending on the assay conditions, could be due to this mode of action, rather than its kinase activity. Putting "LY294002" into a PubMed search gives you, as of this morning, 7075 hits, so surely some of these results have been muddied up a bit.
PI3K and the BRD bromodomain family are, as you'd figure, structurally unrelated, but that doesn't stop things like this from happening. Time and again, tool compounds that have been accepted as acting on System A have turned out to also hit System B, and when System Z gets discovered, turn out to hit that one, too. The point is, there are a *lot* of ligand binding sites out there, and to assume that a given compound only hits the one that you know about is unwarranted. Now, at the same time, very little progress gets made if you assume that there are no tool compounds at all, so the only thing to do is proceed alertly, and be ready to revise your conclusion. Which is how we're supposed to be working, anyway, right?
+ TrackBacks (0) | Category: Chemical Biology
December 13, 2013
The Danishefsky group has published their totally synthetic preparation of erythropoietin. This is a work that's been in progress for ten years now (here's the commentary piece on it), and it takes organic synthesis into realms that no one's quite experienced yet:
The ability to reach a molecule of the complexity of 1 by entirely chemical means provides convincing testimony about the growing power of organic synthesis. As a result of synergistic contributions from many laboratories, the aspirations of synthesis may now include, with some degree of realism, structures hitherto referred to as “biologics”— a term used to suggest accessibility only by biological means (isolation from plants, fungi, soil samples, corals, or microorganisms, or by recombinant expression). Formidable as these methods are for the discovery, development, and manufacturing of biologics, one can foresee increasing needs and opportunities for chemical synthesis to provide the first samples of homogeneous biologics. As to production, the experiments described above must be seen as very early days. . .
I can preach that one both ways, as the old story has it. I take the point about how synthesis can provide these things in more homogeneous form than biological methods can, and it can surely provide variations on them that biological systems aren't equipped to produce. At the same time, I might put my money on improving the biological methods rather than stretching organic synthesis to this point, at least in its present form. I see the tools of molecular biology as hugely powerful, but in need of customization, whereas organic synthesis can be as custom as you like, but can (so far) only reach this sort of territory by all-out efforts like Danishefsky's. In other words, I think that molecular biology has to improve less than organic chemistry does to get the most use out of such molecules.
That said, I think that the most impressive part of this impressive paper is the area where we have the fewest molecular biology tools: the synthesis of the polysaccharide side chains. Assembling the peptide part was clearly no springtime stroll (and if you read the paper, you find that they experienced the heartbreak of having to go back and redesign things when the initial assembly sequence failed). But polyglycan chemistry has been a long-standing problem (and one that Danishefsky himself has been addressing for years). I think that chemical synthesis really has a much better shot at being the method of choice there. And that should tell you what state the field is in, because synthesis of those things can be beastly. If someone manages to tame the enzymatic machinery that produces them, that'll be great, but for now, we have to make these things the organic chemistry way when we dare to make them at all.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News
December 4, 2013
Here's some work that gets right to the heart of modern drug discovery: how are we supposed to deal with the variety of patients we're trying to treat? And the variety in the diseases themselves? And how does that correlate with our models of disease?
This new paper, a collaboration between eight institutions in the US and Europe, is itself a look at two other recent large efforts. One of these, the Cancer Genome Project, tested 138 anticancer drugs against 727 cell lines. Its authors said at the time (last year) that "By linking drug activity to the functional complexity of cancer genomes, systematic pharmacogenomic profiling in cancer cell lines provides a powerful biomarker discovery platform to guide rational cancer therapeutic strategies". The other study, the Cancer Cell Line Encyclopedia, tested 24 drugs against 1,036 cell lines. That one appeared at about the same time, and its authors said ". . .our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of ‘personalized’ therapeutic regimens."
Well, will they? As the latest paper shows, the two earlier efforts overlap to the extent of 15 drugs, 471 cell lines, the mutation status of 64 genes, and the expression of 12,153 genes. How well do they match up? Unfortunately, the answer is "Not too well at all". The discrepancies really come out in the drug sensitivity data. The authors tried controlling for all the variables they could think of - cell line origins, dosing protocols, assay readout technologies, methods of estimating IC50s (and/or AUCs), specific mechanistic pathways, and so on. Nothing really helped. The two studies were internally consistent, but their cross-correlation was relentlessly poor.
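For anyone who wants to run this kind of cross-check on their own overlapping data, here's a minimal sketch of a Spearman rank correlation (the statistic behind the "rs" values the paper quotes) in plain Python. The IC50 numbers at the bottom are invented purely for illustration and are not taken from either study, and the function names are my own.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical IC50 values (micromolar) for the same six cell lines
# as measured in two different studies. Made-up numbers.
study_a = [0.12, 0.50, 1.8, 3.0, 9.5, 22.0]
study_b = [0.9, 0.2, 15.0, 1.1, 4.0, 7.5]
print(f"rs = {spearman(study_a, study_b):.2f}")
```

Rank correlation is a reasonable choice here because it only asks whether the two studies order the cell lines the same way, which sidesteps differences in how each group estimated its IC50s.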
It gets worse. The authors tried the same sort of analysis on several drugs and cell lines themselves, and couldn't match their own data to either of the published studies. Their take on the situation:
Our analysis of these three large-scale pharmacogenomic studies points to a fundamental problem in assessment of pharmacological drug response. Although gene expression analysis has long been seen as a source of ‘noisy’ data, extensive work has led to standardized approaches to data collection and analysis and the development of robust platforms for measuring expression levels. This standardization has led to substantially higher quality, more reproducible expression data sets, and this is evident in the CCLE and CGP data where we found excellent correlation between expression profiles in cell lines profiled in both studies.
The poor correlation between drug response phenotypes is troubling and may represent a lack of standardization in experimental assays and data analysis methods. However, there may be other factors driving the discrepancy. As reported by the CGP, there was only a fair correlation (rs < 0.6) between camptothecin IC50 measurements generated at two sites using matched cell line collections and identical experimental protocols. Although this might lead to speculation that the cell lines could be the source of the observed phenotypic differences, this is highly unlikely as the gene expression profiles are well correlated between studies.
Although our analysis has been limited to common cell lines and drugs between studies, it is not unreasonable to assume that the measured pharmacogenomic response for other drugs and cell lines assayed are also questionable. Ultimately, the poor correlation in these published studies presents an obstacle to using the associated resources to build or validate predictive models of drug response. Because there is no clear concordance, predictive models of response developed using data from one study are almost guaranteed to fail when validated on data from another study, and there is no way with available data to determine which study is more accurate. This suggests that users of both data sets should be cautious in their interpretation of results derived from their analyses.
"Cautious" is one way to put it. These are the sorts of testing platforms that drug companies are using to sort out their early-stage compounds and projects, and very large amounts of time and money are riding on those decisions. What if they're gibberish? A number of warning sirens have gone off in the whole biomarker field over the last few years, and this one should be so loud that it can't be ignored. We have a lot of issues to sort out in our cell assays, and I'd advise anyone who thinks that their own data are totally solid to devote some serious thought to the possibility that they're wrong.
Here's a Nature News summary of the paper, if you don't have access. It notes that the authors of the two original studies don't necessarily agree that they conflict! I wonder if that's as much a psychological response as a statistical one. . .
+ TrackBacks (0) | Category: Biological News | Cancer | Chemical Biology | Drug Assays
November 1, 2013
Here's a paper that's just come out in JACS that's worth a look on more than one level. It describes a way to image prostate cancers in vivo by targeting the GRP receptors on the cell surfaces, which are overexpressed in these tumors. Now, this is already done, using radiolabeled bombesin peptides as ligands, but this new work brings a new dimension to the idea.
What the authors have done is targeted the cell surface with antagonists and agonists at the same time, by hooking these onto a defined molecular framework. That's poly-proline, which is both soluble and adopts a well-defined structure once it's in solution. The bombesin derivatives are attached via a click Huisgen triazole linkage, and since you can slot in an azidoproline wherever you want, this lets you vary the distance between the two peptides up and down the scale. The hope is that having both kinds of ligand going at the same time might combine their separate advantages (binding potency and uptake into the cells).
And that idea seems to work: one of the combinations (the one with about a 20 Å spacing between the two ligands) works noticeably better than either radiolabeled peptide alone, with greater uptake and longer half-life. I'd say that proof of concept has been achieved, and the authors are planning to extend the idea to other known cell-surface-binding oncology ligands used diagnostically and/or therapeutically. Each of these will have to be worked out empirically, since there's no way of knowing what sort of spacing will be needed, of course.
That's the second thing I wanted to emphasize about this paper. Note how quickly I ran through its basic concepts above - I hope it was intelligible, but I think that the idea (which seems well worth exploring) can be expressed pretty easily. What's striking is how quickly these sorts of things can be realized these days. We've learned more about appropriate scaffolds (one of the authors of this paper, Helma Wennemers, has put in a good amount of work on the polyproline idea). And thanks to the near-universal applicability of the "click" triazole reaction, one can assemble hybrid structures like this with a high chance of success. That's not something to take for granted - doing bespoke chemistry every time on such molecules is no fun. You find yourself getting bogged down in the details rather than getting a chance to see if the main idea is worth anything or not.
There was talk before this last Nobel season of Barry Sharpless getting a second prize for the click work. Some have said that this doesn't make sense, because the click reaction that's been used the most (azide/acetylene cycloaddition) was certainly not a new one. But did anyone else see its possibilities, or the possibilities of any such universal connector reactions? Both providing such reactions and publicizing what could be done with them have been Sharpless's contributions, and the impossible-to-keep-up-with literature using them is testimony to how much was waiting to be exploited. So how come nobody did?
+ TrackBacks (0) | Category: Cancer | Chemical Biology
September 6, 2013
Acetate is used in vivo as a starting material for all sorts of ridiculously complex natural products. So here's a neat idea: why not hijack those pathways with fluoroacetate and make fluorinated things that no one's ever seen before? That's the subject of this new paper in Science, from Michelle Chang's lab at Berkeley.
There's the complication that fluoroacetate is a well-known cellular poison, so this is going to be synthetic biology all the way. (It gets processed all the way to fluorocitrate, which is a tight enough inhibitor of aconitase to bring the whole citric acid cycle to a shuddering halt, and that's enough to do the same thing to you). There's a Streptomyces species that has been found to use fluoroacetate without dying (just barely), but honestly, I think that's about it for organofluorine biology.
The paper represents a lot of painstaking work. Finding enzymes (and enzyme variants) that look like they can handle the fluorinated intermediates, expressing and purifying them, and getting them to work together ex vivo are all significant challenges. They eventually worked their way up to 6-deoxyerythronolide B synthase (DEBS), which is a natural goal since it's been the target of so much deliberate re-engineering over the years. And they've managed to produce compounds like the ones shown, which I hope are the tip of a larger fluorinated iceberg.
It turns out that you can even get away with doing this in living engineered bacteria, as long as you feed them fluoromalonate (a bit further down the chain) instead of fluoroacetate. This makes me wonder about other classes of natural products as well. Has anyone ever tried to see if terpenoids can be produced in this way? Some sort of fluorinated starting material in the mevalonate pathway, maybe? Very interesting stuff. . .
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | Natural Products
August 23, 2013
We chemists have always looked at the chemical machinery of living systems with a sense of awe. A billion years of ruthless pruning (work, or die) have left us with some bizarrely efficient molecular catalysts, the enzymes that casually make and break bonds with a grace and elegance that our own techniques have trouble even approaching. The systems around DNA replication are particularly interesting, since that's one of the parts you'd expect to be under the most selection pressure (every time a cell divides, things had better work).
But we're not content with just standing around envying the polymerase chain reaction and all the rest of the machinery. Over the years, we've tried to borrow whatever we can for our own purposes - these tools are so powerful that we can't resist finding ways to do organic chemistry with them. I've got a particular weakness for these sorts of ideas myself, and I keep a large folder of papers (electronic, these days) on the subject.
So I was interested to have a reader send along this work, which I'd missed when it came out in PLOS ONE. It's from Pehr Harbury's group at Stanford, and it's in the DNA-linked-small-molecule category (which I've written about, in other cases, here and here). Here's a good look at the pluses and minuses of this idea:
However, with increasing library complexity, the task of identifying useful ligands (the “needles in the haystack”) has become increasingly difficult. In favorable cases, a bulk selection for binding to a target can enrich a ligand from non-ligands by about 1000-fold. Given a starting library of 10^10 to 10^15 different compounds, an enriched ligand will be present at only 1 part in 10^7 to 1 part in 10^12. Confidently detecting such rare molecules is hard, even with the application of next-generation sequencing techniques. The problem is exacerbated when biologically-relevant selections with fold-enrichments much smaller than 1000-fold are utilized.
Ideally, it would be possible to evolve small-molecule ligands out of DNA-linked chemical libraries in exactly the same way that biopolymer ligands are evolved from nucleic acid and protein libraries. In vitro evolution techniques overcome the “needle in the haystack” problem because they utilize multiple rounds of selection, reproductive amplification and library re-synthesis. Repetition provides unbounded fold-enrichments, even for inherently noisy selections. However, repetition also requires populations that can self-replicate.
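The arithmetic in that passage is easy to play with. Here's a rough sketch (my own toy model, not anything from the paper) that assumes exactly one ligand in the library, that every round multiplies the ligand-to-non-ligand odds by the same fold-enrichment, and that amplification and re-synthesis between rounds leave those odds untouched. The function names are made up for the example.

```python
from fractions import Fraction

def ligand_fraction(library_size, fold_enrichment, rounds=1):
    """Fraction of the pool that is ligand after `rounds` of selection.

    Toy model (assumption): one ligand among `library_size` compounds,
    and each round multiplies the ligand:non-ligand odds by
    `fold_enrichment`.
    """
    odds = Fraction(1, library_size - 1) * fold_enrichment ** rounds
    return odds / (1 + odds)

def rounds_to_majority(library_size, fold_enrichment):
    """Smallest number of rounds after which the ligand is >= half the pool."""
    rounds = 0
    while ligand_fraction(library_size, fold_enrichment, rounds) < Fraction(1, 2):
        rounds += 1
    return rounds

# One round at 1000-fold: the ligand is still about 1 part in 10^7 or 10^12.
for size in (10**10, 10**15):
    print(size, float(ligand_fraction(size, 1000)))

# Repetition is what rescues you, even with a weak (10-fold) selection.
print(rounds_to_majority(10**10, 1000))  # 4
print(rounds_to_majority(10**15, 1000))  # 5
print(rounds_to_majority(10**10, 10))    # 10
```

The point of the last few lines is the one the authors are making: a single pass leaves you hunting for one molecule in millions, but a handful of repeated rounds gets you there even when each individual selection is noisy.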
That it does, and that's really the Holy Grail of evolution-linked organic synthesis - being able to harness the whole process. In this sort of system, we're talking about using the DNA itself as a physical prod for chemical reactivity. That's also been a hot field, and I've written about some examples from the Liu lab at Harvard here, here, and here. But in this case, the DNA chemistry is being done with all the other enzymatic machinery in place:
The DNA brings an incipient small molecule and suitable chemical building blocks into physical proximity and induces covalent bond formation between them. In so doing, the naked DNA functions as a gene: it orchestrates the assembly of a corresponding small molecule gene product. DNA genes that program highly fit small molecules can be enriched by selection, replicated by PCR, and then re-translated into DNA-linked chemical progeny. Whereas the Lerner-Brenner style DNA-linked small-molecule libraries are sterile and can only be subjected to selective pressure over one generation, DNA-programmed libraries produce many generations of offspring suitable for breeding.
The scheme below shows how this looks. You take a wide variety of DNA sequences, and have them each attached to some small-molecule handle (like a primary amine). You then partition these out into groups by using resins that are derivatized with oligonucleotide sequences, and you plate these out into 384-well format. While the DNA end is stuck to the resin, you do chemistry on the amine end (and the resin attachment lets you get away with stuff that would normally not work if the whole DNA-attached thing had to be in solution). You put a different reacting partner in each of the 384 wells, just like in the good ol' combichem split/pool days, just with DNA as the physical separation mechanism.
In this case, the group used 240-base-pair DNA sequences, two hundred seventeen billion of them. That sentence is where you really step off the edge into molecular biology, because without its tools, generating that many different species, efficiently and in usable form, is pretty much out of the question with current technology. That's five different coding sequences, in their scheme, with 384 different ones in each of the first four (designated A through D), and ten in the last one, E. How diverse was this, really? Get ready for more molecular biology tools:
We determined the sequence of 4.6 million distinct genes from the assembled library to characterize how well it covered “genetic space”. Ninety-seven percent of the gene sequences occurred only once (the mean sequence count was 1.03), and the most abundant gene sequence occurred one hundred times. Every possible codon was observed at each coding position. Codon usage, however, deviated significantly from an expectation of random sampling with equal probability. The codon usage histograms followed a log-normal distribution, with one standard deviation in log-likelihood corresponding to two-to-three fold differences in codon frequency. Importantly, no correlation existed between codon identities at any pair of coding positions. Thus, the likelihood of any particular gene sequence can be well approximated by the product of the likelihoods of its constituent codons. Based on this approximation, 36% of all possible genes would be present at 100 copies or more in a 10 picomole aliquot of library material, 78% of the genes would be present at 10 copies or more, and 4% of the genes would be absent. A typical selection experiment (10 picomoles of starting material) would thus sample most of the attainable diversity.
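For anyone who wants to check the numbers, here's a back-of-the-envelope sketch (mine, not the paper's) of the library size and of how many copies of each gene a 10 picomole aliquot would hold on average. It assumes every codon combination is allowed and equally likely. The quoted codon-usage data says that isn't quite true, so treat the copy number as a mean only.

```python
# Rough count of the DNA-programmed library described above.
# Assumption (not from the paper): all codon combinations are equally likely.
AVOGADRO = 6.02214076e23  # molecules per mole

# Codons available at each of the five coding positions (A through E).
codons_per_position = {"A": 384, "B": 384, "C": 384, "D": 384, "E": 10}

library_size = 1
for count in codons_per_position.values():
    library_size *= count

total_codons = sum(codons_per_position.values())

# Molecules in a 10 picomole aliquot, and the mean copies per gene.
aliquot_molecules = 10e-12 * AVOGADRO
mean_copies_per_gene = aliquot_molecules / library_size

print(f"{library_size:,} possible genes")
print(f"{total_codons} distinct codons across all positions")
print(f"about {mean_copies_per_gene:.1f} copies of each gene in 10 pmol, on average")
```

That works out to roughly 217 billion genes and an average in the high twenties of copies per gene, which fits with the paper's statement that most genes show up at 10 copies or more.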
The group had done something similar before with 80-codon DNA sequences, but this system has 1546, which is a different beast. But it seems to work pretty well. Control experiments showed that the hybridization specificity remained high, and that the micro/meso fluidic platform being used could return products with high yield. A test run also gave them confidence in the system: they set up a run with all the codons except one specific dropout (C37), and also prepared a "short gene", containing the C37 codon, but lacking the whole D area (200 base pairs instead of 240). They mixed that in with the drop-out library (in a ratio of 1 to 384), and split that out onto a C-codon-attaching array of beads. They then did the chemical step, attaching one peptoid piece onto all of them except the C37 binding well - that one got biotin hydrazide instead. Running the lot of them past streptavidin took the ratio of the C37-containing ones from 1:384 to something over 35:1, an enhancement of at least 13,000-fold. (Subcloning and sequencing of 20 isolates showed they all had the C37 short gene in them, as you'd expect).
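The enrichment figure is just the ratio of the two ratios. Here's a minimal sketch of that arithmetic, using the 1:384 and 35:1 values quoted above (the variable names are mine):

```python
from fractions import Fraction

# Ratio of C37 short gene to drop-out library, before and after selection.
ratio_before = Fraction(1, 384)   # 1:384 going in
ratio_after = Fraction(35, 1)     # "something over 35:1" coming out

# Fold enrichment is the after ratio divided by the before ratio.
fold_enrichment = ratio_after / ratio_before

print(f"fold enrichment: at least {int(fold_enrichment):,}x")
```

Since 35:1 is a lower bound on the final ratio, the result is a lower bound too, which is why the enhancement is given as "at least" 13,000-fold.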
They then set up a three-step coupling of peptoid building blocks on a specific codon sequence, and this returned very good yields and specificities. (They used a fluorescein-tagged gene and digested the product with PDE1 before analyzing them at each step, which ate the DNA tags off of them to facilitate detection). The door, then, would now seem to be open:
Exploration of large chemical spaces for molecules with novel and desired activities will continue to be a useful approach in academic studies and pharmaceutical investigations. Towards this end, DNA-programmed combinatorial chemistry facilitates a more rapid and efficient search process over a larger chemical space than does conventional high-throughput screening. However, for DNA-programmed combinatorial chemistry to be widely adopted, a high-fidelity, robust and general translation system must be available. This paper demonstrates a solution to that challenge.
The parallel chemical translation process described above is flexible. The devices and procedures are modular and can be used to divide a degenerate DNA population into a number of distinct sub-pools ranging from 1 to 384 at each step. This coding capacity opens the door for a wealth of chemical options and for the inclusion of diversity elements with widely varying size, hydrophobicity, charge, rigidity, aromaticity, and heteroatom content, allowing the search for ligands in a “hypothesis-free” fashion. Alternatively, the capacity can be used to elaborate a variety of subtle changes to a known compound and exhaustively probe structure-activity relationships. In this case, some elements in a synthetic scheme can be diversified while others are conserved (for example, chemical elements known to have a particular structural or electrostatic constraint, modular chemical fragments that independently bind to a protein target, metal chelating functional groups, fluorophores). By facilitating the synthesis and testing of varied chemical collections, the tools and methods reported here should accelerate the application of “designer” small molecules to problems in basic science, industrial chemistry and medicine.
Anyone want to step through? If GSK is getting some of their DNA-coded screening to work (or at least telling us about the examples that did?), could this be a useful platform as well? Thoughts welcome in the comments.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | Drug Assays
July 31, 2013
Evolutionary and genetic processes fascinate many organic chemists, and with good reason. They've provided us with the greatest set of chemical catalysts we know of: enzymes, which are a working example of molecular-level nanotechnology, right in front of us. A billion years of random tinkering have accomplished a great deal, but (being human) we look at the results and wonder if we couldn't do things a bit differently, with other aims in mind than "survive or die".
This has been a big field over the years, and it's getting bigger all the time. There are companies out there that will try to evolve enzymes for you (here's one of the most famous examples), and many academic labs have tried their hands at it as well. The two main routes are random mutations and structure-based directed changes - and at this point, I think it's safe to say that any successful directed-enzyme project has to take advantage of both. There can be just too many possible changes to let random mutations do all the work for you (20 to the Xth power gets out of hand pretty quickly, and that's just the natural amino acids), and we're usually not smart enough to step in and purposefully tweak things for the better every time.
Here's a new paper that illustrates why the field is so interesting, and so tricky. The team (a collaboration between the University of Washington and the ETH in Zürich) has been trying to design a better retro-aldolase enzyme, with earlier results reported here. That was already quite an advance (15,000x rate enhancement over background), but that's still nowhere near natural enzymes of this class. So they took that species as a starting point and did more random mutations around the active site, with rounds of screening in between, which is how we mere humans have to exert selection pressure. This gave a new variant with another lysine in the active site, which some aldolases have already. Further mutational rounds (error-prone PCR and DNA shuffling) and screening led to a further variant that was over 4000x faster than the original enzyme.
But when the team obtained X-ray structures of this enzyme in complex with an inhibitor, they got a surprise. The active site, which had already changed around quite a bit with the addition of that extra lysine, was now a completely different place. A new substrate-binding pocket had formed, and the new lysine was now the catalytic residue all by itself. The paper proposes that the mechanistic competition between the possible active-site residues was a key factor, and they theorize that many natural enzymes may have evolved through similar paths. But given this, there are other questions:
The dramatic changes observed during RA95 evolution naturally prompt the question of whether generation of a highly active retro-aldolase required a computational design step. Whereas productive evolutionary trajectories might have been initiated from random libraries, recent experiments with the same scaffold demonstrate that chemical instruction conferred by computation greatly increases the probability of identifying catalysts. Although the programmed mechanisms of other computationally designed enzymes have been generally reinforced and refined by directed evolution, the molecular acrobatics observed with RA95 attest to the functional leaps that unanticipated, innovative mutations—here, replacement of Thr83 by lysine—can initiate.
So they're not ready to turn off the software just yet. But you have to wonder - if there were some way to run the random-mutation process more quickly, and reduce the time and effort of the mutation/screening/selection loop, computational design might well end up playing a much smaller role. (See here for more thoughts on this). Enzymes are capable of things that we would never think of ourselves, and we should always give them the chance to surprise us when we can.
+ TrackBacks (0) | Category: Chemical Biology | In Silico
July 25, 2013
Ben Cravatt is talking about this work on activity-based protein profiling of serine hydrolase enzymes. That's quite a class to work on - as he says, up to 2% of all the proteins in the body fall into this group, but only half of them have had even the most cursory bit of characterization. Even among the "known" ones, most of their activities are still dark, and only 10% of them have useful pharmacological tools.
He's detailed a compound (PF-3845) that Pfizer found as a screening hit for FAAH, which, although it looked benign, turned out to be a covalent inhibitor due to a reactive arylurea. Pfizer, he says, backed off when this mechanism was uncovered - they weren't ready at the time for covalency, but he says that they've loosened up since then. Studying the compound in various tissues, including the brain, showed that it was extremely selective for FAAH.
Another reactive compound, JZL184, is an inhibitor of monoacylglycerol lipase (MAGL). Turns out that its carbamate group also reacts with FAAH, but there's a 300-fold window in the potency. The problem is, that's not enough. In mouse models, hitting both enzymes at the same time leads to behavioral problems. Changing the leaving group to a slightly less reactive (and nonaromatic) hexafluoroisopropanol, though, made the compound selective again. I found this quite interesting - most of the time, you'd think that 300x is plenty of room, but apparently not. That doesn't make things any easier, does it?
In response to a question (from me), he says that covalency is what makes this tricky. The half-life of the brain enzymes is some 12 to 14 hours, so by the time the next once-a-day dose comes in, there's still 20 or 30% of the enzyme shut down, and things get out of hand pretty soon. For a covalent mechanism, he recommends 2000-fold or 5000-fold. On the other hand, he says that when they've had a serine hydrolase-targeted compound, they've never seen it react out of that class (targeting cysteine residues, though, is a very different story). And the covalent mechanism gives you some unique opportunities - for example, deliberately engineering a short half-life, because that might be all you need.
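The 20 to 30% figure is about what you'd get from simple exponential turnover. Here's a rough sketch, under my own simplifying assumptions (not Cravatt's model): the enzyme is fully inhibited right after a dose, and activity only comes back as the protein is replaced with the quoted half-life.

```python
def fraction_still_inhibited(hours_since_dose: float, half_life_h: float) -> float:
    """Fraction of enzyme still covalently shut down, assuming it starts
    fully inhibited and recovers only by first-order protein turnover."""
    return 0.5 ** (hours_since_dose / half_life_h)


# Once-a-day dosing: check what is left at 24 hours for the quoted half-lives.
for half_life in (12.0, 14.0):
    remaining = fraction_still_inhibited(24.0, half_life)
    print(f"half-life {half_life:.0f} h: {remaining:.0%} still inhibited at 24 h")
```

With a 12-hour half-life that leaves a quarter of the enzyme still inhibited when the next dose arrives, and a bit more with 14 hours, so any off-target hit keeps stacking up from dose to dose.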
+ TrackBacks (0) | Category: Chemical Biology | The Central Nervous System
Kurt Deshayes of Genentech has been speaking at the Challenges in Chemical Biology meeting, on protein-protein inhibitor work. And he's raised a number of issues that I think that we in drug discovery are going to have to deal with. For one thing, given the size of PPI clinical molecules like ABT-199, what does that tell us about what makes an orally available molecule? (And what does that tell us about what we think we know about the subject?) You'd think that many (most?) protein-protein inhibitors will be on the large side, and if you were to be doctrinaire about biophysical properties, you wouldn't go there at all. But it can be done - the question is, how often? And how do you increase your chances of success? I don't think that anyone doubts that more molecules with molecular weights of 1000 will have PK trouble than those with molecular weights of 300. So how do you improve the odds?
Another point he emphasized is that Genentech's work on XIAP led them to activities that they never would have guessed up front. The system, he points out, is just too complicated to make useful predictions. You have to go in and perturb it and see what happens (and small molecules are a great way to do that). I'd say that this same principle applies to most everything in biochemistry: get in and mess with the system, and let it tell you what's going on.
+ TrackBacks (0) | Category: Chemical Biology
Ali Tavassoli has just given a very interesting talk at the Challenges in Chemical Biology conference on his SICLOPPS method for generating huge numbers of cyclic peptides to screen for inhibitors of protein-protein interactions. I'll do a post in detail on that soon; it's one of those topics I've been wanting to tackle. His lab is applying this to a wide range of PPI systems.
But he had a neat update on another topic, as well. His group has made triazole-linked DNA sequences, and investigated how they behave in bacteria. He now reports that these things are biocompatible in mammalian cells (MCF-7).
This opens up some very interesting artificial-gene ideas, and I look forward to seeing what people can make of it. The extent to which DNA can be modified by things like triazole linkages is remarkable (see here and here for other examples). What else is possible?
+ TrackBacks (0) | Category: Chemical Biology
Kevan Shokat is now talking about his lab's work on using Drosophila models for kinase inhibitor discovery in oncology. I always like hearing about this sort of thing; very small living models have a lot of appeal for drug discovery.
You'd think that screening in fruit flies would be problematic for understanding human efficacy, but if you pick your targets carefully, you can get it to work. In Shokat's case, he's looking at a kinase called Ret, which is a target in thyroid cancer and is quite highly conserved across species. They set up a screen where active compounds would rescue a lethal phenotype (which gives you a nice high signal-to-noise), and screened about a thousand likely kinase inhibitor molecules.
Here's the paper that discusses much of what Shokat's group found. It turned out that Ret kinase inhibition alone was not the answer - closely related compounds with very similar Ret activity had totally different phenotypes in the flies. The key was realizing that some of them were hitting and missing other kinases in the pathways (specifically Raf and TOR) that could cancel out (or enhance) the effects. This was a very nice job of direct discovery of the right sort of kinase fingerprint needed for a desired effect. We need more tiny critters for screens like these.
+ TrackBacks (0) | Category: Cancer | Chemical Biology
July 24, 2013
Now Udo Oppermann is talking about histone modifications, which takes us into epigenetics. Whatever it is, epigenetics seems to be a big topic at this meeting - there are several talks and many posters addressing this area.
His efforts at Oxford and the Structural Genomics Consortium are towards generating chemical tools for all the histone-modifying enzymes (methylases/demethylases, acetylases/deacetylases, and so on). That covers a lot of ground, and a number of different mechanisms. To make things harder, they're going for tens of nanomolar in potency and high selectivity - but if these compounds are going to be really useful, that's the profile that they'll need.
One of the things that's coming up as these compounds become available is that these enzymes aren't necessarily confined to histones. Why shouldn't lysines, etc., on other proteins also be targets for regulation? Studies are just getting started on this, and it could well be that there are whole signaling networks out there that we haven't really appreciated.
+ TrackBacks (0) | Category: Chemical Biology
I'm listening to Stuart Schreiber make his case for diversity-oriented synthesis (DOS) as a way to interrogate biochemistry. I've written about this idea a number of times here, but I'm always glad to hear the pitch right from the source.
Schreiber's team has about 100,000 compounds from DOS now, all of which are searchable at PubChem. He says that they have about 15mg of each of them in the archives, which is a pretty solid collection. They've been trying to maximize the biochemical diversity of their screening (see here and here for examples), and they're also (as noted here) building up a collection of fragments, which he says will be used for high-concentration screening.
He's also updating some efforts with the Gates Foundation to do cell-based antimalarial screening with the DOS compounds. They have 468 compounds that they're now concentrating on, and checking these against resistant strains indicates that some of them may well be working through unusual mechanisms (others, of course, are apparently hitting the known ones). He's showing structures, and they are very DOSsy indeed - macrocycles, spiro rings, chirality all over. But since these assays are done in cells, some large hoops have already been jumped through.
He's also talking about the Broad Institute's efforts to profile small-molecule behavior in numerous tumor cell lines. Here's a new public portal site on this, and there's apparently a paper accepted at Cell on it as well. They have hundreds of cell lines, from all sorts of sources, and are testing those against an "informer set" of small-molecule probes and known drugs. They're trying to make this a collection of very selective compounds, targeting a wide variety of different targets throughout the cell. There are kinase inhibitors, epigenetic compounds, and a long list of known oncology candidates, as well as many other compounds that don't hit obvious cancer targets.
They're finding out a lot of interesting things about target ID with this set. Schreiber says that this work has made him more interested in gene expression profiles than in mutations per se. Here, he says, is an example of what he's talking about. Another example is the recent report of the natural product austocystin, which seems to be activated by CYP metabolism. The Broad platform has identified CYP2J2 as the likely candidate.
There's an awful lot of work on these slides (and an awful lot of funding is apparent, too). I think that the "Cancer Therapeutics Response Portal" mentioned above is well worth checking out - I'll be rooting through it after the meeting.
+ TrackBacks (0) | Category: Cancer | Chemical Biology | Infectious Diseases
June 18, 2013
Natural products come up around here fairly often, as sources of chemical diversity and inspiration. Here's a paper that combines them with another topic (epigenetics) that's been popular around here as well, even if there's some disagreement about what the word means.
A group of Japanese researchers were looking at the natural products derived from a fungus (Chaetomium indicum). Recent work has suggested that fungi have a lot more genes/enzymes available to make such things than are commonly expressed, so in this work, the team fed the fungus an HDAC inhibitor to kick its expression profile around a bit. The paper has a few references to other examples of this technique, and it worked again here - they got a significantly larger amount of polyketide products out of the fermentation, including several that had never been described before.
There have been many attempts to rejigger the synthetic machinery in natural-product-producing organisms, ranging from changing their diet of starting materials, adding environmental stresses to their culture, all the way to manipulating their actual genomic sequences directly. This method has the advantage of being easier than most, and the number of potential gene-expression-changing compounds is large. Histone deacetylase inhibitors alone have wide ranges of selectivity against members of the class, and then you have the reverse mechanism (histone acetyltransferase), methyltransferase and demethylase inhibitors, and many more. These should be sufficient to produce weirdo compounds a-plenty.
+ TrackBacks (0) | Category: Chemical Biology | Natural Products
June 3, 2013
Here's a worthwhile paper from Donna Huryn, Lynn Resnick, and Peter Wipf on the academic contributions to chemical biology in recent years. They're not only listing what's been done, they're looking at the pluses and minuses of going after probe/tool compounds in this setting:
The academic setting provides a unique environment distinct from traditional pharmaceutical or biotechnology companies, which may foster success and long-term value of certain types of probe discovery projects while proving unsuitable for others. The ability to launch exploratory high risk and high novelty projects from both chemistry and biology perspectives, for example, testing the potential of unconventional chemotypes such as organometallic complexes, is one such distinction. Other advantages include the ability to work without overly constrained deadlines and to pursue projects that are not expected to reap commercial rewards, criteria and constraints that are common in “big pharma.” Furthermore, projects to identify tool molecules in an academic setting often benefit from access to unique and highly specialized biological assays and/or synthetic chemistry expertise that emerge from innovative basic science discoveries. Indeed, recent data show that the portfolios of academic drug discovery centers contain a larger percentage of long-term, high-risk projects compared to the pharmaceutical industry. In addition, many centers focus more strongly on orphan diseases and disorders of third world countries than commercial research organizations. In contrast, programs that might be less successful in an academic setting are those that require significant resources (personnel, equipment, and funding) that may be difficult to sustain in a university setting. Projects whose goals are not consistent with the educational mission of the university and cannot provide appropriate training and/or content for publications or theses would also be better suited for a commercial enterprise.
Well put. You have to choose carefully (just as commercial enterprises have to), but there are real opportunities to do something that's useful, interesting, and probably wouldn't be done anywhere else. The examples in this paper are sensors of reactive oxygen species, a GPR30 ligand, HSP70 ligands, an unusual CB2 agonist (among other things), and a probe of beta-amyloid.
I agree completely with the authors' conclusion - there's plenty of work for everyone:
By continuing to take advantage of the special expertise resident in university settings and the ability to pursue novel projects that may have limited commercial value, probes from academic researchers can continue to provide valuable tools for biomedical researchers. Furthermore, the current environment in the commercial drug discovery arena may lead to even greater reliance on academia for identifying suitable probe and lead structures and other tools to interrogate biological phenomena. We believe that the collaboration of chemists who apply sound chemical concepts and innovative structural design, biologists who are fully committed to mechanism of action studies, institutions that understand portfolio building and risk sharing in IP licensing, and funding mechanisms dedicated to provide resources leading to the launch of phase 1 studies will provide many future successful case studies toward novel therapeutic breakthroughs.
But it's worth remembering that bad chemical biology is as bad as anything in the business. You have the chance to be useless in two fields at once, and bore people across a whole swath of science. Getting a good probe compound is not like sitting around waiting for the dessert cart to come - there's a lot of chemistry to be done, and some biology that's going to be tricky almost by definition. The examples in this paper should spur people on to do the good stuff.
+ TrackBacks (0) | Category: Chemical Biology
April 18, 2013
Just as a quick example of how odd molecular recognition can be, have a look at this paper from Chemical Communications. It's not particularly remarkable, but it's a good example of what's possible. The authors used a commercial phage display library (this one, I think) to run about a billion different 12-mer peptides past the simple aromatic hydrocarbon naphthalene (immobilized on a surface via 2-naphthylamine). The usual phage-library techniques (several rounds of infection into E. coli followed by more selectivity testing against bound naphthalene and against control surfaces with no ligand) gave a specific 12-mer peptide. It's HFTFPQQQPPRP, for those who'd like to make some. Note: I typo-ed that sequence the first time around, giving it only one phenylalanine, unhelpfully.
Now, an oligopeptide isn't the first thing you'd imagine being a selective binder to a simple aromatic hydrocarbon, but this one not only binds naphthalene, but it has good selectivity versus benzene (34-fold), while anthracene and pyrene weren't bound at all. From the sequence above, those of you who are peptide geeks will have already figured out roughly how it does it: the phenylalanines are pi-stacking, while the proline(s) make a beta-turn structure. Guessing that up front would still not have helped you sort through the possibilities, it's safe to say, since that still leaves you with quite a few.
But the starting phage library itself doesn't cover all that much diversity. Consider 20 amino acids at twelve positions: 4.096 times ten to the fifteenth. The commercial library covers less than one millionth of the possible oligopeptide space, and we're completely ignoring disulfide bridges. To apply the well-known description from the Hitchhiker's Guide to the Galaxy, chemical space is big. "Really big. You just won't believe how vastly, hugely, mindbogglingly big it is. . ."
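Here's that coverage arithmetic as a quick sketch. The "about a billion" library size is the figure from the post above, and like the post, this counts only the 20 natural amino acids and ignores disulfide bridges.

```python
# How much of 12-mer peptide space does a billion-member phage library cover?
amino_acids = 20
positions = 12
library_members = 1_000_000_000  # "about a billion" 12-mers, per the post

sequence_space = amino_acids ** positions      # every possible 12-mer
coverage = library_members / sequence_space    # fraction actually sampled

print(f"possible 12-mers: {sequence_space:.3e}")
print(f"fraction covered: {coverage:.2e}")
```

The fraction comes out at a few parts in ten million, comfortably under the one-in-a-million mark.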
+ TrackBacks (0) | Category: Chemical Biology
April 1, 2013
Nature Chemical Biology has an entire issue on target selection and target validation, and it looks well worth a read. I'll have more to say about some of the articles in it, but I wanted to mention a point that comes up in the introductory comment, "Stay On Target". This is the key point: "Chemical probes and drugs are fundamentally distinct entities".
A drug-company scientist's first reaction might be (as mine was) to think "That's true. The bar is higher for drugs". But the editorial goes on to say that this isn't the case, actually:
For example, multiple authors emphasize that when it comes to in-cell selectivity between on- and off-target activity, chemical probes should be held to a higher standard than drugs, as clinical responses may in fact improve from off-target activity (via polypharmacology), whereas the interpretation of biological responses to chemical probes requires the deconvolution of outcomes associated with on- and off-target activities.
They're right. A drug is defined by its effects in a living creature (I'm tempted to add "Preferably, one that is willing to pay for it"). A chemical probe, on the other hand, is defined by its specificity. It's important not to confuse the two - you can get all excited about how specific your drug candidate is, how exquisitely it hits its target, but (as we have proven over and over in this business) that means nothing if hitting that target isn't clinically meaningful. Being impressed by the specificity of a chemical probe compound, on the other hand, is entirely appropriate - but no one should think that this makes it closer to being a drug.
These concepts came up at the EMBL Chemical Biology meeting I attended last fall, and anyone doing work in the field would do well to keep them in mind. If you don't, you risk producing the worst sorts of compounds. On one end of the spectrum, you have the wonderfully selective compound that has eaten up vast amounts of money in development costs, but does nothing that anyone finds useful. And on the other end of that scale, you have so-called probe compounds that probably hit all sorts of other things, rendering any results in any system past a single purified protein suspect. Stay out of both of those mudpits if you can.
+ TrackBacks (0) | Category: Chemical Biology
March 27, 2013
I wrote here about DNA-barcoding of huge (massively, crazily huge) combichem libraries, a technology that apparently works, although one can think of a lot of reasons why it shouldn't. This is something that GlaxoSmithKline bought by acquiring Praecis some years ago, and there are others working in the same space.
For outsiders, the question has long been "What's come out of this work?" And there is now at least one answer, published in a place where one might not notice it: this paper in Prostaglandins and Other Lipid Mediators. It's not a journal whose contents I regularly scan. But this is a paper from GSK on a soluble epoxide hydrolase inhibitor, and therein one finds:
sEH inhibitors were identified by screening large libraries of drug-like molecules, each attached to a DNA “bar code”, utilizing DNA-encoded library technology developed by Praecis Pharmaceuticals, now part of GlaxoSmithKline. The initial hits were then synthesized off of DNA, and hit-to-lead chemistry was carried out to identify key features of the sEH pharmacophore. The lead series were then optimized for potency at the target, selectivity and developability parameters such as aqueous solubility and oral bioavailability, resulting in GSK2256294A. . .
That's the sum of the med-chem in the article, which certainly compresses things, and I hope that we see a more complete writeup at some point from a chemistry perspective. Looking at the structure, though, this is a triaminotriazine-derived compound (as in the earlier work linked to in the first paragraph), so yes, you apparently can get interesting leads that way. How different this compound is from the screening hit is a good question, but it's noteworthy that a diaminotriazine's worth of its heritage is still present. Perhaps we'll eventually see the results of the later-generation chemistry (non-triazine).
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | Drug Assays | Drug Development
March 20, 2013
Here's an ingenious use for DNA that never would have occurred to me. David Liu and co-workers have been using DNA-templated reactions for some time, though, so it's the sort of thing that would have occurred to them: using the information of a DNA sequence to make other kinds of polymers entirely.
The schematic above gives you the idea. Each substrate has a peptide nucleic acid (PNA) pentamer, which recognizes a particular DNA codon, and some sort of small-molecule monomer piece for the eventual polymer, with cleavable linkers holding these two domains together. The idea is that when these things line up on the DNA, their reactive ends will be placed in proximity to each other, setting up the bond formation in the order that you want.
Even so, they found that if you use building blocks whose ends can react with each other intramolecularly (A----B), they tend to do that as a side reaction and mess things up. So the most successful runs had an A----A type compound on one codon, with a B----B one on the next, and so on. So what chemical reactions were suitable? Amide formation didn't get very far, and reductive amination failed completely. Hydrazone and oxime formation actually worked, though, although you can tell that Liu et al. weren't too excited about pursuing that avenue much further. But the good ol' copper-catalyzed acetylene/azide "click" reaction came through, and appears to have been the most reliable of all.
That platform was used to work out some of the other features of the system. Chain length on the individual pieces turned out not to be too big a factor (Whitesides may have been right again on this one). A nice mix-and-match experiment with various azides and acetylenes on different PNA codon recognition sequences showed that the DNA was indeed templating things in the way that you would expect from molecular recognition. Pushing the system by putting rather densely functionalized spacers (beta-peptide sequences) in the A----A and B----B motifs also worked well, as did pushing things to make 4-, 8-, and even 16-mers. By the end, they'd produced completely defined triazole-linked beta-peptide polymers of 90 residues, with a molecular weight of 26 kD, which pushes things into the realm of biomolecular sizes.
You can, as it turns out, take a sample of such a beast (with the DNA still attached) and subject it to PCR, amplifying your template again. That's important, because it's the sort of thing you could imagine doing with a library of these things, using some sort of in vitro selection criterion for activity, and then identifying the sequence of the best one by using the attached DNA as a bar-code readout. This begins to give access to a number of large and potentially bioactive molecules that otherwise would be basically impossible to synthesize in any defined form. Getting started is not trivial, but once you get things going, it looks like you could generate a lot of unusual stuff. I look forward to seeing people take up the challenge!
+ TrackBacks (0) | Category: Chemical Biology
February 27, 2013
There's an interesting addendum to yesterday's post about natural product fragments. Dan Erlanson was pointing out that many of the proposed fragments were PAINS, and that prompted Jonathan Baell (author of the original PAINS paper) to leave a comment there mentioning this compound. Yep, you can buy that beast from Millipore, and it's being sold as a selective inhibitor of two particular enzymes. (Here's the original paper describing it). If it's really that selective, I will join one of those Greek monasteries where they eat raw onions and dry bread, and spend my time in atonement for ever thinking that a double nitrophenyl Schiff base enone with an acrylamide on it might be trouble.
Honestly, guys. Do a Ben Cravatt-style experiment across a proteome with that thing, and see what you get. I'm not saying that it's going to absolutely label everything it comes across, but it's surely going to stick to more than two things, and have more effects than you can ascribe to those "selective" actions.
+ TrackBacks (0) | Category: Chemical Biology | Drug Assays
February 1, 2013
The short answer is "by looking for compounds that grow beta cells". That's the subject of this paper, a collaboration between Peter Schultz's group and the Novartis GNF. Schultz's group has already published on cell-based phenotypic screens in this area, where they're looking for compounds that could be useful in restoring islet function in patients with Type I diabetes.
These studies have used a rat beta-cell line (R7T1) that can be cultured, and they do good ol' phenotypic screening to look for compounds that induce proliferation (while not inducing it across the board in other cell types, of course). I'm a big fan of such approaches, but this is a good time to mention their limitations. You'll notice a couple of key words in that first sentence, namely "rat" and "cultured". Rat cells are not human cells, and cell lines that can be grown in vitro are not like primary cells from a living organism, either. If you base your entire approach this way, you run the risk of finding compounds that will, well, only work on rat cells in a dish. The key is to shift to the real thing as quickly as possible, to validate the whole idea.
That's what this paper does. The team has also developed an assay with primary human beta cells (which must be rather difficult to obtain), which are dispersed and plated. The tricky part seems to be keeping the plates from filling up with fibroblast cells, which are rather like the weeds of the cell culture world. In this case, their new lead compound (a rather leggy beast called WS6) induced proliferation of both rat and human cells.
They took it on to an even more real-world system, mice that had been engineered to have a switchable defect in their own beta cells. Turning these animals diabetic, followed by treatment with the identified molecule (5 mpk, every other day), showed that it significantly lowered glucose levels compared to controls. And biopsies showed significantly increased beta-cell mass in the treated animals - all together, about as stringent a test as you can come up with in Type I studies.
So how does WS6 accomplish this? The paper goes further into affinity experiments with a biotinylated version of the molecule, which pulled down both the kinase IKK-epsilon and another target, ErbB3 binding protein-1 (EBP1). An IKK inhibitor had no effect in the cell assay, interestingly, while siRNA experiments for EBP1 showed that knocking it down could induce proliferation. Doing both at the same time, though, had the most robust effect of all. The connection looks pretty solid.
Now, is WS6 a drug? Not at all - here's the conclusion of the paper:
In summary, we have identified a novel small molecule capable of inducing proliferation of pancreatic β cells. WS6 is among a few agents reported to cause proliferation of β cells in vitro or in vivo. While the extensive medicinal chemistry that would be required to improve the selectivity, efficacy, and tolerability of WS6 is beyond the scope of this work, further optimization of WS6 may lead to an agent capable of promoting β cell regeneration that could ultimately be a key component of combinatorial therapy for this complex disease.
Exactly so. This is excellent, high-quality academic research, and just the sort of thing I love to see. It tells us useful, actionable things that we didn't know about an important disease area, and it opens the door for a real drug discovery effort. You can't ask for more than that.
+ TrackBacks (0) | Category: Chemical Biology | Diabetes and Obesity | Drug Assays
January 17, 2013
Here's a recent paper in J. Med. Chem. on halogen bonding in medicinal chemistry. I find the topic interesting, because it's an effect that certainly appears to be real, but is rarely (if ever) exploited in any kind of systematic way.
Halogens, especially the lighter fluorine and chlorine, are widely used substituents in medicinal chemistry. Until recently, they were merely perceived as hydrophobic moieties and Lewis bases in accordance with their electronegativities. Much in contrast to this perception, compounds containing chlorine, bromine, or iodine can also form directed close contacts of the type R–X···Y–R′, where the halogen X acts as a Lewis acid and Y can be any electron donor moiety. . .
What seems to be happening is that the electron density around the halogen atom is not as smooth as most of us picture it. You'd imagine a solid cloud of electrons around the bromine atom of a bromoaromatic, but in reality, there seems to be a region of slight positive charge (the "sigma hole") out on the far end. (As a side effect, this gives you more of a circular stripe of negative charge as well). Both these effects have been observed experimentally.
Now, you're not going to see this with fluorine; that one is more like most of us picture it (and to be honest, fluorine's weird enough already). But as you get heavier, things become more pronounced. That gives me (and probably a lot of you) an uneasy feeling, because traditionally we've been leery of putting the heavier halogens into our molecules. "Too much weight and too much hydrophobicity for too little payback" has been the usual thinking, and often that's true. But it seems that these substituents can actually earn out their advance in some cases, and we should be ready to exploit those, because we need all the help we can get.
Interestingly, you can increase the effect by adding more fluorines to the haloaromatic, which emphasizes the sigma hole. So you have that option, or you can take a deep breath, close your eyes, and consider. . .iodos:
Interestingly, the introduction of two fluorines into a chlorobenzene scaffold makes the halogen bond strength comparable to that of unsubstituted bromobenzene, and 1,3-difluoro-5-bromobenzene and unsubstituted iodobenzene also have a comparable halogen bond strength. While bromo and chloro groups are widely employed substituents in current medicinal chemistry, iodo groups are often perceived as problematic. Substituting an iodoarene core by a substituted bromoarene scaffold might therefore be a feasible strategy to retain affinity by tuning the Br···LB (Lewis base) halogen bond to similar levels as the original I···LB halogen bond.
As someone who values ligand efficiency, the idea of putting in an iodine gives me the shivers. A fluoro-bromo combo doesn't seem much more attractive, although almost anything looks good compared to a single atom that adds 127 mass units at a single whack. But I might have to learn to love one someday.
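To put rough numbers on that complaint, here is a minimal sketch of the mass bookkeeping. It assumes standard atomic masses, and it counts the net change from swapping an aromatic hydrogen for a halogen, which is why the iodine figure comes out a shade under 127. The function name is just for illustration.

```python
# Standard atomic masses (g/mol), rounded to three decimals
ATOMIC_MASS = {"H": 1.008, "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904}


def mass_added(halogen, count=1):
    """Net mass change from replacing `count` aromatic hydrogens with a halogen."""
    return count * (ATOMIC_MASS[halogen] - ATOMIC_MASS["H"])


iodo = mass_added("I")                                   # one iodine
bromo_difluoro = mass_added("Br") + mass_added("F", 2)   # one bromine plus two fluorines

print(f"iodo:           +{iodo:.1f}")
print(f"bromo-difluoro: +{bromo_difluoro:.1f}")
```

By this count the iodine costs about 126 mass units and the bromo-difluoro pattern from the quoted passage about 115, so the swap buys back only about a dozen mass units.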
The paper includes a number of examples of groups that seem to be capable of interacting with halogens, and some specific success stories from recent literature. It's probably worth thinking about these things similarly to the way we think about hydrogen bonds - valuable, but hard to obtain on purpose. They're both directional, and trying to pick up either one can cause more harm than good if you miss. But keep an eye out for something in your binding site that might like a bit of positive charge poking at it. Because I can bet that you never thought to address it with a bromine atom!
Update: in the spirit of scientific inquiry, I've just sent in an iodo intermediate from my current work for testing in the primary assay. It's not something I would have considered doing otherwise, but if anyone gives me any grief, I'll tell them that it's 2013 already and I'm following the latest trends in medicinal chemistry.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | In Silico
December 19, 2012
Well, I've been away from the computer a good part of the day, but I return to find that the author of the NSF press release that I spoke unkindly of has shown up in the comments to that post. I'm going to bring those up here to make sure that his objections get a fair hearing:
I wrote this press release, and I am a bit concerned that instead of discussing the research with myself, or more importantly the researchers, you decide to attack the text.
We presented information based on research that has been underway for some time, at least two years with NSF peer-reviewed support.
Additionally, we were careful to not overstate either the technology or the impact, but to present an illustration of what the technology can do in the limited space that a press release allows.
A journalist is expected to follow the initial reading of the press release with questions for the researchers involved -- not attack the limited text that we provide as an introduction.
In my eleven years at NSF, I have never had someone attack my work -- particularly without first getting their facts straight.
Please contact the researchers to discuss the technology and limit your criticism for those thongs for which you are informed.
Media Officer for Engineering
National Science Foundation
(To add, my supervisor pointed out a stellar typo in my last line.
I'm fear that's where the discussion will go next, but if you do wish to learn more about the actual research you are disparaging, please do contact the researchers to learn more about the technology and the approach.)
Several regular readers have already responded in the comments section to that earlier post, making the point that experienced drug discovery scientists found the language in the press release hard to believe (and reminiscent of overhyped work from the past). Josh Chamot's response is reproduced here:
Thank you for the thoughtful responses. This is exactly the engagement I was hoping for.
First, I agree that hype is never what we want to communicate -- and I appreciate that skepticism is critical to ensuring accuracy and the complete communication of news. However, I do hope many of you will explore the research further so that any skepticism is completely informed.
I want to be clear that I have no intention of misleading the research or pharma communities, nor do I want to give false hope to those who might need any of the treatments that we referenced. Our language was intended to convey that the breakthrough to date is exciting, but clearly more work is needed before this can start producing drugs for patients -- and I believe we stated this.
Through links to additional information (such as the full patent application) and clear contact information for the principal investigator, it is our hope that the primary audience for the press release (reporters) will present a thorough and complete account of the work.
We do not wish to mislead, but we also cannot convey a full news story in press release format. The intent is to serve as an alert, and importantly, an accurate one.
Journalists are the primary audience for the press releases, and our system of information is reliant on their services. To the best of my knowledge, the information we presented on Parabon is accurate and states only results that Parabon has demonstrated and announced in their patent application -- the starting point for a journalist to explore the story further.
As background, the pieces I work on cover research efforts that are originally proposed to NSF in a review process informed by peers in the community. Parabon has received both Phase I and Phase II NSF small business funding, so they had succeeded in that competitive peer review twice.
That setting served as a baseline to inform my office that the research approach was a valid starting point -- however, as with almost all NSF research, this is research at the very earliest stages. I can accept that while I wrote the release to reflect this, I was not successful in conveying this clearly. However, the assertions that data in support of the research effort do not exist are incorrect.
The company first came to our office (public affairs) more than two years ago, and it is only now that the company had enough publicly available information for us to pull together an announcement of the technology and some introduction of how it works.
I have some lessons learned here in how to try to clarify caveats, but I stand by my original assertion that the research is valid and exciting. While I have no way to predict Parabon's ultimate success, I do believe that public discussion of their technique can only prove of value to the broader drug development effort -- including the identification of any obstacles that this, or a similar technique, must overcome.
I think what I'll do now is close off the comments to the previous post and have things move over to this entry, with appropriate pointers, so we don't have two discussions going on at the same time. Now, then. I'm not blaming Mr. Chamot for what went out on the wires, because I strongly suspect that he worked with what he was given. It's the people at Parabon that I'd really like to have a word with. If the press release is an accurate reflection of what they wanted to announce, then we have a problem, and it's not with Josh Chamot.
I realize that a press release is, in theory, supposed to be for the press - for reporters to use as a starting point for a real story. But how many of them do that, versus just rewording the release a bit? There are reporters who could pick up on all the problems, but there are many others who might not. The information in the Parabon release, as it stands, makes little sense to those of us who do drug discovery for a living, seems full of overstated claims, and raises many more questions than it answers. Specialists in the field (as many readers here are) will have an immediate and strong reaction to this sort of thing.
And that's one of the purposes of this blog (and of many others): to bring expertise out into the open, to provide people within some specialized area a chance to talk with each other, and to provide people outside it (anyone at all) a chance to sit in and learn about things they otherwise might never hear discussed. I think that the process that Mr. Chamot has described is an older one: scientists describe a discovery of theirs to some sort of press officer, who puts it into some useful and coherent form in order to get the word out to reporters, who then can contact the people involved for more details as they write up their stories for a general readership. That's fine, but these days that whole multistep procedure is subject to disintermediation. And that's what we're seeing right now.
+ TrackBacks (0) | Category: Chemical Biology | Press Coverage
December 18, 2012
I'm having a real problem understanding this press release from the NSF. I've been looking at it for a few days now (it's been sent to me a couple of times in e-mail), and I still can't get a handle on it. And I'm not the only one. I see just this morning that Chemjobber is having the same problem. Here, try some. See how you do:
Using a simple "drag-and-drop" computer interface and DNA self-assembly techniques, researchers have developed a new approach for drug development that could drastically reduce the time required to create and test medications. . ."We can now 'print,' molecule by molecule, exactly the compound that we want," says Steven Armentrout, the principal investigator on the NSF grants and co-developer of Parabon's technology. "What differentiates our nanotechnology from others is our ability to rapidly, and precisely, specify the placement of every atom in a compound that we design."
Say what? Surely they don't mean what it sounds like they mean. But they apparently do:
"When designing a therapeutic compound, we combine knowledge of the cell receptors we are targeting or biological pathways we are trying to affect with an understanding of the linking chemistry that defines what is possible to assemble," says Hong Zhong, senior research scientist at Parabon and a collaborator on the grants. "It's a deliberate and methodical engineering process, which is quite different from most other drug development approaches in use today."
OK, enough. I'd love for atom-by-atom nanotech organic synthesis and precisely targeted drug discovery to be a reality, but they aren't. Not yet. The patent application referenced in the press release is a bit more grounded in reality, but not all that much more:
The present invention provides nanostructures that are particularly well suited for delivery of bioactive agents to organs, tissues, and cells of interest in vivo, and for diagnostic purposes. In exemplary embodiments, the nanostructures are complexes of DNA strands having fully defined nucleotide sequences that hybridize to each other in such a way as to provide a pre-designed three dimensional structure with binding sites for targeting molecules and bioactive agents. The nanostructures are of a pre-designed finite length and have a pre-defined three dimensional structure
Ah, and these complexes of DNA strands will survive after in vivo dosing just exactly how? And will be targeted, via that precisely defined structure, just how? And bind to what, exactly, and with what sort of affinities? And are the binding sites on these DNA thingies, or do they bind to other things, anyway? No, this is a mess. And this press release is an irresponsible mishmosh of hype. I'd be glad to hear about some real results with some real new technology, and I'd like to ask the Parabon people to cough some up. I'd be equally glad to feature them on this blog if they can do so, but not if they're going to start talking like they're from the future and come to save us all. Sheesh.
Update: the discussion on this press release features a number of interesting comments. It's now moved over to this post, for reasons explained there. Thanks!
+ TrackBacks (0) | Category: Chemical Biology | Press Coverage
December 17, 2012
I wrote here about "stapled peptides", which are small modified helical proteins. They've had their helices stabilized by good ol' organic synthesis, with artificial molecular bridging between the loops. There are several ways to do this, but they all seem to be directed towards the same end.
That end is something that acts like the original protein at its binding site, but acts more like a small molecule in absorption, metabolism, and distribution. Bridging those two worlds is a very worthwhile goal indeed. We know of hordes of useful proteins, ranging from small hormones to large growth factors, that would be useful drugs if we could dose them without their being cleared quickly (or not making it into the bloodstream in the first place). Oral dosing is the hardest thing to arrange. The gut is a very hostile place for proteins - there's a lot of very highly developed machinery in there devoted to ripping everything apart. Your intestines will not distinguish the life-saving protein ligand you just took from the protein in a burrito, and will act accordingly. And even if you give things intravenously, as is done with the protein drugs that have actually made it to clinical use (insulin, EPO, etc.), getting their half-lives up to standard can be a real challenge.
So the field of chemically modified peptides and proteins is a big one, because the stakes are high. Finding small molecules that modulate protein-protein interactions is quite painful; if we could just skip that part, we'd be having a better time of it in this industry. There's an entire company (Aileron, just down the road from me) working on this idea, and many others besides. So, how's it going?
Well, this new paper will cause you to wonder about that. It's from groups in Australia and at Genentech (Note: edited for proper credit here), and they get right down to it in the first paragraph:
Stabilized helical peptides are designed to mimic an α-helical structure through a constraint imposed by covalently linking two residues on the same helical face (e.g., residue i with i + 4). “Stapling” the peptide into a preformed helix might be expected to lower the energy barrier for binding by reducing entropic costs, with a concomitant increase in binding affinity. Additionally, stabilizing the peptide may reduce degradation by proteases and, in the case of hydrocarbon linkages, reportedly enhance transport into cells, thereby improving bioavailability and their potential as therapeutic agents. The findings we present here for the stapled BH3 peptide (BimSAHB), however, do not support these claims, particularly in regards to affinity and cell permeability.
They go on to detail their lack of cellular assay success with the reported stapled peptide, and suggest that this is due to lack of cell permeability. And since the non-stapled peptide control was just as effective on artificially permeabilized cells, they did more studies to try to figure out what the point of the whole business is. A detailed binding study showed that the stapled peptide had lower affinity for its targets, with slower on-rates and faster off-rates. X-ray crystallography suggested that modifying the peptide disrupted several important interactions.
Update: After reading the comments so far, I want to emphasize that this paper, as far as I can see, is using the exact same stapled peptide as was used in the previous work. So this isn't just a case of a new system behaving differently; this seems to be the same system not behaving the way that it was reported to.
The entire "staple a peptide to make it a better version of itself" idea comes in for some criticism, too:
Our findings recapitulate earlier observations that stapling of peptides to enforce helicity does not necessarily impart enhanced binding affinity for target proteins and support the notion that interactions between the staple and target protein may be required for high affinity interactions in some circumstances.19 Thus, the design of stapled peptides should consider how the staple might interact with both the target and the rest of the peptide, and particularly in the latter case whether its introduction might disrupt otherwise stabilizing interactions.
That would be more in line with my own intuition, for what it's worth, which is that making such changes to a peptide helix would turn it into another molecule entirely, rather than (necessarily) making it into an enhanced version of what it was before. Unfortunately, at least in this case, this new molecule doesn't seem to have any advantages over the original, at least in the hands of the Genentech group. This is, as they say, very much in contrast to the earlier reports. How to resolve the discrepancies? And how to factor in that Roche has a deal with Aileron for stapled-peptide technology, and this very article is (partly) from Genentech, now a part of Roche? A great deal of dust has just been stirred up; watching it settle will be interesting. . .
+ TrackBacks (0) | Category: Cancer | Chemical Biology | Pharmacokinetics
November 26, 2012
I don't know how many readers have been following this, but there's been some interesting work over the last few years in using streptavidin (a protein that's an old friend of chemical biologists everywhere) as a platform for new catalyst systems. This paper in Science (from groups at Basel and Colorado State) has some new results in the area, along with a good set of leading references. (One of the authors has also published an overview in Accounts of Chemical Research). Interestingly, this whole idea seems to trace back to a George Whitesides paper from back in 1978, if you can believe that.
(Strept)avidin has an extremely well-characterized binding site, and its very tight interaction with biotin has been used as a sort of molecular duct tape in more experiments than anyone can count. Whitesides realized back during the Carter administration that the site was large enough to accommodate a metal catalyst center, and this new paper is the latest in a string of refinements of that idea, this time using a rhodium-catalyzed C-H activation reaction.
A biotinylated version of the catalyst did indeed bind streptavidin, but this system showed very low activity. It's known, though, that the reaction needs a base to work, so the next step was to engineer a weakly basic residue nearby in the protein. A glutamate sped things up, and an aspartate even more (with the closely related asparagine showing up just as poorly as the original system, which suggests that the carboxylate really is doing the job). A lysine/glutamate double mutant gave even better results.
The authors then fine-tuned that system for enantioselectivity, mutating other residues nearby. Introducing aromatic groups increased both the yield and the selectivity, as it turned out, and the eventual winner was run across a range of substrates. These varied quite a bit, with some combinations showing very good yields and pretty impressive enantioselectivities for this reaction, which has never until now been performed asymmetrically, but others not performing as well.
And that's the promise (and the difficulty) with enzyme systems. Working on that scale, you're really bumping up against individual parts of your substrates on an atomic level, so results tend, as you push them, to bin into Wonderful and Terrible. An enzymatic reaction that delivers great results across a huge range of substrates is nearly a contradiction in terms; the great results come when everything fits just so. (Thus the Codexis-style enzyme optimization efforts). There's still a lot of brute force involved in this sort of work, which makes techniques to speed up the brutal parts very worthwhile. As this paper shows, there's still no substitute for Just Trying Things Out. The structure can give you valuable clues about where to do that empirical work (otherwise the possibilities are nearly endless), but at some point, you have to let the system tell you what's going on, rather than the other way around.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News
November 15, 2012
I like to highlight phenotypic screening efforts here sometimes, because there's evidence that they can lead to drugs at a higher-than-usual rate. And who couldn't use some of that? Here's a new example from a team at the Broad Institute.
They're looking at the very popular idea of "cancer stem cells" (CSCs), a population of cells in some tumors that appear to be disproportionately resistant to current therapies (and disproportionately responsible for tumor relapse and regrowth). This screen uses a surrogate breast cell line, with E-cadherin knocked down, which seems to give the dedifferentiated phenotype you'd want to target. That's a bit risky, using an artificial system like that, but as the authors correctly point out, isolating a pure population of the real CSCs is difficult-to-impossible, and they're very poorly behaved in cell culture. So until those problems are solved, you have your choice - work on something that might translate over to the real system, or ditch the screening idea for now entirely. I think the first is worth a shot, as long as its limitations are kept in mind.
This paper does go on to do something very important, though - they use an isogenic cell line as a counterscreen, very close to the target cells. If you find compounds that hit the targets but not these controls, you have a lot more confidence that you're getting at some difference that's tied to the loss of E-cadherin. Using some other cell line as a control leaves too many doors open too wide; you could see "confirmed hits" that are taking advantage of totally irrelevant differences between the cell lines instead.
They ran a library of about 300,000 compounds (the MLSMR collection) past the CSC model cells, and about 3200 had the desired toxic effect on them. At this point, the team removed the compounds that were flagged in PubChem as toxic to normal mammalian cell lines, and also removed compounds that had hit in more than 10% of the assays they'd been through, both of which I'd say are prudent moves. Retesting the remaining 2200 compounds gave a weird result: at the highest concentration (20 micromolar), 97 per cent of them were active. I probably would have gotten nervous at that point, wondering if something had gone haywire with the assay, and I'll bet that a few folks at the Broad felt the same way.
But when they used the isogenic cell line, things narrowed down rather quickly. Only 26 compounds showed reasonable potency on the target cells along with at least a 25-fold window for toxicity to the isogenic cells. (Without that screen, then, you'd have been chasing an awful lot of junk). Then they ordered up fresh samples of these, which is another step that, believe me, you don't want to neglect. A number of compounds appear to have not been quite what they were supposed to be (not an uncommon problem in a big screening collection; you trust the labels unconditionally at your own peril).
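Here's a minimal sketch of what that counterscreen filter amounts to. The 25-fold window comes from the paper as described above, but the potency cutoff (1 micromolar here), the field names, and the three example compounds are placeholders of my own, since "reasonable potency" isn't spelled out.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    ic50_target_um: float   # potency on the E-cadherin knockdown cells (hypothetical field)
    ic50_control_um: float  # potency on the isogenic control cells (hypothetical field)


def selectivity_window(hit):
    """Fold difference between control-cell and target-cell IC50 values."""
    return hit.ic50_control_um / hit.ic50_target_um


def selective_hits(hits, max_target_ic50_um=1.0, min_window=25.0):
    """Keep compounds that are potent on target cells AND spare the isogenic controls.

    max_target_ic50_um is an assumed stand-in for "reasonable potency";
    min_window is the 25-fold selectivity requirement.
    """
    return [
        h for h in hits
        if h.ic50_target_um <= max_target_ic50_um
        and selectivity_window(h) >= min_window
    ]


# Made-up compounds, for illustration only
hits = [
    Hit("cmpd_A", ic50_target_um=0.5, ic50_control_um=20.0),   # 40-fold window, potent
    Hit("cmpd_B", ic50_target_um=0.5, ic50_control_um=5.0),    # only 10-fold window
    Hit("cmpd_C", ic50_target_um=5.0, ic50_control_um=200.0),  # 40-fold, but too weak
]
print([h.name for h in selective_hits(hits)])
```

Both conditions have to hold at once: a wide window on a weak compound is not much use, and neither is a potent compound that kills the control cells just as happily.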
In the end, two acylhydrazone compounds ended up retaining their selectivity after rechecking. So you can see how things narrow down in these situations: 300K to 2K to 26 to 2, and that's not such an unusual progression at all. The team made a series of analogs around the lead chemical matter, and then settled on the acylhydrazone compound shown (ML239) as the best in show. It's not a beauty. There seems to be some rule that the more rigorous and unusual a phenotypic screen, the uglier the compounds that emerge from it. I'm only half kidding, or maybe a bit less - there are some issues to think about in there, and that topic is worth a post of its own.
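The attrition is easier to appreciate as fractions. This is a quick sketch using the approximate counts quoted above (the stage labels are my own paraphrases, and the counts are round numbers, so treat the percentages as ballpark figures).

```python
# Approximate compound counts at each stage of the screen, as quoted in the post
stages = [
    ("screened (MLSMR collection)", 300_000),
    ("toxic to the CSC model cells", 3_200),
    ("left after toxicity and promiscuity filters", 2_200),
    ("selective against the isogenic line", 26),
    ("still selective on fresh samples", 2),
]


def attrition(stages):
    """Return (label, count, fraction of previous stage, fraction of library) per step."""
    library_size = stages[0][1]
    rows = []
    for (_, previous), (label, count) in zip(stages, stages[1:]):
        rows.append((label, count, count / previous, count / library_size))
    return rows


for label, count, step_fraction, overall_fraction in attrition(stages):
    print(f"{label:45s} {count:>7,d}  {step_fraction:8.2%} of previous  "
          f"{overall_fraction:.4%} of library")
```

That works out to a primary hit rate of roughly one percent, and a final yield of about one compound per 150,000 screened.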
More specifically, the obvious concern is that fulvene-looking pyrrole thingie on the right (I use "thingie" in its strict technical sense here). That's not a happy-looking (that is, particularly stable-looking) group. The acylhydrazine part might raise eyebrows with some people, but Rimonabant (among other compounds) shows that that functional group can be part of a drug. Admittedly, Rimonabant went down with all hands, but it wasn't because of the acylhydrazine. And the trichloroaryl group isn't anyone's favorite, either, but in this context, it's just sort of a dessert topping, in an inverse sense.
But the compound appears to be the real thing, as a pharmacological tool. It was also toxic to another type of breast cancer cell that had had its E-cadherin disrupted, and to a further nonengineered breast cancer cell line. Now comes the question: how does this happen? Gene expression profiling showed a variety of significant changes, with all sorts of cell death and free radical scavenging things altered. By contrast, when they did the same profiling on the isogenic controls, only five genes were altered to any significant extent, and none of those overlapped with the target cells. This is very strong evidence that something specific and important is being targeted here. A closer analysis of all the genes suggests the NF-kappaB system, and within that, perhaps a protein called TRIB3. Further experiments will have to be done to nail that down, but it's a good start. (And yes, in case you were wondering, TRIB3 does, in fact, stand for "tribble-3", and yes, that name did originate with the Drosophila research community, and how did you ever guess?)
So overall, I'd say that this is a very solid example of how phenotypic screening is supposed to work. I recommend it to people who are interested in the topic - and to people who aren't, either, because hey, you never know when it might come in handy. This is how a lot of new biology gets found, through identifying useful chemical matter, and we can never have too much of it.
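For anyone who wants to put numbers on the narrowing mentioned above (300K to 2K to 26 to 2), here is a minimal Python sketch of the stage-by-stage survival rates. The counts are the rounded figures from the post; the stage labels are my own placeholders, not the paper's terminology.

```python
# Compound counts at each stage: the rounded figures quoted in the post.
# The stage names are illustrative labels, not taken from the paper.
funnel = [
    ("screened", 300_000),
    ("primary hits", 2_000),
    ("selective hits", 26),
    ("selective on recheck", 2),
]


def stage_pass_rates(stages):
    """Return (from_stage, to_stage, fraction surviving) for each step."""
    rates = []
    for (name_a, n_a), (name_b, n_b) in zip(stages, stages[1:]):
        rates.append((name_a, name_b, n_b / n_a))
    return rates


def overall_pass_rate(stages):
    """Fraction of the starting collection left at the final stage."""
    return stages[-1][1] / stages[0][1]


if __name__ == "__main__":
    for name_a, name_b, frac in stage_pass_rates(funnel):
        print(f"{name_a} -> {name_b}: {frac:.4%}")
    print(f"overall: {overall_pass_rate(funnel):.6%}")
```

Each step throws away well over 90% of what came into it, which is why you need to start with hundreds of thousands of compounds to end up with a couple.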
+ TrackBacks (0) | Category: Cancer | Chemical Biology | Drug Assays
November 8, 2012
We're getting closer to real-time X-ray structures of protein function, and I think I speak for a lot of chemists and biologists when I say that this has been a longstanding dream. X-ray structures, when they work well, can give you atomic-level structural data, but they've been limited to static time scales. In the old, old days, structures of small molecules were a lot of work, and the structure of a protein took years of hard labor and was obvious Nobel Prize material. As time went on, brighter X-ray sources and much better detectors sped things up (since a lot of the X-rays deflected from a large compound are of very low intensity), and computing power came along to crunch through the piles of data thus generated. These days, X-ray structures are generated for systems of huge complexity and importance. Working at that level is no stroll through the garden, but more tractable protein structures are generated almost routinely (although growing good protein crystals is still something of a dark art, and is accomplished through what can accurately be called enlightened brute force).
But even with synchrotron X-ray sources blasting your crystals, you're still getting a static picture. And proteins are not static objects; the whole point of them is how they move (and for enzymes, how they get other molecules to move in their active sites). I've heard Barry Sharpless quoted to the effect that understanding an enzyme by studying its X-ray structures is like trying to get to know a person by visiting their corpse. I haven't heard him say that (although it sounds like him!), but whoever said it was correct.
Comes now this paper in PNAS, a multinational effort with the latest on the attempts to change that situation. The team is looking at photoactive yellow protein (PYP), a blue-light receptor protein from a purple sulfur bacterium. Those guys vigorously swim away from blue light, which they find harmful, and this seems to be the receptor that alerts them to its presence. And the inner workings of the protein are known, to some extent. There's a p-coumaric acid in there, bound to a Cys residue, and when blue light hits it, the double bond switches from trans to cis. The resulting conformational change is the signaling event.
But while knowing things at that level is fine (and took no small amount of work), there are still a lot of questions left unanswered. The actual isomerization is a single-photon event and happens in a picosecond or two. But the protein changes that happen after that, well, those are a mess. A lot of work has gone into trying to unravel what moves where, and when, and how that translates into a cellular signal. And although this is a mere purple sulfur bacterium (What's so mere? They've been on this planet a lot longer than we have), these questions are exactly the ones that get asked about protein conformational signaling all through living systems. The rods and cones in your eyes are doing something very similar as you read this blog post, as are the neurotransmitter receptors in your optic nerves, and so on.
This technique, variations of which have been coming on for some years now, uses multiple wavelengths of X-rays simultaneously, and scans them across large protein crystals. Adjusting the timing of the X-ray pulse compared to the light pulse that sets off the protein motion gives you time-resolved spectra - that is, if you have extremely good equipment, world-class technique, and vast amounts of patience. (For one thing, this has to be done over and over again from many different angles).
And here's what's happening: first off, the cis structure is quite weird. The carbonyl is 90 degrees out of the plane, making (among other things) a very transient hydrogen bond with a backbone nitrogen. Several dihedral angles have to be distorted to accommodate this, and it's a testament to the weirdness of protein active sites that it exists at all. It then twangs back to a planar conformation, but at the cost of breaking another hydrogen bond back at the phenolate end of things. That leaves another kind of strain in the system, which is relieved by a shift to yet another intermediate structure through a dihedral rotation, and that one in turn goes through a truly messy transition to a blue-shifted intermediate. That involves four hydrogen bonds and a 180-degree rotation in a dihedral angle, and seems to be the weak link in the whole process - about half the transitions fail and flop back to the ground state at that point. That also lets a crucial water molecule into the mix, which sets up the transition to the actual signaling state of the protein.
If you want more details, the paper is open-access, and includes movie files of these transitions and much more detail on what's going on. What we're seeing is light energy being converted (and channeled) into structural strain energy. I find this sort of thing fascinating, and I hope that the technique can be extended in the way the authors describe:
The time-resolved methodology developed for this study of PYP is, in principle, applicable to any other crystallizable protein whose function can be directly or indirectly triggered with a pulse of light. Indeed, it may prove possible to extend this capability to the study of enzymes, and literally watch an enzyme as it functions in real time with near-atomic spatial resolution. By capturing the structure and temporal evolution of key reaction intermediates, picosecond time-resolved Laue crystallography can provide an unprecedented view into the relations between protein structure, dynamics, and function. Such detailed information is crucial to properly assess the validity of theoretical and computational approaches in biophysics. By combining incisive experiments and theory, we move closer to resolving reaction pathways that are at the heart of biological functions.
Speed the day. That's the sort of thing we chemists need to really understand what's going on at the molecular level, and to start making our own enzymes to do things that Nature never dreamed of.
+ TrackBacks (0) | Category: Analytical Chemistry | Biological News | Chemical Biology | Chemical News
October 30, 2012
The Atlantic is out with a list of "Brave Thinkers", and one of them is Jay Bradner at Harvard Medical School. He's on there for JQ1, a small-molecule bromodomain ligand that was reported in 2010. (I note, in passing, that once again nomenclature has come to the opposite of our rescue, since bromodomains have absolutely nothing to do with bromine, in contrast to 98% of all the other words that begin with "bromo-")
These sorts of compounds have been very much in the news recently, as part of the whole multiyear surge in epigenetic research. Drug companies, naturally, are looking to the epigenetic targets that might be amenable to small-molecule intervention, and bromodomains seem to qualify (well, some of them do, anyway).
At any rate, JQ1 is a perfectly reasonable probe compound for bromodomain studies, but it got a lot of press a couple of months ago as a potential male contraceptive. I found all that wildly premature - a compound like this one surely sets off all kinds of effects in vivo, and disruption of spermatogenesis is only one of them. Note (PDF) that it hits a variety of bromodomain subtypes, and we only have the foggiest notion of what most of these are doing in real living systems.
The Atlantic, for its part, makes much of Bradner's publishing JQ1 instead of patenting it:
The monopoly on developing the molecule that Bradner walked away from would likely have been worth a fortune (last year, the median value for U.S.-based biotech companies was $370 million). Now four companies are building on his discovery—which delights Bradner, who this year released four new molecules. “For years, drug discovery has been a dark art performed behind closed doors with the shades pulled,” he says. “I would be greatly satisfied if the example of this research contributed to a change in the culture of drug discovery.”
But as Chemjobber rightly says, the idea that Bradner walked away from a fortune is ridiculous. JQ1 is not a drug, nor is it ever likely to become a drug. It has inspired research programs to find drugs, but they likely won't look much (or anything) like JQ1, and they'll do different things (for one, they'll almost surely be more selective). In fact, chasing after that sort of selectivity is one of the things that Bradner's own research group appears to be doing - and quite rightly - while his employer (Dana-Farber) is filing patent applications on JQ1 derivatives. Quite rightly.
Patents work differently in small-molecule drug research than most people seem to think. (You can argue, in fact, that it's one of the areas where the system works most like it was designed to, as opposed to often-abominable patent efforts in software, interface design, business methods, and the like). People who've never had to work with them have ideas about patents being dark, hidden boxes of secrets, but one of the key things about a patent is disclosure. You have to tell people what your invention is, what it's good for, and how to replicate it, or you don't have a valid patent.
Admittedly, there are patent applications that do not make all of these steps easy - a case in point would be the ones from Exelixis. (I wrote here about my onetime attempts to figure out the structures of some of their lead compounds from their patent filings. Not long ago I had a chance to speak with someone who was there at the time, and he was happy to hear that I'd come up short, saying that this had been exactly the plan). But at the same time, all their molecules were in there, along with all the details of how to make them. And the claims of the patents detailed exactly why they were interested in such compounds, and what they planned to do with them as drugs. You could learn a lot about what Exelixis was up to; it was just that finding out the exact structure of the clinical candidate was tricky. A patent application on JQ1 would have actually ended up disclosing most (or all) of what the publication did.
I'm not criticizing Prof. Bradner and his research group here. He's been doing excellent work in this area, and his papers are a pleasure to read. But the idea that Harvard Medical School and Dana-Farber would walk away from a pharma fortune is laughable.
+ TrackBacks (0) | Category: Cancer | Chemical Biology | Drug Development | Patents and IP
October 17, 2012
Zafgen is a startup in the Boston area that's working on a novel weight-loss drug called beloranib. Their initial idea was that they were inhibiting angiogenesis in adipose tissue, through inhibition of methionine aminopeptidase-2. But closer study showed that while the compound was indeed causing significant weight loss in animal models, it wasn't through that mechanism. Blood vessel formation wasn't affected, but the current thinking is that Met-AP2 inhibition is affecting fatty acid synthesis and causing more usage of lipid stores.
But when they say "novel", they do mean it. Behold one of the more unlikely-looking drugs to make it through Phase I:
Natural-product experts in the audience might experience a flash of recognition. That's a derivative of fumagillin, a compound from Aspergillus that's been kicking around for many years now. And its structure brings up a larger point about reactive groups in drug molecules, the kind that form covalent bonds with their targets.
I wrote about covalent drugs here a few years ago, and the entire concept has been making a comeback. (If anyone was unsure about that, Celgene's purchase of Avila was the convincer). Those links address the usual pros and cons of the idea: on the plus side, slow off rates are often beneficial in drug mechanisms, and you don't get much slower than covalency. On the minus side, you have to worry about selectivity even more, since you really don't want to go labeling across the living proteome. You have the mechanisms of the off-target proteins to worry about once you shut them down, and you also have the ever-present fear of setting off an immune response if the tagged protein ends up looking sufficiently alien.
I'm not aware of any published mechanistic studies of beloranib, but it is surely another one of this class, with those epoxides. (Looks like it's thought to go after a histidine residue, by analogy to fumagillin's activity against the same enzyme). But here's another thing to take in: epoxides are not as bad as most people think they are. We organic chemists see them and think that they're just vibrating with reactivity, but as electrophiles, they're not as hot as they look.
That's been demonstrated by several papers from the Cravatt labs at Scripps. (He still is at Scripps, right? You need a scorecard these days). In this work, they showed that some simple epoxides, when exposed to entire proteomes, really didn't label many targets at all compared to the other electrophiles on their list. And here, in an earlier paper, they looked at fumagillin-inspired spiroepoxide probes specifically, and found an inhibitor of phosphoglycerate mutase 1. But a follow-up SAR study of that structure showed that it was very picky indeed - you had to have everything lined up right for the epoxide to react, and very close analogs had no effect. Taken together, the strong implication is that epoxides can be quite selective, and thus can be drugs. You still want to be careful, because the toxicology literature is still rather vocal on the subject, but if you're in the less reactive/more structurally complex/more selective part of that compound space, you might be OK. We'll see if Zafgen is.
+ TrackBacks (0) | Category: Chemical Biology | Diabetes and Obesity | Drug Development
September 28, 2012
This evening's EMBL speaker is Paul Workman on new cancer targets and drug development. He's pointed out that treating cancer (and classifying cancer) by where it's located in the body is actually fairly primitive. Tumor cells in, say, breast cancer surely have more in common with various other types of tumor cells than they do with the normal cells surrounding them.
He claims that we're starting to see attrition rates come down in oncology, and I hope he's right. I see, though, that he's reified the "Valley of Death", which I'm not so sure about. There surely are some ideas in academia that should be moved along to development, but not all of them are worthy. (That's no slur - not all the targets inside the drug companies are worthy either, believe me). I worry that constant referral to a Valley of Death makes it sound as if there's something mysterious going on, when it really doesn't seem that strange to me. This Valley is mostly a gap between what works and what doesn't, rather than between academia and industry.
He also has a good slide on probe compounds versus drugs (here are the details). Probes, he says, need to meet even more stringent criteria for selectivity and potency than drugs do if their purpose is going to be to uncover new biology. Selectivity is usually the hardest barrier. That said, probes have to evolve. You don't find compounds like this right out of an HTS screen, and they're going to need some cycles of med-chem before they're truly ready for use. A less-than-optimal probe shouldn't be seen as a failure, but as an intermediate step.
+ TrackBacks (0) | Category: Chemical Biology
John Overington from the EMBL is talking about the ChEMBL database, which is an impressive collection. One thing that I appreciate is that he's being upfront about the error rates in the data. He takes the reports of trouble seriously, but feels (overall) that considering the amount of data they have, and the amount of annotation associated with it, they've done well.
There are an awful lot of ways that you can work the numbers from their web site, which is both good and bad. If you know what you're doing, you can get some very interesting and potentially useful results, but if you don't, you can mislead yourself more quickly and thoroughly than you ever could by hand. That's common to all powerful tools, naturally.
My talk is right after the next speaker, so I won't be posting for a bit. And no, I will not be writing a critique of my own talk while I'm giving it; that would be a Blog Singularity of some sort.
+ TrackBacks (0) | Category: Chemical Biology
September 27, 2012
Now the conference day is winding up with a big talk by George Whitesides. He's talking about his thoughts on enzyme function, with reference to his group's work using carbonic anhydrase as a model. He praises its stability ("a ceramic brick") and other characteristics, as you might expect from someone who's published an entire review on its use in biophysical studies.
So what makes compounds bind to enzyme sites? His take on the hydrophobic effect is that he thinks it's due as much (or more) to changes in networks of water molecules, rather than just the release of structured water at the protein-ligand contact. The latter is important, for sure, but not the whole story. "There is no one hydrophobic effect", he says, "there are many hydrophobic effects".
Another quote: "There ain't nothin' like water", and I definitely agree. We're used to water, since it's the most common chemical substance that we deal with in our lives, but water is weird.
And there's a lot we don't know about it still. For example, Whitesides has just pointed out that we have a reasonable understanding of surface tension in the bulk phase - but not at all for molecular-sized holes. This is crucial for understanding ligand behavior. His view of protein-ligand binding, he says, is very water-centric. . .
+ TrackBacks (0) | Category: Chemical Biology
Just to emphasize how careful you have to be with all these probes and labels, consider what I'm hearing now from Remigiusz Serwa of the Tate group at Imperial College. His group is looking at farnesylation. People have tried making azido-containing substrates, for later "click" fluorescent labeling of proteins that pick up the label, but the azido group turns out to be a loser here. It's too polar in the greasy world of prenyl groups, and things go haywire.
You'd think that switching the click reaction around would be the answer here - make an alkyne group to be picked up by farnesyltransferase and you're in. But the ones that have been tried so far are terrible substrates for the enzymes. He seems to be on the way to solving that problem, but (interestingly) isn't revealing the structure (yet) of his probe. Must be a manuscript on the way - probably with a patent on the way before that?
+ TrackBacks (0) | Category: Chemical Biology
The latest talk is from Alanna Schepartz of Yale. I had a chance to ride in from the airport with her yesterday, and she gave me a brief preview of her talk, which is on transport of both molecules and information through the plasma membrane of cells. "Some molecules weren't paying attention when Lipinski's rules came down", she says (Lipinski himself was supposed to be here, but had to cancel at the last minute, BTW).
The example here is the EGF receptor. We know a fair amount about the extracellular domain of this protein, and some about the intracellular part. But the "juxtamembrane" portion connecting the two is more of a mystery, although it's clearly crucial for receptor signaling. Her lab has been using a fluorescent marker for particular protein coil structures. What this work seems to show is that different ligands for EGFR (EGF versus TGF-alpha), which are known to produce different downstream signaling, do so through different structures of the protein. Subtle variations of the coiled-coil helical protein on the intracellular face are meaningful and provide yet another way for these receptors to vary their function.
You'd think that there would have to be some such structural difference, since the two "agonists" do act differently. But actually getting a look at it in action is something else again. This is, to me, another example of "treat the protein as a big molecule" thinking. People who do structure-based drug discovery are used to that viewpoint, but not all molecular and cell biologists are. They'll find chemistry infiltrating their worldview, is my prediction. . .
+ TrackBacks (0) | Category: Chemical Biology
Now I'm listening to David Tirrell of Cal Tech, talking about his lab's work on labeling proteins with azidohomoalanine (Aha) as a marker. He's done a good job showing that (if you don't go wild) replacement of methionine with this amino acid doesn't perturb things very much at all, and there's a recent paper showing how well the technique works (when combined with stable isotope labeling) for analyzing mixtures of low-abundance proteins. You can now buy all the reagents you need to do this.
The Aha can be activated by wild-type Met tRNA synthetase (MetRS), but he's also working with weirder amino acids that require a mutant RS enzyme. This is useful for even finer-grained experiments; the example shown is for monitoring host-pathogen interactions. Using a Yersinia species, he's showing all sorts of complex results, most of which fall into the category of "Must be important, but we don't know what they mean yet". The bacteria inject a number of as-yet-uncharacterized proteins into mammalian cells, for example, and without techniques like these, you'd never find them.
They've gone as far as doing this in whole living nematodes - it looks like this has been disclosed at meetings, but there doesn't appear to be a full paper on this yet.
A nice quote from the talk: "We did a computational search, which didn't help us out very much, but the experiment was great". Words to live by!
+ TrackBacks (0) | Category: Chemical Biology
Right now, there's a talk going on from Helma Wennemers of the ETH. She's working on small peptidic catalysts for organic reactions, what one might think of as "mini-enzymes". They're certainly not as wildly effective as real enzymes, but they're a lot easier to find and modify. Here's an example, which has been extended to solid-supported catalysts here. And whenever I see a solid-supported catalyst, I think "Can you use that for flow chemistry?" I was glad to see that they've done just that - I don't think that work has been published yet, but it seems to work pretty well.
Chemistry like this is a good reminder of just how many catalysts remain to be found. I don't see any reason, a priori, for any reaction to be out of bounds for enzymatic-type catalysis. You have functional groups that can participate in some reaction mechanisms (as is the case for the proline nitrogen in the above work), you have stabilization of transition states, you have sheer physical proximity/effective molarity, and probably other effects that people are still arguing about. Eventually we'll get good enough to design such things, but for now, a combination of design and what I might call "enlightened brute force" looks like the way to go. I'd like to see someone pick some reaction types that are not catalyzed enzymatically and apply these techniques to make something we've never seen before. If we could figure out how to get new metallic centers into these things (imagine an enzymatic palladium catalyst!), we could really do some wild chemistry. Mind you, I'm not the one who would be trying to get that funded.
+ TrackBacks (0) | Category: Chemical Biology
Jason Chin of the MRC Molecular Biology lab in the UK has been talking here about protein labeling and genetic code expansion, an overview of the numerous papers his group has been publishing in this area over the last few years.
And he's just made what I think is a very worthwhile point. While talking about labeling proteins with very reactive alkyne-containing amino acids (for fluorescent "click" applications), he said that some people would look at this and say "Why bother - you can already label these things with GFP". But sticking an entire Green Fluorescent Protein onto an existing one is hardly a silent event. If you're going to think about these things the way a chemist would, you need to come in with something as small and unobtrusive as possible. And it also needs to be something that you can localize, which doesn't just mean "I know what protein it's on".
Chemists think - or had better think - at a higher magnification. What exact surface of the protein is this label on? What residues are next to it? What sort of binding pockets might it be interrogating? We need to treat proteins as molecules, and as molecules they have a lot of detail in them.
+ TrackBacks (0) | Category: Chemical Biology
September 26, 2012
Chris Walsh of Harvard is talking about the trithiazolylpeptide antibiotics and related compounds. If you thought that only we synthetic organic chemists were crazy enough to link three more heterocycles onto a central pyridine, leading to compounds which "have the solubility of sand" (a direct quote from Walsh), then think again. And they weren't even made by palladium-catalyzed couplings! Since we were talking about macrocycles here the other day, it's worth noting that these are also 29-membered rings and the like.
Here's one of them for you, if you haven't seen these beasts before. Who's synthesized it? Funny you should ask. . .
+ TrackBacks (0) | Category: Chemical Biology
A short talk from Steven Verhelst of Munich went into detail on some covalent probes for rhomboid proteases. I've been interested for a while in what happens when you run small electrophilic compounds over proteins - do they stick to everything, or can they show selectivity? The canonical paper on this topic is from the Cravatt group, which I'd recommend to anyone who finds this topic worthy. (Update: the Liebler group at Vanderbilt has also published some excellent work in this area, concentrating on Cys modification). Verhelst had one variety of electrophile that was selective in the active site, and another class that inhibited by sticking all over the place. So the answer is probably "Depends on your protein, and on your electrophile. Try it and see".
+ TrackBacks (0) | Category: Chemical Biology
Now I'm listening to Jim Wells (UCSF) talk about (among other things) this work, where they found a compound aggregating and causing activity in their assays. But this one wasn't doing the standard globular gunk that the usual aggregation gives you. Instead, the compound formed nanofibrils - microns long. And the enzyme that the compound showed activity against turns out to bind to the surface of the fibrils. Wells likens the effect to the way that Brussels sprouts grow, and his electron micrograph does indeed look pretty close. The question is, does this mimic something that happens "in real life", or is it a complete artifact? There's a paper in press in JBC going into some of the details. Just goes to show you that compounds are capable of doing things that you'd never have been able to guess.
+ TrackBacks (0) | Category: Chemical Biology
I'm listening to Paul Hergenrother (of Illinois) talk about using natural products as starting materials for compound screening libraries. It's a good idea - he takes readily available complex structures and does a range of organic chemistry on each of them, to make non-natural structures that have the complexity and functionality of natural products. I note that he's taken adrenosterone and made azasteroid derivatives (among many others), very similar to what I talked about here. He's also used quinine, gibberellic acid, and others.
He's taken the collection thus produced and run them through phenotypic cell screens, with what look like interesting preliminary results. The idea is to look for unusual phenotypes and work backwards to new targets from them, so having a pile of unusual compounds is probably a good starting point. Of course, I have a weakness for phenotypic screens in general, and I suspect I'm going to be hearing a lot about them here over the next few days.
+ TrackBacks (0) | Category: Chemical Biology
August 21, 2012
This paper from GlaxoSmithKline uses a technology that I find very interesting, but it's one that I still have many questions about. It's applied in this case to ADAMTS-5, a metalloprotease enzyme, but I'm not going to talk about the target at all, but rather, the techniques used to screen it. The paper's acronym for it is ELT, Encoded Library Technology, but that "E" could just as well stand for "Enormous".
That's because they screened a four billion member library against the enzyme. That is many times the number of discrete chemical species that have been described in the entire scientific literature, in case you're wondering. This is done, as some of you may have already guessed, by DNA encoding. There's really no other way; no one has a multibillion-member library formatted in screening plates and ready to go.
So what's DNA encoding? What you do, roughly, is produce a combinatorial diversity set of compounds while they're attached to a length of DNA. Each synthetic step along the way is marked by adding another DNA sequence to the tag, so (in theory) every compound in the collection ends up with a unique oligonucleotide "bar code" attached to it. You screen this collection, narrow down on which compound (or compounds) are hits, and then use PCR and sequencing to figure out what their structures must have been.
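The bar-code idea described above can be sketched as a toy in Python. Everything specific here is invented for illustration: the building-block names, the four-base tags, and the three-cycle layout are hypothetical, and real encoded libraries use longer tags plus ligation and PCR primer regions that this ignores. The only point is that one tag per synthetic step gives every product a unique sequence, and that reading the sequence back recovers the synthetic history.

```python
from itertools import product

# Hypothetical building blocks and tag sequences, invented for this sketch.
# One dictionary per synthetic cycle: building-block name -> DNA tag.
CYCLE_TAGS = [
    {"amine_A": "ACGT", "amine_B": "TGCA"},
    {"acid_A": "GGAA", "acid_B": "CCTT"},
    {"aryl_A": "ATAT", "aryl_B": "CGCG"},
]
TAG_LEN = 4  # every tag in this toy has the same length


def encode(blocks):
    """Concatenate the tag for each building block, one per synthetic cycle."""
    return "".join(CYCLE_TAGS[i][name] for i, name in enumerate(blocks))


def decode(barcode):
    """Recover the building-block history from a sequenced barcode."""
    if len(barcode) != TAG_LEN * len(CYCLE_TAGS):
        raise ValueError("unexpected barcode length")
    blocks = []
    for i, tags in enumerate(CYCLE_TAGS):
        chunk = barcode[i * TAG_LEN:(i + 1) * TAG_LEN]
        lookup = {seq: name for name, seq in tags.items()}
        if chunk not in lookup:
            raise ValueError(f"unknown tag {chunk!r} in cycle {i + 1}")
        blocks.append(lookup[chunk])
    return tuple(blocks)


# Every combination of one building block per cycle: 2 x 2 x 2 members.
library = [encode(combo) for combo in product(*(t.keys() for t in CYCLE_TAGS))]
```

Calling `decode` on any entry of `library` gives back the three building blocks that went into that member, which is the toy version of "use PCR and sequencing to figure out what their structures must have been".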
As you can see, the only way this can work is through the magic of molecular biology. There are so many enzymatic methods for manipulating DNA sequences, and they work so well compared with standard organic chemistry, that ridiculously small amounts of DNA can be detected, amplified, sequenced, and worked with. And that's what lets you make a billion member library; none of the components can be present in very much quantity (!)
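To get a feel for "not very much quantity", here's a back-of-the-envelope calculation in Python. The four-billion library size is from the paper as described above, but the total amount of library in the tube (1 nanomole) is purely a number I picked for illustration, and the sketch assumes every member is equally represented, which no real library manages.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def copies_per_member(total_moles, library_size):
    """Average number of molecules of each member, assuming equal representation."""
    return total_moles * AVOGADRO / library_size


LIBRARY_SIZE = 4_000_000_000  # four billion members, as stated in the post
TOTAL_NMOL = 1.0              # hypothetical total amount of library, in nanomoles

total_moles = TOTAL_NMOL * 1e-9
copies = copies_per_member(total_moles, LIBRARY_SIZE)
moles_each = total_moles / LIBRARY_SIZE

if __name__ == "__main__":
    print(f"copies of each member: {copies:,.0f}")
    print(f"moles of each member:  {moles_each:.2e}")
```

Under those assumptions each compound is present at a fraction of an attomole, on the order of a hundred thousand molecules. That's nothing by organic chemistry standards, but it's a perfectly workable amount of template once PCR gets involved.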
This particular library comes off of a 1,3,5-triazine, which is not exactly the most cutting-edge chemical scaffold out there (I well recall people making collections of such things back in about 1992). But here's where one of the big questions comes up: what if you have four billion of the things? What sort of low hit rate can you not overcome by that kind of brute force? My thought whenever I see these gigantic encoded libraries is that the whole field might as well be called "Return of Combichem: This Time It Works", and that's what I'd like to know: does it?
There are other questions. I've always wondered about the behavior of these tagged molecules in screening assays, since I picture the organic molecule itself as about the size of a window air conditioner poking out from the side of a two-story house of DNA. It seems strange to me that these beasts can interact with protein targets in ways that can be reliably reproduced once the huge wad of DNA is no longer present, but I've been assured by several people that this is indeed the case.
In this example, two particular lineages of compounds stood out as hits, which makes you much happier than a collection of random singletons. When the team prepared a selection of these as off-DNA "real organic compounds", many of them were indeed nanomolar hits, although a few dropped out. Interestingly, none of the compounds had the sorts of zinc-binding groups that you'd expect against the metalloprotease target. The rest of the paper is a more traditional SAR exploration of these, leading to what one has to infer are more tool/target validation compounds rather than drug candidates per se.
I know that GSK has been doing this sort of thing for a while, and from the looks of it, this work itself was done a while ago. For one thing, it's in J. Med. Chem., which is not where anything hot off the lab bench appears. For another, several of the authors of the paper appear with "Present Address" footnotes, so there has been time for a number of people on this project to have moved on completely. And that brings up the last set of questions, for now: has this been a worthwhile effort for GSK? Are they still doing it? Are we just seeing the tip of a large and interesting iceberg, or are we seeing the best that they've been able to do? That's the drug industry for you; you never know how many cards have been turned over, or why.
+ TrackBacks (0) | Category: Chemical Biology | Chemical News | Drug Assays | Drug Industry History
August 16, 2011
Here's another paper at the intersection of biology and chemistry: a way to check the activity of a huge number of mutated esterase enzymes, all at the same time.
Protein engineering is a hot field, as well it should be, since enzymes do things in ways that we lowly organic chemists can only envy. Instead of crudely bashing and beating on the molecules out in solution, an enzyme grabs each of them, one at a time, and breaks just the bond it wants to, in the direction it wants to do it, and then does it again and again. If you're looking for molecular-scale nanotechnology, there it is, and it's been right in front of us the whole time.
Problem is, enzymes get that way through billions of years of evolution and selection, and those selection pressures don't necessarily have anything to do with the industrial reactions we're thinking of these days. And since we don't have a billion years to wait, we have to speed things up. Thus the work at places like Codexis on engineered mutant enzymes, and thus a number of very interesting takes on directed evolution. (Well, interesting to me, at any rate - I have a pronounced weakness for this sort of thing).
This latest paper, from the University of Greifswald in Germany, builds on the work of Manfred Reetz at the Max-Planck Institute, who's been very influential in the field. Specifically, it follows up on the idea in this paper from his group and this one from the Quax group at Groningen in the Netherlands. That technique involved selecting for specificity in esterase enzymes by giving organisms a choice of two substrates: if they hydrolyze the right chiral starting material, the cleaved ester furnishes them with a nutrient. If they hydrolyze the wrong one, though, they produce a poison. Rather direct, but with bacteria there's no other way to get their attention - survival's really all they care much about.
And that technique worked, but it was a bit laborious. The largest number of different variations tested was about 2500, which seems like a lot until you do the math on protein mutations. It gets out of control very, very quickly when you have twenty variations per amino acid residue. Naturally, some of the residues shouldn't ever be touched, while others will have only minimal effects, and others are the hot spots you should be concentrating on. But which ones are which? And since you absolutely can't assume that they're all acting independently of each other, you have your work cut out for you. (Navigation through this thicket is what Codexis is selling, actually).
This latest paper adds flow cytometry, cell sorting, to the mix. Using dye systems and one of these machines to distinguish viable bacteria from dead or dying ones lets you take a culture and pull out only the survivors. When the authors expressed different esterases (with known preferences for the two substrates) in E. coli, they got the expected results - the ones with an enzyme that could cleave the nutrient-giving substrate grew, while the ones that unveiled the poison (2,3-dibromopropanol) halted in their tracks.
They then took another esterase with very modest selectivity and created a library of mutant variations - about ten million mutant variations - and expressed the whole shebang in a single liquid colony of E. coli. This was then exposed to the mixture of substrates, and anything that grew was pulled out by the cell sorter and plated out on agar (also containing the selection mixture of substrates). They got 28 clones to grow in the end, and characterized three of these more fully as purified enzymes. Of those, two of them were, in fact, much more selective than the starting enzyme (giving E values, enantiomeric ratios, of 80 to 100 as opposed to 3). Another, interestingly, was not selective at all.
And when you look at the sequences and the mutations that were picked out, you can see how tricky a business this is. One of the two selective enzymes had its valine-121 residue mutated to isoleucine (V121I) and its phenylalanine-198 residue mutated to cysteine (F198C). The other was broadly similar, with one added mutation: valine-121 was changed in this case to serine (V121S), the F198 was mutated to glycine (F198G), and valine-225 was also changed to alanine (V225A). Now, some of those aren't very big mutations (V to I, V to A), but what's even more interesting is the sequence of the unselective clone that they characterized: that one had V121I, F198G, V225A. So it had a mix of the exact mutations found in the two selective enzymes, but was itself a dud.
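To make the "mix of the exact mutations" point concrete, here's a minimal Python sketch that treats each clone as a set of point-mutation codes. The clone labels and function name are my own, and I'm assuming the first selective clone's valine-121 change is V121I, as in the unselective clone.

```python
import re


def parse_mutation(code):
    """Split a code like 'V121I' into (original residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a point-mutation code: {code!r}")
    original, position, new = match.groups()
    return original, int(position), new


# Labels are hypothetical; the mutation codes are the ones quoted in the post
# (with the first clone's position-121 change assumed to be V121I).
clones = {
    "selective_1": {"V121I", "F198C"},
    "selective_2": {"V121S", "F198G", "V225A"},
    "unselective": {"V121I", "F198G", "V225A"},
}

dud = clones["unselective"]
pooled = clones["selective_1"] | clones["selective_2"]

# Every mutation in the dud also shows up in one of the selective clones.
dud_is_mix = dud <= pooled
borrowed_from_1 = dud & clones["selective_1"]
borrowed_from_2 = dud & clones["selective_2"]

# The three residue positions that all of this variation is confined to.
positions = sorted({parse_mutation(code)[1] for code in pooled})

if __name__ == "__main__":
    print(dud_is_mix, borrowed_from_1, borrowed_from_2, positions)
```

The set arithmetic says nothing about why the combination fails, of course. It only shows that none of the dud's individual changes is new, which is the non-additivity problem in miniature.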
I'm glad to see that this worked, although you have to wonder how efficiently it moves in on a target when you get two decent hits out of ten million starting mutations. (The relative ease of screening goes a long way towards making up for that). But what I'd like to see is a mix of this technique with the one that I wrote about a few weeks ago, where a bacterium was evolved to use a chlorinated DNA base. That one used a particularly slick directed-evolution device, which would be quite interesting to apply to this food-versus-poison idea. You'd have to do some fine-tuning, especially at first, since the liberated poisonous substrate would be killing off the just and the unjust alike (which is the same problem that this current paper faced). But it seems like there should be a way to run things so that you're not just screening a big library of random mutations in the enzyme, but actually pushing the enzyme to evolve in the direction you want. Thoughts?
+ TrackBacks (0) | Category: Chemical Biology
July 6, 2011
There's been a real advance in the field of engineered "unnatural life", but it hasn't produced one-hundredth the headlines that the arsenic bacteria story did. This work is a lot more solid, although it's hard to summarize in a snappy way.
Everyone knows about the four bases of DNA (A, T, C, G). What this team has done is force bacteria to use a substitute for the T, thymine - 5-chlorouracil, which has a chlorine atom where thymine's methyl group is. From a med-chem perspective, that's a good switch. The two groups are about the same size, but they're different enough that the resulting compounds can have varying properties. And thymine is a good candidate for a swap, since it's not used in RNA, thus limiting the number of systems that have to change to accommodate the new base. (RNA, of course, uses uracil instead, the unsubstituted parent compound of both thymine and the 5-chloro derivative used here).
Over the years, chlorouracil has been studied in DNA for just that reason, and it's been found to make the proper base-pair hydrogen bonds, among other things. So incorporating it into living bacteria looks like an experiment in just the right spot - different enough to be a real challenge, but similar enough to be (probably) doable. People have taken a crack at similar experiments before, with mixed success. In the 1970s, mutant hamster cells were grown in the presence of the bromo analog, and apparently generated DNA which was strongly enriched with that unnatural base. But there were a number of other variables that complicated the experiment, and molecular biology techniques were in their infancy at the time. Then in 1992, a group tried replacing the thymine in E. coli with uracil, with multiple mutations that shut down the T-handling pathways. They got up to about 90% uracil in the DNA, but this stopped the bacteria from growing - they just seemed to be hanging on under those T-deprived conditions, but couldn't do much else. (In general, withholding thymine from bacterial cultures and other cells is a good way to kill them off).
This time, things were done in a more controlled manner. The feat was accomplished by good old evolutionary selection pressure, using an ingenious automated system. An E. coli strain was produced with several mutations in its thymine pathways to allow it to survive under near-thymine-starvation conditions. These bacteria were then grown in a chamber where their population density was being constantly measured (by turbidity). Every ten minutes a nutrient pulse went in: if the population density was above a set limit, the cells were given a fixed amount of chlorouracil solution to use. If the population had fallen below a set level, the cells received a dose of thymine-containing solution to keep them alive. A key feature of the device was the use of two culture chambers, with the bacteria being periodically swapped from one to the other (while the first chamber undergoes sterilization with 5M sodium hydroxide!) That's to keep biofilm formation from giving the bacteria an escape route from the selection pressure, which is apparently just what they'll do, given the chance. One "culture machine" was set for a generation time of about two hours, and another for a 4-hour cycle (by cutting in half the nutrient amounts). This cycle selected for mutations that allowed the use of chlorouracil throughout the bacteria's biochemistry.
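The feeding rule is simple enough to sketch in a few lines of Python. This is only an illustration of the logic as described above, not the authors' control software: the function names, the single shared threshold, and the turbidity numbers are all assumptions of mine.

```python
def choose_pulse(turbidity, threshold):
    """Pick the solution for one ten-minute pulse.

    A dense culture gets the selective chlorouracil solution; a culture that
    has dropped below the threshold gets the thymine "rescue" solution.
    (Using one threshold for both cases is an assumption.)
    """
    return "chlorouracil" if turbidity >= threshold else "thymine"


def chlorouracil_fraction(readings, threshold):
    """Fraction of pulses that were chlorouracil over a series of readings."""
    pulses = [choose_pulse(reading, threshold) for reading in readings]
    return pulses.count("chlorouracil") / len(pulses)


if __name__ == "__main__":
    # Made-up turbidity readings, purely for illustration.
    early = [0.2, 0.6, 0.3, 0.4]
    late = [0.6, 0.7, 0.4, 0.8]
    print(chlorouracil_fraction(early, threshold=0.5))
    print(chlorouracil_fraction(late, threshold=0.5))
```

The useful readout is that fraction: a population that copes better with chlorouracil stays above the threshold more often, so it gets more chlorouracil pulses, and the selection pressure ratchets up on its own.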
And that's what happened - the proportion of the chlorouracil solution that went in went up with time. The bacterial population had plenty of dramatic rises and dips, but the trend was clear. After 23 days, the experimenters cranked up the pressure - now the "rescue" solution was a lower concentration of thymine, mixed 1:1 with chlorouracil, and the other solution was a lower concentration of chlorouracil only. The proportion of the latter solution used still kept going up under these conditions as well. Both groups (the 2-hour cycle and the 4-hour cycle ones) were consuming only chlorouracil solution by the time the experiment went past 140 days or so.
Analysis of their DNA showed that it had incorporated about 90% chlorouracil in the place of thymine. The group identified a previously unknown pathway (U54 tRNA methyltransferase) that was bringing thymine back into the pathway, and disrupting this gene knocked the thymine content down to just above detection level (1.5%). Mass spec analysis of the DNA from these strains clearly showed the chlorouracil present in DNA fractions.
The resulting bacteria from each group, it turned out, could still grow on thymine, albeit with a lag time in their culture. If they were switched to thymine media and grown there, though, they could immediately make the transition back to growing on chlorouracil, which shows that their ability to do so was now coded in their genomes. (The re-thymined bacteria, by the way, could be assayed by mass spec as well for the disappearance of their chlorouracil).
These re-thymined bacteria were sequenced (since the chlorouracil mutants wouldn't have matched up too well with sequencing technology!) and they showed over 1500 base substitutions. Interestingly, there were twice as many in the A-T to G-C direction as the opposite, which suggests that chlorouracil tends to mispair a bit with guanine. The four-hour-cycle strain had not only these sorts of base swaps, but also some whole chromosome rearrangements. As the authors put it, and boy are they right, "It would have been impossible to predict the genetic alterations underlying these adaptations from current biological knowledge. . ."
These bacteria are already way over to the side of all the life on Earth. But the next step would be to produce bacteria that have to live on chlorouracil and just ignore thymine. If that can be realized, the resulting organisms will be the first representatives of a new biology - no cellular life form has ever been discovered that completely switches out one of the DNA bases. These sorts of experiments open the door to organisms with expanded genetic codes, new and unnatural proteins and enzymes, and who knows what else besides. And they'll be essentially firewalled from all other living creatures.
Postscript: and yes, it's occurred to me as well that this sort of system would be a good way to evolve arsenate-using bacteria, if they do really exist. The problem (as it is with the current work) is getting truly phosphate-free media. But if you had such, and ran the experiment, I'd suggest isolating small samples along the way and starting them fresh in new apparatus, in order to keep the culture from living off the phosphate from previous generations. Trying to get rid of one organic molecule is hard enough; trying to clear out a whole element is a much harder proposition.
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Life As We (Don't) Know It
October 6, 2010
I mentioned directed evolution of enzymes the other day as an example of chemical biology that’s really having an industrial impact. A recent paper in Science from groups at Merck and Codexis really highlights this. The story they tell had been presented at conferences, and had impressed plenty of listeners, so it’s good to have it all in print.
It centers on a reaction that’s used to produce the diabetes therapy Januvia (sitagliptin). There’s a key chiral amine in the molecule, which had been produced by asymmetric hydrogenation of an enamine. On scale, though, that’s not such a great reaction. Hydrogenation itself isn’t the biggest problem, although if you could ditch a pressurized hydrogen step for something that can’t explode, that would be a plus. No, the real problem was that the selectivity wasn’t quite what it should be, and the downstream material was contaminated with traces of rhodium from the catalyst.
So they looked at using a transaminase enzyme instead. That’s a good idea, because transaminases are one of those enzyme classes that do something that we organic chemists generally can’t do very well – in this case, change a ketone to a chiral amino group in one step. (It takes another amine and oxidizes that on the other side of the reaction). We’ve got chiral reductions of imines and enamines, true, but those almost always need a lot of fiddling around for catalysts and conditions (and, as in this case, can cause their own problems even when they work). And going straight to a primary amine can be, in any case, one of the more difficult transformations. Ammonia itself isn’t too reactive, and you don’t have much of a steric handle to work with.
But transaminases have their idiosyncrasies (all enzymes do). They generally only will accept methyl ketones as substrates, and that’s what these folks found when they screened all the commercially available enzymes. Looking over the structure (well, a homology model of the structure) of one of these (ATA-117), which would be expected to give the right stereochemistry if it could be made to give anything whatsoever, gave some clues. There’s a large binding pocket on one side of the ketone, which still wasn’t quite large enough for the sitagliptin intermediate, and a small site on the other side, which definitely wasn’t going to take much more than a methyl group.
They went after the large binding pocket first. A less bulky version of the desired substrate (which had been turned, for now, into a methyl ketone) showed only 4% conversion with the starting enzymes. Mutating the various amino acids that looked important for large-pocket binding gave some hope. Changing a serine to phenylalanine, for example, cranked up the activity by 11-fold. The other four positions were, as the paper said, “subjected to saturation mutagenesis”, and they also produced a combinatorial library of 216 multi-mutant variations.
Therein lies a tale. Think about the numbers here: according to the supplementary material for the paper, they varied twelve residues in the large binding pocket, with (say) twenty amino acid possibilities per. So you’ve got 240 enzyme variants to make and test. Not fun, but it’s doable if you really want to. But if you’re going to cover all the multi-mutant space, that’s twenty to the 12th, or over four quadrillion enzyme candidates. That’s not going to happen with any technology that I can easily picture right now. And you’re going to want to sample this space, because enzyme amino acid residues most certainly do affect each other. Note, too, that we haven’t even discussed the small pocket, which is going to have to be mutated, too.
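For anyone who wants to check the arithmetic, here's a minimal Python sketch. The names are mine, not the paper's. Note that the rough 240 figure counts all twenty residues at each position, while the helper function counts only the nineteen non-native alternatives, which is an assumption about how you'd tally real mutants.

```python
from math import comb

AMINO_ACIDS = 20   # standard amino acids
POSITIONS = 12     # large-pocket residues varied, per the post

# Rough count used in the text: 20 choices at each of 12 positions, one at a time.
single_site_rough = POSITIONS * AMINO_ACIDS

# Every combination at all 12 positions at once: twenty to the 12th.
full_space = AMINO_ACIDS ** POSITIONS


def variants_with_k_changes(positions, k, alternatives=AMINO_ACIDS - 1):
    """Number of variants that differ from the parent at exactly k positions."""
    return comb(positions, k) * alternatives ** k


if __name__ == "__main__":
    print(single_site_rough)
    print(f"{full_space:.3e}")
    for k in range(1, 4):
        print(k, variants_with_k_changes(POSITIONS, k))
```

Even the double and triple mutants alone pile up quickly, which is why nobody covers this space by brute force.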
So there’s got to be some way to cut this problem down to size, and that (to my mind) is one of the things that Codexis is selling. They didn’t, for example, get a darn thing out of the single-point-mutation experiments. But one member of a library of 216 multi-mutant enzymes showed the first activity toward the real sitagliptin ketone precursor. This one had three changes in the small pocket and that one F-for-S in the large, and identifying where to start looking for these is truly the hard part. It appears to have been done through first ruling out the things that were least likely to work at any given residue, followed by an awful lot of computational docking.
It’s not like they had the Wonder Enzyme just yet, although just getting anything to happen at all must have been quite a reason to celebrate. If you loaded two grams/liter of ketone, and put in enzyme at 10 grams/liter (yep, ten grams per liter, holy cow), you got a whopping 0.7% conversion in 24 hours. But as tiny as that is, it’s a huge step up from flat zero.
Next up was a program of several rounds of directed evolution. All the variants that had shown something useful were taken through a round of changes at other residues, and the best of these combinations were taken on further. That statement, while true, gives you no feel at all for what this stuff is like, though. There are passages like this in the experimental details:
At this point in evolution, numerous library strategies were employed and as beneficial mutations were identified they were added into combinatorial libraries. The entire binding pocket was subjected to saturation mutagenesis in round 3. At position 69, mutations TAS and C were improved over G. This is interesting in two aspects. First, V69A was an option in the small pocket combinatorial library, but was less beneficial than V69G. Second, G69T was improved (and found to be the most beneficial in the next round) suggesting that something other than sterics is involved at this position as it was a Val in the starting enzyme. At position 137, Thr was found to be preferred over Ile. Random mutagenesis generated two of the mutations in the round 3 variant: S8P and G215C. S8P was shown to increase expression and G215C is a surface exposed mutation which may be important for stability. Mutations identified from homologous enzymes identified M94I in the dimer interface as a beneficial mutation. In subsequent rounds of evolution the same library strategies were repeated and expanded. Saturation mutagenesis of the secondary sphere identified L61Y, also at the dimer interface, as being beneficial. The repeated saturation mutagenesis of 136 and 137 identified Y136F and T137E as being improved.
There, that wasn’t so easy, was it? This should give you some idea of what it’s like to engineer an enzyme, and what it’s like to go up against a billion years of random mutation. And that’s just the beginning – they ended up doing ten rounds of mutations, and had to backtrack some along the way when some things that looked good turned out to dead-end later on. Changes were taken on to further rounds not only on the basis of increased turnover, but for improved temperature and pH stability, tolerance to DMSO co-solvent, and so on. They ended up, over the entire process, screening a total of 36,480 variations, which is a hell of a lot, but is absolutely infinitesimal compared to the total number of possibilities. Narrowing that down to something feasible is, as I say, what Codexis is selling here.
And what came out the other end? Well, recall that the known enzymes all had zero activity, so it’s kind of hard to calculate improvement from that. Comparing to the first mutant that showed anything at all, they ended up with something that was about 27,000 times better. This has 27 mutations from the original known enzyme, so it’s a rather different beast. The final enzyme runs in DMSO/water, at loadings of up to 250 g/liter of starting material at 3 weight per cent enzyme loading, and turns isopropylamine into acetone while it’s converting the prositagliptin ketone to product. It is completely stereoselective (they’ve never seen the other amine), and needless to say involves no hydrogen tanks and furnishes material that is not laced with rhodium metal.
This is impressive stuff. You'll note, though, the rather large amount of grunt work that had to go into it, although keep in mind, the potential amount of grunt work would be more than the output of the entire human race. To date. Just for laughs, an exhaustive mutational analysis of twenty-seven positions would give you 1.3 times ten to the thirty-fifth possibilities to screen, and that's if you know already which twenty-seven positions you're going to want to look at. One microgram of each of them would give you the mass of about twenty Earths, not counting the vials. Not happening.
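Here's a rough Python check of the size of that exhaustive search, set against the 36,480 variants that were actually screened. The variable names are mine, and the one-microgram-per-variant figure is just the thought experiment from the paragraph above.

```python
AMINO_ACIDS = 20
POSITIONS = 27            # mutations in the final enzyme, per the post
SCREENED = 36_480         # variants actually screened, per the post
MICROGRAM_IN_KG = 1e-9    # one microgram expressed in kilograms

# All combinations at 27 positions: roughly 1.3e35.
exhaustive = AMINO_ACIDS ** POSITIONS

# Share of that space the real screening campaign touched.
fraction_screened = SCREENED / exhaustive

# Total mass if you made one microgram of every variant.
total_mass_kg = exhaustive * MICROGRAM_IN_KG

if __name__ == "__main__":
    print(f"{exhaustive:.2e} variants")
    print(f"{fraction_screened:.1e} of the space screened")
    print(f"{total_mass_kg:.2e} kg at one microgram each")
```

The fraction screened is the number to keep in mind: whatever narrowing strategy was used, it had to do essentially all of the work.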
Also note that this is the sort of thing that would only be done industrially, in an applied research project. Think about it: why else would anyone go to this amount of trouble? The principle would have been proven a lot earlier in the process, and the improvements even part of the way through still would have been startling enough to get your work published in any journal in the world and all your grants renewed. Academically, you'd have to be out of your mind to carry things to this extreme. But Merck needs to make sitagliptin, and needs a better way to do that, and is willing to pay a lot of money to accomplish that goal. This is the kind of research that can get done in this industry. More of this, please!
+ TrackBacks (0) | Category: Biological News | Chemical Biology | Chemical News | Drug Development
October 5, 2010
Here's an interesting example of a way that synthetic chemistry is creeping into the provinces of molecular biology. There have been a lot of interesting ideas over the years around the idea of polymers made to recognize other molecules. These appear in the literature as "molecularly imprinted polymers", among other names, and have found some uses, although it's still something of a black art. A group at Cal-Irvine has produced something that might move the field forward significantly, though.
In 2008, they reported that they'd made polymer particles that recognized the bee-sting protein melittin. Several combinations of monomers were looked at, and the best seemed to be a crosslinked copolymer with both acrylic acid and an N-alkylacrylamide (giving you both polar and hydrophobic possibilities). But despite some good binding behavior, there are limits to what these polymers can do. They seem to be selective for melittin, but they can't pull it out of straight water, which is a pretty stringent test. (If you can compete with the hydrogen-bonding network of bulk water that's holding the hydrophilic parts of your target, as opposed to relying on just the hydrophobic interactions with the other parts, you've got something impressive).
Another problem, which is shared by all polymer-recognition ideas, is that the materials you produce aren't very well defined. You're polymerizing a load of monomers in the presence of your target molecule, and they can (and will) link up in all sorts of ways. So there are plenty of different binding sites on the particles that get produced, with all sorts of affinities. How do you sort things out?
Now the Irvine group has extended their idea, and found some clever ways around these problems. The first is to use good old affinity chromatography to clean up the mixed pile of polymer nanoparticles that you get at first. Immobilizing melittin onto agarose beads and running the nanoparticles over them washes out the ones with lousy affinity - they don't hold up on the column. (Still, they had to do this under fairly high-salt conditions, since trying this in plain water didn't allow much of anything to stick at all). Washing the column at this point with plain water releases a load of particles that do a noticeably better job of recognizing melittin in buffer solutions.
The key part is coming up, though. The polymer particles they've made show a temperature-dependent change in structure. At RT, they're collapsed polymer bundles, but in the cold, they tend to open up and swell with solvent. As it happens, that process makes them lose their melittin-recognizing abilities. Incubating the bound nanoparticles in ice-cold water seems to only release the ones that were using their specific melittin-binding sites (as opposed to more nonspecific interactions with the agarose and the like). The particles eluted in the cold turned out to be the best of all: they show single-digit nanomolar affinity even in water! They're only a few per cent of the total, but they're the elite.
Now several questions arise: how general is this technique? That is, is melittin an outlier as a peptide, with structural features that make it easy to recognize? If it's general, then how small can a recognition target be? After all, enzymes and receptors can do well with ridiculously small molecules: can we approach that? It could be that it can't be done with such a simple polymer system - but if more complex ones can also be run through such temperature-transition purification cycles, then all sorts of things might be realized. More questions: What if you do the initial polymerization in weird solvents or mixtures? Can you make receptor-blocking "caps" out of these things if you use overexpressed membranes as the templates? If you can get the particles to the right size, what would happen to them in vivo? There are a lot of possibilities. . .
+ TrackBacks (0) | Category: Analytical Chemistry | Chemical Biology | Chemical News | Drug Assays