I mentioned directed evolution of enzymes the other day as an example of chemical biology that’s really having an industrial impact. A recent paper in Science from groups at Merck and Codexis highlights this nicely. The story they tell had been presented at conferences, and had impressed plenty of listeners, so it’s good to have it all in print.
It centers on a reaction that’s used to produce the diabetes therapy Januvia (sitagliptin). There’s a key chiral amine in the molecule, which had been produced by asymmetric hydrogenation of an enamine. On scale, though, that’s not such a great reaction. Hydrogenation itself isn’t the biggest problem, although if you could ditch a pressurized hydrogen step for something that can’t explode, that would be a plus. No, the real problem was that the selectivity wasn’t quite what it should be, and the downstream material was contaminated with traces of rhodium from the catalyst.
So they looked at using a transaminase enzyme instead. That’s a good idea, because transaminases are one of those enzyme classes that do something that we organic chemists generally can’t do very well: in this case, change a ketone to a chiral amino group in one step. (It takes another amine and oxidizes that on the other side of the reaction.) We’ve got chiral reductions of imines and enamines, true, but those almost always need a lot of fiddling around for catalysts and conditions (and, as in this case, can cause their own problems even when they work). And going straight to a primary amine can be, in any case, one of the more difficult transformations. Ammonia itself isn’t too reactive, and you don’t have much of a steric handle to work with.
But transaminases have their idiosyncrasies (all enzymes do). They generally will accept only methyl ketones as substrates, and that’s what these folks found when they screened all the commercially available enzymes. Looking over the structure (well, a homology model of the structure) of one of these (ATA-117), which would be expected to give the right stereochemistry if it could be made to give anything whatsoever, gave some clues. There’s a large binding pocket on one side of the ketone, which still wasn’t quite large enough for the sitagliptin intermediate, and a small site on the other side, which definitely wasn’t going to take much more than a methyl group.
They went after the large binding pocket first. A less bulky version of the desired substrate (which had been turned, for now, into a methyl ketone) showed only 4% conversion with the starting enzymes. Mutating the various amino acids that looked important for large-pocket binding gave some hope. Changing a serine to proline, for example, cranked up the activity by 11-fold. The other four positions were, as the paper said, “subjected to saturation mutagenesis”, and they also produced a combinatorial library of 216 multi-mutant variations.
Therein lies a tale. Think about the numbers here: according to the supplementary material for the paper, they varied twelve residues in the large binding pocket, with (say) twenty amino acid possibilities per position. So you’ve got 240 enzyme variants to make and test. Not fun, but it’s doable if you really want to. But if you’re going to cover all the multi-mutant space, that’s twenty to the 12th, or over four quadrillion enzyme candidates. That’s not going to happen with any technology that I can easily picture right now. And you’re going to want to sample this space, because enzyme amino acid residues most certainly do affect each other. Note, too, that we haven’t even discussed the small pocket, which is going to have to be mutated, too.
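If you want to check that arithmetic, here's a quick back-of-the-envelope sketch in Python. It assumes, as above, all twenty amino acids at each of the twelve positions (strictly speaking, one of the twenty is the residue that's already there, so the count of genuinely new single mutants is a bit lower). The variable names are my own, not anything from the paper.

```python
# Back-of-the-envelope count of the large-pocket search space.
# Assumption: 20 possible amino acids at each of 12 positions, as in the text.
AMINO_ACIDS = 20
POSITIONS = 12

# One position changed at a time (counting the wild-type residue too, as the text does).
single_site = AMINO_ACIDS * POSITIONS

# The same count, excluding the residue already present at each position.
new_single_mutants = (AMINO_ACIDS - 1) * POSITIONS

# Every combination across all twelve positions at once.
full_space = AMINO_ACIDS ** POSITIONS

print(f"single-site variants: {single_site}")
print(f"new single mutants:   {new_single_mutants}")
print(f"full combinatorial:   {full_space:.3e}")
```

The gap between the first number and the last is the whole problem in miniature: screening a few hundred variants is a chore, while screening the full combinatorial space is simply off the table.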
So there’s got to be some way to cut this problem down to size, and that (to my mind) is one of the things that Codexis is selling. They didn’t, for example, get a darn thing out of the single-point-mutation experiments. But one member of a library of 216 multi-mutant enzymes showed the first activity toward the real sitagliptin ketone precursor. This one had three changes in the small pocket and that one P-for-S in the large, and identifying where to start looking for these is truly the hard part. It appears to have been done through first ruling out the things that were least likely to work at any given residue, followed by an awful lot of computational docking.
It’s not like they had the Wonder Enzyme just yet, although just getting anything to happen at all must have been quite a reason to celebrate. If you loaded two grams/liter of ketone, and put in enzyme at 10 grams/liter (yep, ten grams per liter, holy cow), you got a whopping 0.7% conversion in 24 hours. But as tiny as that is, it’s a huge step up from flat zero.
Next up was a program of several rounds of directed evolution. All the variants that had shown something useful were taken through a round of changes at other residues, and the best of these combinations were taken on further. That statement, while true, gives you no feel at all for what this stuff is like, though. There are passages like this in the experimental details:
At this point in evolution, numerous library strategies were employed and as beneficial mutations were identified they were added into combinatorial libraries. The entire binding pocket was subjected to saturation mutagenesis in round 3. At position 69, mutations TAS and C were improved over G. This is interesting in two aspects. First, V69A was an option in the small pocket combinatorial library, but was less beneficial than V69G. Second, G69T was improved (and found to be the most beneficial in the next round) suggesting that something other than sterics is involved at this position as it was a Val in the starting enzyme. At position 137, Thr was found to be preferred over Ile. Random mutagenesis generated two of the mutations in the round 3 variant: S8P and G215C. S8P was shown to increase expression and G215C is a surface exposed mutation which may be important for stability. Mutations identified from homologous enzymes identified M94I in the dimer interface as a beneficial mutation. In subsequent rounds of evolution the same library strategies were repeated and expanded. Saturation mutagenesis of the secondary sphere identified L61Y, also at the dimer interface, as being beneficial. The repeated saturation mutagenesis of 136 and 137 identified Y136F and T137E as being improved.
There, that wasn’t so easy, was it? This should give you some idea of what it’s like to engineer an enzyme, and what it’s like to go up against a billion years of random mutation. And that’s just the beginning: they ended up doing ten rounds of mutations, and had to backtrack some along the way when some things that looked good turned out to dead-end later on. Changes were taken on to further rounds not only on the basis of increased turnover, but for improved temperature and pH stability, tolerance to DMSO co-solvent, and so on. They ended up, over the entire process, screening a total of 36,480 variations, which is a hell of a lot, but is absolutely infinitesimal compared to the total number of possibilities. Narrowing that down to something feasible is, as I say, what Codexis is selling here.
And what came out the other end? Well, recall that the known enzymes all had zero activity, so it’s kind of hard to calculate improvement from that. Comparing to the first mutant that showed anything at all, they ended up with something that was about 27,000 times better. This has 27 mutations from the original known enzyme, so it’s a rather different beast. The final enzyme runs in DMSO/water, at loadings of up to 250 g/liter of starting material at 3 weight per cent enzyme loading, and turns isopropylamine into acetone while it’s converting the prositagliptin ketone to product. It is completely stereoselective (they’ve never seen the other amine), and needless to say involves no hydrogen tanks and furnishes material that is not laced with rhodium metal.
This is impressive stuff. You'll note, though, the rather large amount of grunt work that had to go into it, although keep in mind, the potential amount of grunt work would be more than the output of the entire human race. To date. Just for laughs, an exhaustive mutational analysis of twenty-seven positions would give you 1.3 times ten to the thirty-fifth possibilities to screen, and that's if you know already which twenty-seven positions you're going to want to look at. One microgram of each of them would give you the mass of more than twenty Earths, not counting the vials. Not happening.
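For anyone who wants to run those just-for-laughs numbers themselves, here's a rough Python sketch. It assumes twenty amino acids at each of twenty-seven positions, one microgram per variant, and an Earth mass of about 5.97 times ten to the twenty-fourth kilograms (that last figure is my input, not the paper's). On those assumptions the pile works out to a couple of dozen Earths, and the 36,480 variants actually screened are a vanishingly small slice of the whole.

```python
# Rough scale of an exhaustive screen over 27 positions.
# Assumptions (mine, for illustration): 20 amino acids per position,
# 1 microgram of each variant, Earth mass ~5.972e24 kg.
AMINO_ACIDS = 20
POSITIONS = 27
EARTH_MASS_KG = 5.972e24
MICROGRAM_IN_KG = 1e-9

# Total number of distinct sequences across the 27 positions.
variants = AMINO_ACIDS ** POSITIONS

# Total mass if you made one microgram of every variant.
mass_kg = variants * MICROGRAM_IN_KG
earths = mass_kg / EARTH_MASS_KG

# Fraction of that space covered by the variants actually screened.
screened = 36_480
fraction_screened = screened / variants

print(f"variants to screen:  {variants:.2e}")
print(f"mass at 1 ug each:   {mass_kg:.2e} kg (~{earths:.0f} Earths)")
print(f"fraction screened:   {fraction_screened:.1e}")
```

Whatever the exact planet count, the conclusion doesn't change: nobody is brute-forcing this, and picking where to look is the real skill.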
Also note that this is the sort of thing that would only be done industrially, in an applied research project. Think about it: why else would anyone go to this amount of trouble? The principle would have been proven a lot earlier in the process, and the improvements even part of the way through still would have been startling enough to get your work published in any journal in the world and all your grants renewed. Academically, you'd have to be out of your mind to carry things to this extreme. But Merck needs to make sitagliptin, and needs a better way to do that, and is willing to pay a lot of money to accomplish that goal. This is the kind of research that can get done in this industry. More of this, please!