I mentioned in passing that getting cells to express a new gene's protein is voodoo. That's pretty close to the term of art for it. Gene therapy is the high-profile application of the technique, but expressing cloned genes is also the bread and butter of molecular biology. I can tell you that a lot of drug company research would come to a lurching halt without cloned proteins.
One of my "Laws of the Lab" (more of those are on the way) is "When there are twenty ways in the literature to do something, that means that there is no good way to do it." This applies, in truckloads, to protein expression. When you consider the number of different cell lines you could try and the number of vectors you can use to get your DNA into them, you're already faced with plenty of choices.
Then you have the various things you can add to the main DNA sequence to try to get things to work once it's in the cell. A good deal of protein-expression time is spent trying out different promoters: sequences flanking the one you're interested in, which serve to lure the transcription machinery over to it. "Pick me! I'm important! You need lots of me, and you need it now! And get it right, will you?" is the sort of message these sequences are intended to send. After all, it does no good to insert DNA that doesn't get read at a useful level.
You generally want as much protein production as you can get (although it's possible to overdo it, in which case you might start getting insoluble clumps as the stuff piles up inside the cells). There are all sorts of systems that will give you a sign of whether or not your protein is actually being expressed, many of them quite ingenious. A common theme is to put in a marker gene along with your own, and many times that's a gene that confers resistance to some antibiotic. You transfect in your DNA, grow the cells, and hit them with the antibiotic - anything that survives has a good chance of having taken up and expressed the DNA you gave it (although you'd better check it by another route to make sure you aren't getting fooled).
The big question, though, once you're sure that the stuff is being made, is whether it's coming out in a form that you can purify, and whether it's actually working the way it's supposed to. Those two questions are often tangled up in a knot - if your protein isn't clean, maybe that's why it isn't working. (Unfortunately, there are many times when your protein looks clean and still does the equivalent of floating belly-up to the top of the tank. Keeps everyone on their toes.)
There are plenty of separation techniques to fish your proteins out of cells. Most of them involve first breaking the cells open and separating out the larger chunks like the nuclei (which you usually don't want), the empty cell membrane (which you might, if you know your target protein lives there), and the cytoplasmic contents. Centrifugation is the standard way to do all this. There's a fair amount of gunk in the cytoplasm that you can "spin down," too (like the endoplasmic reticulum - to a cell biologist, "ER" isn't primarily the name of a TV show).
Once you finish that, you're left with a mixture of thousands of different protein, carbohydrate, and lipid components. This is when you'll be glad that you overexpressed your target, because you'll need all the help you can get to make it stand out from the rest of the stew. Sometimes expression levels are high enough that you can do some minimal cleanup and use the stuff as is.
If that's not the case, a popular trick called "His-tagging" might be the answer. You set things up so that the protein ends up with a run of histidine amino acids at one end (which you hope is far from the action that you care about). All those imidazole side chains in a row will coordinate with metals, so you can pass the crude brew over a metal-containing column, wash all the non-histidine-rich stuff out, and then change to a stronger buffer (typically one containing free imidazole) to wash off your desired stuff. It's usually a safe bet that you won't have many natural proteins in there with a dozen histidines in a row - if there are such beasts, I'd like to know what the heck they do.
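For the computationally inclined, here's a toy Python sketch of the His-tag idea: tack a run of histidines onto one end of a sequence, then check how long the longest histidine run in a protein is. It assumes sequences written as one-letter amino-acid strings. The function names, the six-histidine default, and the example sequence are all made up for illustration, and a run-length check is only a crude stand-in for whether something would really stick to a metal column.

```python
import re


def add_his_tag(protein: str, n_his: int = 6, c_terminal: bool = True) -> str:
    """Return the sequence with a run of n_his histidines at one end.

    Hypothetical helper: for an N-terminal tag, the run is placed right
    after a leading methionine (if there is one) so the start residue stays first.
    """
    tag = "H" * n_his
    if c_terminal:
        return protein + tag
    if protein.startswith("M"):
        return "M" + tag + protein[1:]
    return tag + protein


def longest_his_run(protein: str) -> int:
    """Length of the longest stretch of consecutive histidines (0 if none)."""
    runs = re.findall(r"H+", protein.upper())
    return max((len(run) for run in runs), default=0)


def looks_column_sticky(protein: str, min_run: int = 6) -> bool:
    """Crude proxy (an assumption, not a binding model): a long His run."""
    return longest_his_run(protein) >= min_run


# Made-up example sequence, not a real protein.
native = "MKTAYIAKQRHHQISFVKSHFSRQ"
tagged = add_his_tag(native)

print(longest_his_run(native), looks_column_sticky(native))
print(longest_his_run(tagged), looks_column_sticky(tagged))
```

Running the same `longest_his_run` check over a whole set of host-cell protein sequences would be one way to see how rare long natural histidine runs are in whatever cell line you've picked.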
Another wild card is all the processing that proteins undergo in the cell - folding (sometimes with the assistance of other chaperone proteins), phosphorylation, glycosylation, and so on. As I understand it, if you run into trouble at this stage, you're pretty well sunk. Sometimes you can rescue things a bit by harvesting the protein at a different stage in the life cycle of the cells, but it's usually time to look for another cell type to start over with. This can happen especially if you go too far afield, phylogenetically. Bacteria, for example, are notorious for producing hopelessly hosed versions of some human proteins - although if you can engineer them just the right way, they can be tremendous producers. And (unlike tissue cultures) they're equipped to live on their own and take care of themselves. Yeast fall into that category, too - robust in their way, studied out the wazoo, but not always reliable for getting active protein.
Insect cells, though, are pretty good. A particular line called Sf9, from the fall armyworm moth, is widely used in combination with a baculovirus vector, which has the advantage of not being infectious in humans. It doesn't always work, especially with proteins whose glycosylation pattern is critical, but it's one of the first things to try.
Mammalian (or even human) cells are a better bet to produce active protein, but they're often trickier to work with. A favorite line is the beloved CHO (Chinese Hamster Ovary) cells, which are often used when the expressed protein needs to be part of a living cell system for the assay. There are other common lines derived from mice and macaque monkeys. Moving to us, there are standard cell lines from human kidney or liver, and then there are the famous HeLa cells, one of many from human tumor sources. You don't see these types used as much for large-scale transfection/overexpression, though, because they don't necessarily give huge levels of protein, and any vector that can infect them can infect you, too.
So, getting things to work just the way you want them to involves manipulating a lot of variables, not all of which are well understood. Not to mention the ones about which we don't understand squat. Still, these experiments are a regular feature of any molecular biologist's life, and in a drug company we expect a pretty good success rate at eventually getting the proteins we want. It just depends on how much time and effort you want to throw at the problem - and what problem doesn't at least partially depend on that?
(For those who want more, here's a useful guide from one of the big commercial players in the area. It goes into details that I've skipped over, like various funky ways to get your DNA into the cells, and some fine points of how to grow them and keep them happy.)