Hang around a bunch of medicinal chemists (no, really, it's more fun than you'd think) and you're bound to hear discussion of cLogP. For the chemists in the crowd, I should warn you that I'm about to say nasty things about it.
For the nonchemists in the crowd, logP is a measure of how greasy (or how polar) a compound is. It's based on a partition experiment: shake up a measured amount of a compound with defined volumes of water and n-octanol, a rather greasy solvent which I've never seen referred to in any other experimental technique. Then measure how much of the compound ends up in each layer, and take the log of the octanol/water ratio. So if a thousand times as much compound goes into the octanol as goes into the water (which for drug substances is quite common and, in fact, pretty good), then the logP is 3. The reason we care about this is that really greasy compounds (and one can go up to 4, 5, 6, and possibly beyond) have problems. They tend to dissolve poorly in the gut, have problems crossing membranes in living systems, get metabolized extensively in the liver, and stick to a lot of proteins that you'd rather they didn't stick to. Fewer high-logP compounds are capable of making it as drugs.
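For anyone who'd rather see the arithmetic spelled out, the calculation is nothing more than a base-10 log of a concentration ratio. Here's a minimal Python sketch; the function name and the choice to reject non-positive concentrations are mine, for illustration only.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water concentration ratio.

    Both concentrations must be in the same units and greater than zero.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# A thousandfold preference for the octanol layer corresponds to logP = 3.
print(round(log_p(1000.0, 1.0), 6))
```

Every unit of logP is another factor of ten in favor of the octanol, so the difference between 3 and 5 is a lot bigger than it looks on paper.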
So far, so good. But there are complications. For one thing, the description above ignores the pH of the water solution, and for charged compounds that's a big factor. logD is the term for the distribution of all species (ionized or not), and logD at pH 7.4 (physiological) is a valuable measurement if you've got the possibility of a charged species (and plenty of drug molecules do, thanks to basic amines, carboxylic acids, etc.). But there are bigger problems.
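To get a feel for how big a factor pH can be, here's a rough sketch of the textbook relationship between logP and logD for a compound with a single ionizable group. It assumes that only the neutral form partitions into the octanol, which is a simplification (ion pairs can make it across, too). The function names and the example numbers are mine, not measurements of anything.

```python
import math


def log_d_acid(log_p: float, pka: float, ph: float = 7.4) -> float:
    """logD for a monoprotic acid, assuming only the neutral form partitions."""
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


def log_d_base(log_p: float, pka: float, ph: float = 7.4) -> float:
    """logD for a monoprotic base, assuming only the neutral form partitions."""
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))


# Hypothetical carboxylic acid: logP of 3.0, pKa of 4.4, at physiological pH.
print(f"{log_d_acid(3.0, 4.4):.2f}")
```

Under that assumption, an acid sitting three pKa units below the pH of the water layer gives up roughly three log units, so a logP of 3 turns into a logD of about zero.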
You'll notice that the experiment outlined in the second paragraph could fairly be described as tedious. In fact, I have never seen it performed. Not once, and I'll bet that the majority of medicinal chemists never have, either. And it's not like it's just being done out of my sight; there's no roomful of automated octanol/water extraction machines clanking away in the basement. I should note that there are other higher-throughput experimental techniques (such as HPLC retention times) that also correlate with logP and have been used to generate real numbers, but even those don't account for the great majority of the numbers that we talk about all the time. So how do we manage to do that?
It has to do with a sleight of hand I've performed while writing the above sections, which some of you have probably already noticed. Most of the time, when we talk about logP values in early drug discovery, we're talking about cLogP. That "c" stands for calculated. There are several programs that estimate logP based on known values for different rings and functional groups, and with different algorithms for combining and interpolating them. In my experience, almost all logP numbers that get thrown around are from these tools; no octanol is involved.
And sometimes that worries me a bit. Not all of these programs will tell you how solid those estimates are. And even if they will, not all chemists will bother to check. If your structure is quite close to something that's been measured, then fine, the estimate is bound to be pretty good. But what if you feed in a heterocycle that's not in the lookup table? The program will spit out a number, that's what. But it may not be a very good number, even if it goes out to two decimal places. I can't even remember when I might have last seen a cLogP value with a range on it, or any other suggestion that it might be a bit fuzzy.
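Here's a toy sketch of the kind of thing I'd like to see, a fragment-additive estimate that at least tells you when it's guessing. To be clear, this is not how any particular cLogP program works: the fragment names, the contribution values, and the silent fallback are all invented placeholders for illustration.

```python
from dataclasses import dataclass

# Placeholder contributions, invented for this example. NOT real parameters.
FRAGMENT_VALUES = {"ring_A": 1.5, "group_B": -0.5, "group_C": 0.25}


@dataclass
class Estimate:
    value: float   # summed fragment contributions
    missing: list  # fragments that were not in the lookup table

    @property
    def well_covered(self) -> bool:
        """True only if every fragment was found in the table."""
        return not self.missing


def estimate_clogp(fragments, table=FRAGMENT_VALUES, fallback=0.0):
    """Sum per-fragment contributions, recording any fragment we had to guess."""
    total = 0.0
    missing = []
    for frag in fragments:
        if frag in table:
            total += table[frag]
        else:
            total += fallback  # a number still comes out, deserved or not
            missing.append(frag)
    return Estimate(total, missing)


known = estimate_clogp(["ring_A", "group_B"])
odd = estimate_clogp(["ring_A", "odd_heterocycle"])
print(known.value, known.well_covered)
print(odd.value, odd.well_covered, odd.missing)
```

Both calls hand back a perfectly respectable-looking number. Only the second one admits that part of the structure wasn't in the table, and that's the part that usually never reaches the chemist.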
There are more subtle problems, too: I've seen some oddities with substitutions on saturated heterocyclic rings (morpholine, etc.) that didn't quite seem to make sense. Many chemists get these numbers, look at them quizzically, and say "Hmm, I didn't know that those things sorted out like that. Live and learn!" In other words, they take the calculated values as reality. I've even had people defend these numbers by explaining to me patiently that these are, after all, calculated logP values, and the calculated logP values rank-order like so, and what exactly is my problem? And while it's hard to argue with that, we are not putting our compounds into the simulated stomachs of rationalized rodents. Real-world decisions can be made based on numbers that do not come from the real world.