I've linked to some very skeptical takes on the ENCODE project, the effort that supposedly identified 80% of our DNA sequence as functional to some degree. I should present evidence for the other side as it comes up, though, and some may now have appeared.
Two recent papers in Cell tell the story. The first proposes "super-enhancers" as regulators of gene transcription. (Here's a brief summary of both.) These are clusters of known enhancer sequences, which seem to recruit piles of transcription factors and to act differently from the single-enhancer model. The authors show evidence that these are involved in cell differentiation, and could well provide one of the key systems for determining eventual cellular identity from pluripotent stem cells.
Interest in further understanding the importance of Mediator in ESCs led us to further investigate enhancers bound by the master transcription factors and Mediator in these cells. We found that much of enhancer-associated Mediator occupies exceptionally large enhancer domains and that these domains are associated with genes that play prominent roles in ESC biology. These large domains, or super-enhancers, were found to contain high levels of the key ESC transcription factors Oct4, Sox2, Nanog, Klf4, and Esrrb to stimulate higher transcriptional activity than typical enhancers and to be exceptionally sensitive to reduced levels of Mediator. Super-enhancers were found in a wide variety of differentiated cell types, again associated with key cell-type-specific genes known to play prominent roles in control of their gene expression program
On one level, this is quite interesting, because cellular differentiation is a process that we really need to know a lot more about (the medical applications are enormous). But as a medicinal chemist, this kind of news makes me purse my lips, because we have enough trouble dealing with the good old-fashioned transcription factors (whose complexes of proteins were already large enough, thank you). What role there might be for therapeutic intervention in these super-complexes, I couldn't say.
The second paper has more on this concept. They find that these "super-enhancers" are also important in tumor cells (which would make perfect sense), and that they tie into two other big stories in the field, the epigenetic regulator BRD4 and the multifunctional protein cMyc:
Here, we investigate how inhibition of the widely expressed transcriptional coactivator BRD4 leads to selective inhibition of the MYC oncogene in multiple myeloma (MM). BRD4 and Mediator were found to co-occupy thousands of enhancers associated with active genes. They also co-occupied a small set of exceptionally large super-enhancers associated with genes that feature prominently in MM biology, including the MYC oncogene. Treatment of MM tumor cells with the BET-bromodomain inhibitor JQ1 led to preferential loss of BRD4 at super-enhancers and consequent transcription elongation defects that preferentially impacted genes with super-enhancers, including MYC. Super-enhancers were found at key oncogenic drivers in many other tumor cells.
About 3% of the enhancers found in the multiple myeloma cell line turned out to be tenfold-larger super-enhancer complexes, which bring in about ten times as much BRD4. It's been recently discovered that small-molecule ligands for BRD4 have a large effect on the cMyc pathway, and now we may know one of the ways that happens. So that might be part of the answer to the question I posed above: how do you target these things with drugs? Find one of the proteins that they have to recruit in large numbers, and mess up its activity at a small-molecule binding site. And if these giant complexes are even more sensitive to disruptions in these key proteins than usual (as the paper hypothesizes), then so much the better.
It's fortunate that chromatin-remodeling proteins such as BRD4 are (at least in some cases) filling that role, because they have pretty well-defined binding pockets that we can target. Direct targeting of cMyc, by contrast, has been quite difficult indeed (here's a new paper with some background on what's been accomplished so far).
Now, to the limits of my cell biology expertise, the evidence in these papers looks reasonably good. I'm certainly willing to believe that there are levels of transcriptional control beyond those that we've realized so far, weary sighs of a chemist aside. But I'll be interested to see the arguments over this concept play out. For example, if these very long stretches of DNA do turn out to be so important, how sensitive are they to mutation? One of the key objections to the ENCODE consortium's interpretation of their data is that much of what they're calling "functional" DNA seems to have little trouble drifting along and picking up random mutations. It will be worth applying this analysis to these super-enhancers, but I haven't seen that done yet.