Corante

About this Author
[Photo: college chemistry, 1983]

[Photo: Derek Lowe, the 2002 model]

[Photo: after 10 years of blogging . . .]

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis, and other diseases. To contact Derek, email him directly: derekb.lowe@gmail.com. Twitter: Dereklowe


In the Pipeline


July 18, 2013

The Junk DNA Wars Get Hotter


Posted by Derek

Thanks to an alert reader, I was put on to this paper in PNAS. It's from a team at Washington U. in St. Louis, and my fellow Cardinals fans are definitely stirring things up in the debate over "junk DNA" function and the ENCODE results. (The most recent post here on the debate covered the "It's functional" point of view - for links to previous posts on some vigorous ENCODE-bashing publications, see here).

This new paper, blogged about here at Homolog.us and here by one of its authors, Mike White, is an attempt to run a null-hypothesis experiment on transcription factor function. There are a lot of transcription factor recognition sequences in the genome: short DNA sequences that serve as flags for the whole transcription machinery to land and start assembling at a particular spot. Transcription factors themselves are the proteins that do the primary recognition of these sequences, and that gives them plenty to do. With so many DNA motifs out there (and so many near-misses), some of their apparent targets are important and real, and some of them may well be noise. TFs have their work cut out for them.
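To get a feel for how many exact hits and near-misses a short motif can rack up, here's a minimal Python sketch; the motif and sequence are invented for illustration (the real Crx consensus is different):

```python
# Count exact hits and one-mismatch "near misses" of a short DNA motif.
# Motif and sequence are hypothetical, for illustration only.
def mismatches(a, b):
    return sum(x != y for x, y in zip(a, b))

def scan(seq, motif, max_mm=1):
    """Return (position, mismatch_count) for every window within max_mm."""
    return [(i, mismatches(seq[i:i + len(motif)], motif))
            for i in range(len(seq) - len(motif) + 1)
            if mismatches(seq[i:i + len(motif)], motif) <= max_mm]

print(scan("GGTAATCCCAGGTTATCCCA", "TAATCC"))  # → [(2, 0), (12, 1)]
```

Even this tiny example turns up a near-miss alongside the exact hit; at genome scale, short motifs match or nearly match constantly, which is why apparent targets need functional validation.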

What this new paper did was look at a particular transcription factor, Crx. They took a set of 1,300 sequences that are (functionally) known to bind it - 865 of them with the canonical recognition motifs and 433 of them that are known to bind, but don't have the traditional motif. They compared that set to 3,000 control sequences, including 865 of them "specifically chosen to match the Crx motif content and chromosomal distribution" as compared to that first set. They also included a set of single-point mutations of the known binding sequences, along with sets of scrambled versions of both the known binding regions and the matched controls above, with dinucleotide ratios held constant - random but similar.
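The "random but similar" controls are the interesting part. One simple way to approximate a scramble with dinucleotide ratios held constant is to resample from the sequence's own first-order transitions - a sketch only, since the paper's exact shuffling procedure may differ, and the snippet below is a made-up sequence, not a real Crx region:

```python
import random
from collections import defaultdict

def markov_scramble(seq, rng=random.Random(0)):
    """Random sequence of the same length whose adjacent-base (dinucleotide)
    statistics mimic seq's, via a first-order Markov resample."""
    successors = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)              # empirical successor pool per base
    out = [seq[0]]
    for _ in range(len(seq) - 1):
        pool = successors[out[-1]] or list(seq)  # fall back if a base has no successor
        out.append(rng.choice(pool))
    return "".join(out)

seq = "TAATCCCAGGCTTTGATTAATCC"  # hypothetical snippet
print(markov_scramble(seq))
```

An exact dinucleotide shuffle (e.g. the Altschul-Erickson method) preserves the counts precisely rather than statistically; either way, the goal is a null sequence that is "random but similar".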

What they found, first, was that the known binding elements do indeed drive transcription, as advertised, while the controls don't. But the ENCODE camp has a broader definition of function than just this, and here's where the dinucleotides hit the fan. When they looked at gene repression activity, they found that the 865 binders and the 865 matched controls (with Crx recognition elements, but in unbound regions of the genome) both showed similar amounts of activity. As the paper says, "Overall, our results show that both bound and unbound Crx motifs, removed from their genomic context, can produce repression, whereas only bound regions can strongly activate".

So far, so good, and nothing that the ENCODE people might disagree with - I mean, there you are, unbound regions of the genome showing functional behavior and all. But the problem is, most of the scrambled (effectively random) sequences also showed regulatory effects:

Our results demonstrate the importance of comparing the activity of candidate CREs (cis-regulatory elements - DBL) against distributions of control sequences, as well as the value of using multiple approaches to assess the function of CREs. Although scrambled DNA elements are unlikely to drive very strong levels of activation or repression, such sequences can produce distinct levels of enhancer activity within an intermediate range that overlaps with the activity of many functional sequences. Thus, function cannot be assessed solely by applying a threshold level of activity; additional approaches to characterize function are necessary, such as mutagenesis of TF binding sites.

In other words, to put it more bluntly than the paper does, one could generate ENCODE-like levels of functionality with nothing but random DNA. These results will not calm anyone down, but then, it's not time to calm down just yet: there are important issues to be decided here, from theoretical biology all the way down to how many drug targets we can expect to have. I look forward to the responses to this work - and responses will most definitely be forthcoming.
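The paper's prescription - judge each candidate against a distribution of control sequences rather than a fixed activity threshold - amounts to an empirical null test. A toy sketch, with made-up activity numbers:

```python
# Hypothetical reporter-activity readouts; all values invented for illustration.
control_activities = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2, 0.7, 1.4, 1.05, 0.95]
candidate_activity = 1.35

# Empirical p-value: fraction of null (control) sequences at least as active
# as the candidate, with a +1 correction so p can never be exactly zero.
exceed = sum(1 for a in control_activities if a >= candidate_activity)
p_empirical = (exceed + 1) / (len(control_activities) + 1)
print(round(p_empirical, 3))  # → 0.182
```

With a real control set the distribution would have thousands of members, and a candidate clearing it would still warrant follow-up such as binding-site mutagenesis, as the authors note.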

Comments (12) + TrackBacks (0) | Category: Biological News


COMMENTS

1. Rhenium on July 18, 2013 11:56 AM writes...

I'll go get the popcorn... :)


2. ESIMS on July 18, 2013 3:35 PM writes...

TBH I'm generally reserved when it comes to ChIP-seq papers, because of all of the things that can go wrong (HTS is sensitive as hell) or interfere with the data interpretation.

But this is quite an artificial set-up, with the introduction of plasmids (a "wrong" transcription-factor-to-DNA ratio per cell), so I wouldn't compare these results to any results obtained from a normal tissue sample. And to conclude something about chromatin structure is similarly bold.

GC-related effect: methylation?


3. sbierwagen on July 18, 2013 10:47 PM writes...

Whenever one of your posts has a quoted section in it, all the paragraph text after it is rendered incorrectly in my RSS reader.

Looking at the page source, the problem seems to be <p><i><blockquote> Text... </i></blockquote></p>

Some parsers are sensitive to the order of tags: they want the block-level tag (blockquote) first, then the paragraph tag, then the italics tag, with the tags closed in reverse order - i.e. <blockquote><p><i> Text... </i></p></blockquote>


4. homolog.us on July 19, 2013 1:40 PM writes...

You may also check another study along the same lines that came out in Genome Biology. I blogged about it here -

http://www.homolog.us/blogs/blog/2013/07/18/from-de-bruijn-graph-to-minimal-promoter/

At first I thought its results were in conflict with Mike White's, but others brought me to my senses :)


5. Anonymous on July 21, 2013 6:58 PM writes...

"Transcription factors themselves are the proteins that do the primary recognition of these sequences"

Or on the contrary, *these sequences* do the primary work recognizing the specific proteins that find themselves inside the nucleus!


6. s on July 24, 2013 5:26 PM writes...

It might be useful to mention the organism you are talking about...


7. Salah on July 25, 2013 10:42 AM writes...

As a computer scientist (and ferocious fan of genetics) interested in programming languages and compilers, I can't but say that what is called 'junk DNA' is no junk at all.
It follows a certain pattern that can be summarized in a grammar. That's how programming languages are created, and the genetic code is nothing but a programming language whose key words are bases and codons. The only missing link to how all this fits into place is the grammar. And my theory is that what is called 'junk DNA' is somehow an aggregation of grammatical rules, or a compiler that helps translate the remaining coding parts.
This idea doesn't come from nowhere; nature is full of repetitive patterns that can be expressed using grammars. Fractals are used in the gaming industry to create landscapes, L-systems to generate plants and trees, and human language itself follows a grammar. So why not the genetic code?!


8. THEMAYAN on August 10, 2013 9:05 AM writes...

I believe more functional ncDNA is being discovered all the time, and there are now multiple independent papers confirming this, not just the findings of ENCODE. I also think that it is not unreasonable to use a broad definition of function when the findings are themselves so broad. Sometimes it is what it is.

I think the hard-working men and women of ENCODE, and especially Ewan Birney, have been treated unfairly, and that this has a lot to do with the culture war between neo-Darwinian selection and intelligent design, as useless junk DNA was used as a poster child for bad design for many years by these same culture warriors. Yes, there were some brave few who questioned the junk paradigm years ago, but like Barbara McClintock's work on transposons in the fifties, it was largely ignored by the status quo. However, these findings are not being ignored.

Mattick himself is not a proponent of intelligent design, but even he comments on this aspect of the criticism. I personally see no end to this train in any near future, and I think only time will tell the true empirical nature of these findings.

Below is a comprehensive and recent paper that is critical of Dan Graur's criticism of ENCODE, which in my opinion seemed more like a hit piece than a professional scientific critique. And again, I'm speaking of Graur's critique of ENCODE.

The extent of functionality in the human genome
John S. Mattick and Marcel E. Dinger
Springer, 2013

Permalink to Comment


11. THEMAYAN on August 10, 2013 9:10 AM writes...

Sorry for duplicate post.


12. allan on February 3, 2014 11:33 AM writes...

Themayan, I suspect you are, like me, no expert in these matters. Mattick seems to be an outlier in the academic community. He has previously tried to show that the human genome is uniquely complex; he's in a minority of about one on this (among serious academics). This seems at odds with his other highly publicised views on 'non-coding' DNA. Some amphibians have genomes 30x the size of the human one. See here for a discussion of his Graur riposte. The comments are well worth reading. http://tinyurl.com/qx74plp

