Update: The post has been slightly edited to (hopefully) clarify that non-coding DNA does not equal junk DNA, and that care has to be taken in interpreting research. (Hence the question mark in the title…).
An estimated 1-3% of our DNA consists of genes that code for proteins. The rest, so it was once thought, has no direct functional effect on the organism itself and is basically junk. Hence, it became known as ‘junk DNA’. But it has become clear that this non-coding DNA is not all that ‘junky’: a lot of it turns out to have a function (though keep in mind that biological function is a difficult concept). A few days ago, the ENCODE (Encyclopedia of DNA Elements) consortium released 30 papers that support the notion that a lot is going on in regions of non-coding DNA.
The consortium, consisting of 442 members, has spent nine years scrutinizing the human genome. For some more numbers, divert your gaze slightly to the right.
To read a bit more about how such a consortium is born and functions, check out The making of ENCODE in Nature.
You can also read an interview in Scientific American with Ewan Birney, who led the analysis performed by ENCODE.
Nature has also developed an interactive ENCODE explorer, where you can learn more about the research.
So, what were the main findings of this huge, international effort?
- Over 80% of the human genome plays a part in at least one biochemical event in at least one cell type.
- Many of these elements show evidence of selection and are thus expected to be functional.
- Much of the variation in RNA expression can be explained through promoter functionality.
- The functional regions, as identified by ENCODE, are often found in regions of non-coding DNA, at least as often as in coding stretches of DNA.
- A lot of the functional elements located in non-coding DNA seem to be implicated in a wide variety of diseases.
Or, you can take a couple of minutes to watch the video:
Of course, there is a lot more to be done, as you can see when browsing through the Experiment Matrix at the ENCODE website of the University of California, Santa Cruz.
An overview of how the media dealt with the ENCODE news, and often misinterpreted it, can be found here. A post that, much more eloquently than I can, elucidates what the project actually says is here.
Many more cell types and experiment types await before we can truly read the book of life. But it seems we’re beginning to learn its language.
The ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57-74. DOI: 10.1038/nature11247